This project features a robust "Buffer & Sync" Orchestrator that keeps your AI’s memory accurate, deduplicated, and synchronized with your enterprise source of truth. It uses a Google Sheets-backed job queue to provide high availability and fault-tolerant syncing, even if the workflow is interrupted.
The Storage Intelligence: Beyond just "uploading files," this sub-workflow manages the entire data lifecycle. It automatically detects if a document already exists, performs an atomic "Delete-and-Replace" to prevent context clutter, and optimizes the Markdown structure for maximum retrieval performance in Open WebUI or Qdrant.
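The "Delete-and-Replace" step can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual logic: `KnowledgeStore` is a hypothetical in-memory stand-in for a knowledge base such as Open WebUI or Qdrant, and `source_id` is an assumed key identifying the original file.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeStore:
    """Hypothetical in-memory stand-in for the real knowledge base."""
    docs: dict = field(default_factory=dict)  # doc_id -> {"source_id", "content"}
    next_id: int = 1

    def find_by_source(self, source_id):
        # Return the ids of every stored copy of this source file.
        return [doc_id for doc_id, doc in self.docs.items()
                if doc["source_id"] == source_id]

    def delete(self, doc_id):
        del self.docs[doc_id]

    def insert(self, source_id, content):
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = {"source_id": source_id, "content": content}
        return doc_id


def delete_and_replace(store, source_id, content):
    """Remove earlier copies of a source file, then store the new version."""
    for stale_id in store.find_by_source(source_id):
        store.delete(stale_id)
    return store.insert(source_id, content)


store = KnowledgeStore()
delete_and_replace(store, "drive-file-123", "# Policy v1")
delete_and_replace(store, "drive-file-123", "# Policy v2")
print(len(store.docs))  # only the latest version is kept
```

In this sketch the delete and the insert are two separate steps. Making them behave atomically depends on guarantees from the real backend, which the sketch does not model.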
https://youtu.be/EGaVruqXSOM
Reliable State Management & Job Queueing (Google Sheets Integration):
Instead of fragile real-time triggers, I implemented a robust "Buffer & Sync" architecture using Google Sheets as a stateful job queue.
- Graceful Modification Handling: Prevents premature syncing by allowing a "Cool-down" period, ensuring documents are only processed after user edits are finalized.
- Fault-Tolerant Resumption: Unlike standard listeners that lose events when the workflow is interrupted, this approach tracks the sync status of every file. If a run fails, the system identifies the files still marked "Pending" and picks them up automatically once the workflow restarts, ensuring zero data loss.
- Batch Processing Efficiency: Allows for controlled, scheduled sync intervals, preventing API rate-limiting issues during bulk uploads.
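The three behaviors above (cool-down, resumption of "Pending" files, and batching) can be sketched as one selection step over the queue rows. This is a minimal sketch under stated assumptions: the rows are plain dictionaries standing in for Google Sheets rows, the field names `file_id`, `status`, `modified_at`, and `last_error` are hypothetical, and the 10-minute cool-down and batch size of 2 are illustrative values the source does not specify.

```python
from datetime import datetime, timedelta, timezone

COOL_DOWN = timedelta(minutes=10)  # assumed value, not from the source
BATCH_SIZE = 2                     # assumed value, not from the source


def select_ready_jobs(rows, now, cool_down=COOL_DOWN, batch_size=BATCH_SIZE):
    """Pick pending rows whose last edit is older than the cool-down window."""
    ready = [
        row for row in rows
        if row["status"] == "Pending" and now - row["modified_at"] >= cool_down
    ]
    ready.sort(key=lambda row: row["modified_at"])  # oldest edits first
    return ready[:batch_size]


def run_sync_cycle(rows, now, sync_file):
    """Process one batch; a row that fails stays Pending for the next cycle."""
    for row in select_ready_jobs(rows, now):
        try:
            sync_file(row["file_id"])
            row["status"] = "Synced"
        except Exception as exc:
            # Status is left untouched, so the next run retries this file.
            row["last_error"] = str(exc)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    {"file_id": "a", "status": "Pending", "modified_at": now - timedelta(minutes=30)},
    {"file_id": "b", "status": "Pending", "modified_at": now - timedelta(minutes=2)},
    {"file_id": "c", "status": "Synced", "modified_at": now - timedelta(hours=1)},
]
run_sync_cycle(rows, now, sync_file=lambda file_id: None)
print([row["status"] for row in rows])  # "b" is still cooling down
```

Because the status lives in the sheet rather than in the running workflow, an interrupted run leaves unfinished rows marked "Pending", and the next scheduled cycle selects them again.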
Sub-workflows:
High-Fidelity OCR & Parsing (Google Document AI Sub-workflow):
- Intelligent Layout Analysis: Uses Google’s specialized "Document AI" processor to identify and extract complex tables, headers, and hierarchical structures that standard Python libraries (like PyPDF2) often fail to read.
- Markdown Transformation: Post-processing logic converts the raw JSON output from Document AI into clean, semantic Markdown, formatted specifically to improve RAG retrieval and LLM context understanding.
- Asynchronous Processing: Handles large batches of documents efficiently, so the main n8n workflow remains responsive during heavy data ingestion.
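The Markdown transformation step can be illustrated with a small converter. This is only a sketch: the input shape used here (a flat list of blocks with `type`, `level`, `text`, `header`, and `rows` keys) is a simplified, hypothetical structure, not the actual Document AI response schema, which would first need to be mapped into something like it.

```python
def escape_cell(text):
    """Escape pipes so cell content cannot break the Markdown table."""
    return text.strip().replace("|", "\\|")


def table_to_markdown(header, rows):
    """Render a header row and data rows as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(escape_cell(cell) for cell in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(escape_cell(cell) for cell in row) + " |")
    return "\n".join(lines)


def blocks_to_markdown(blocks):
    """Convert simplified layout blocks into Markdown, one blank line apart."""
    parts = []
    for block in blocks:
        kind = block["type"]
        if kind == "heading":
            parts.append("#" * block["level"] + " " + block["text"].strip())
        elif kind == "table":
            parts.append(table_to_markdown(block["header"], block["rows"]))
        else:  # treat anything else as a plain paragraph
            parts.append(block["text"].strip())
    return "\n\n".join(parts)


blocks = [
    {"type": "heading", "level": 1, "text": "Q1 Report"},
    {"type": "paragraph", "text": " Revenue grew. "},
    {"type": "table", "header": ["Region", "Sales"],
     "rows": [["EU", "10"], ["US", "12"]]},
]
print(blocks_to_markdown(blocks))
```

Keeping headings and tables as real Markdown structure, rather than flattened text, gives a downstream chunker natural boundaries to split on.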

Secure Auth Layer (Custom JWT Generation Sub-workflow):
- GCP Service Account Integration: Implements a professional server-to-server authentication flow using Google Cloud Service Accounts.
- Custom JWT Signing: A dedicated sub-workflow signs the JWT with RS256 (an RSA signature over a SHA-256 hash) and exchanges the signed JWT for an OAuth2 Bearer Token. This eliminates the need for manual browser-based authentication (OAuth Consent) and provides fully headless, secure operation.
- Token Lifecycle Management: Automatically tracks token expiration and requests a fresh token before the current one expires, so the sync engine runs 24/7 without intervention.
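The auth flow above can be sketched in two parts: assembling the JWT that gets signed, and caching the resulting access token until shortly before it expires. This is a minimal sketch, not the sub-workflow's actual code. The helper names, the injected `sign` callable (a stand-in for an RS256 signer backed by the service account's private key), the `fetch_token` callable, and the 60-second safety margin are all assumptions made for illustration. The claim names, token endpoint, and grant type follow Google's documented service-account flow.

```python
import base64
import json
import time

TOKEN_URL = "https://oauth2.googleapis.com/token"
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_segment(obj):
    """Serialize one JWT segment (header or claims) as compact JSON."""
    return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def build_signing_input(client_email, scope, now, lifetime=3600):
    """Return the `header.claims` string that must be signed with RS256."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": client_email,      # service account email
        "scope": scope,
        "aud": TOKEN_URL,
        "iat": now,
        "exp": now + lifetime,    # Google caps this at one hour
    }
    return encode_segment(header) + "." + encode_segment(claims)


def build_assertion(signing_input, sign):
    """Append the signature; `sign` is a hypothetical bytes -> bytes signer."""
    return signing_input + "." + b64url(sign(signing_input.encode("ascii")))


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, margin=60):
        self.fetch_token = fetch_token  # hypothetical: returns (token, expires_in)
        self.margin = margin            # assumed safety margin in seconds
        self.token = None
        self.expires_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self.token is None or now >= self.expires_at - self.margin:
            self.token, expires_in = self.fetch_token()
            self.expires_at = now + expires_in
        return self.token
```

In a real run, `fetch_token` would POST `grant_type=GRANT_TYPE` and `assertion=<signed JWT>` to `TOKEN_URL` and read `access_token` and `expires_in` from the response. The signing itself needs an RSA implementation, which the Python standard library does not provide, so it is left as an injected callable here.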
