I have a RAG pipeline for my digital signage platform. It powers our support bot, Chip. The architecture diagram looked impressive: GitHub repositories, Cloud Storage buckets, Vertex AI integrations, seven database tables tracking file hashes and sync states. It worked beautifully. Chip picks up any updated articles from our knowledge base every night. For a one-person startup, I was quite proud of what I'd built.

Then Google released File Search for the Gemini API last week, and I realized I could immediately retire a big chunk of that architecture as technical debt.

The Problem: RAG Pipeline Complexity

Let me show you what I built. My documentation sync system needed to keep help docs from two GitHub repositories available for AI-powered support queries. The workflow looked like this:

GitHub Repo → Fetch Markdown → Upload to GCS → Import to Vertex RAG → Query RAG

Here’s what that actually meant:

  1. Cron job triggers daily at 2 AM UTC
  2. DocSyncService checks GitHub for changed files (incremental sync)
  3. Three-tier optimization checks: database hash → GCS metadata → full upload (sketched just below this list)
  4. Changed files uploaded to Cloud Storage with SHA256 hash metadata
  5. Import files to Vertex AI RAG corpus (async polling, about 5 minutes)
  6. RagService queries the corpus at runtime
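
For flavor, here's a compressed sketch of the method behind step 3's three-tier check. It lives on the sync service shown further down; the helper names (getMetadata, updateFileHash) are illustrative rather than my exact production code:

// Sketch of the three-tier deduplication check (helper names illustrative)
async filterChanged(files: RepoFile[], dbHashes: Map<string, string>): Promise<RepoFile[]> {
  const changed: RepoFile[] = [];
  for (const file of files) {
    const hash = this.hash(file.content);

    // Tier 1: database hash matches → unchanged, skip entirely
    if (dbHashes.get(file.path) === hash) continue;

    // Tier 2: GCS object metadata matches → DB row was stale, resync it and skip the upload
    const gcsMeta = await this.gcs.getMetadata(file.path);
    if (gcsMeta?.sha256 === hash) {
      await this.db.updateFileHash(file.path, hash);
      continue;
    }

    // Tier 3: genuinely changed → needs a full upload and re-import
    changed.push(file);
  }
  return changed;
}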

The implementation sprawled across 1,240 lines of TypeScript, seven database tables, and three separate services. Every new feature meant coordinating changes across multiple components.

The funny part? It was working perfectly. The scheduled job had a 100% success rate processing 55-57 documentation files. Response times were great. Cost was minimal.

But complexity has a cost you don’t see on your cloud bill.

Enter File Search

File Search is a fully managed RAG system built directly into the Gemini API. You upload documents, and Google handles the chunking, embedding generation, vector indexing, and semantic search.

The key difference from the regular Files API is persistence. Standard file uploads disappear after 48 hours. Documents imported into a File Search store stay indefinitely until you delete them.
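
In practice, that means you create a store once and push documents into it as they change. A minimal sketch with the @google/genai SDK might look like this; the store name and config fields are my reading of the docs, so verify the exact shapes before copying:

// Minimal sketch: create a persistent File Search store and upload one article.
// Field names are approximate; check the official File Search docs.
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// One-time setup: the store persists until you delete it
const store = await ai.fileSearchStores.create({
  config: { displayName: 'help-docs' },
});

// Per document: upload straight into the store; chunking and embedding happen server-side
await ai.fileSearchStores.uploadToFileSearchStore({
  fileSearchStoreName: store.name,
  file: 'docs/getting-started.md',
  config: { displayName: 'getting-started.md' },
});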

From the official announcement, Google now handles:

  • Automatic document chunking (configurable token limits)
  • Embedding generation and storage
  • Vector search infrastructure
  • Metadata filtering and organization
  • Citation tracking via grounding metadata

You get all the RAG functionality without building any of the infrastructure.
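
Querying is then just a normal generateContent call with the store attached as a tool, and citations come back in the grounding metadata. Another hedged sketch, reusing the ai client and store handle from the snippet above (tool and field names follow the announcement; double-check them against the docs):

// Sketch: answer a support question from the store and inspect the citations
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'How do I pair a new screen with my account?',
  config: {
    tools: [{ fileSearch: { fileSearchStoreNames: [store.name] } }],
  },
});

console.log(response.text);

// Grounding metadata tells you which chunks and documents backed the answer
const grounding = response.candidates?.[0]?.groundingMetadata;
console.log(grounding?.groundingChunks);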

Refactoring the Architecture

Here’s the comparison that made me reconsider everything:

Before: Custom Vertex RAG Pipeline

// Simplified version of what I had
class DocSyncService {
  async syncRepository(repo: Repository) {
    // 1. Fetch files from GitHub
    const files = await this.github.getFiles(repo);

    // 2. Check database for existing hashes
    const existingHashes = await this.db.getFileHashes(files);

    // 3. Filter changed files (3-tier deduplication)
    const changedFiles = await this.filterChanged(files, existingHashes);

    // 4. Upload to GCS with metadata
    await Promise.all(
      changedFiles.map(f => this.gcs.upload(f, {
        metadata: { sha256: this.hash(f.content) }
      }))
    );

    // 5. Import to Vertex RAG (async polling)
    const operation = await this.vertexRag.importFiles(changedFiles);
    await this.pollUntilComplete(operation); // ~5 minutes

    // 6. Update database tracking
    await this.db.updateSyncState(changedFiles);
  }
}

Seven database tables tracked sync states, file hashes, import operations, repository metadata, and more. Three-tier optimization to avoid unnecessary uploads. Async polling to monitor import completion.
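
The async polling in step 5 was its own little subsystem too. Stripped down, the method looked something like this (the real version added backoff and failure alerting; getOperation is an illustrative wrapper around the Vertex operations API):

// Simplified polling loop for the Vertex RAG import operation
async pollUntilComplete(operation: { name: string }): Promise<void> {
  const deadline = Date.now() + 10 * 60 * 1000; // give up after 10 minutes
  while (Date.now() < deadline) {
    const status = await this.vertexRag.getOperation(operation.name);
    if (status.done) {
      if (status.error) throw new Error(`Import failed: ${status.error.message}`);
      return;
    }
    await new Promise(resolve => setTimeout(resolve, 15_000)); // check every 15 seconds
  }
  throw new Error('Import did not complete before the deadline');
}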

After: Gemini File Search

// What it became
class DocSyncService {
  async syncRepository(repo: Repository) {
    // 1. Fetch files from GitHub
    const files = await this.github.getFiles(repo);

    // 2. Upload directly to file search store
    await Promise.all(
      files.map(f => this.gemini.fileSearchStores.uploadToFileSearchStore({
        file: f.content,
        metadata: { repo: repo.name, path: f.path }
      }))
    );

    // 3. Wait for completion
    await this.pollUntilComplete();
  }
}

No intermediate storage. No hash tracking. No three-tier optimization. The file search store handles deduplication internally. Database needs shrink from seven tables to two or three for basic tracking.

Phil Schmid’s implementation guide shows the practical reality: about 300-400 lines of code replace the 1,240 I had written.

The Pricing Reality: Same Cost, Less Complexity

This is the part that surprised me. I expected a managed service to cost more. It doesn’t.

Current Cost (Vertex AI RAG)

Based on my actual usage:

  • Embeddings (indexing): ~50K tokens/month → ~$0.01/month
  • Storage (GCS): ~3.3MB average → ~$0.00/month
  • Retrieval queries: ~7.7M tokens/month → included in model costs
  • Total: ~$0.01/month plus inference

Projected Cost (Gemini File Search)

According to the File Search pricing docs:

  • Embeddings (indexing): ~50K tokens/month → ~$0.01/month
  • Storage: ~3.3MB average → free
  • Retrieval queries: ~7.7M tokens/month → included in model costs
  • Total: ~$0.01/month plus inference

Identical pricing. The managed service costs the same as rolling your own.

But the real savings aren’t on the cloud bill. They’re in:

  • Development time: Weeks of building vs. hours of integration
  • Maintenance burden: No more monitoring three-tier sync logic
  • Operational complexity: Fewer moving parts to debug at 3 AM
  • Context switching: One service instead of coordinating three

The cloud costs are a rounding error. The engineering costs are real.

What You Lose in the Migration

File Search isn’t better at everything. Here’s what I gave up:

Custom chunking logic: My Vertex RAG setup could implement sophisticated chunking strategies. File Search gives you configurable token limits and overlap, but you can’t plug in custom algorithms.
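
What you do get is a couple of knobs on the upload call: roughly a maximum chunk size and an overlap. Something along these lines, with the caveat that the field names are from my reading of the docs and may not match exactly:

// Sketch: chunking control is limited to token size and overlap
// (ai and store as created in the earlier sketch)
await ai.fileSearchStores.uploadToFileSearchStore({
  fileSearchStoreName: store.name,
  file: 'docs/troubleshooting.md',
  config: {
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 200,  // upper bound on chunk size
        maxOverlapTokens: 20,    // tokens shared between adjacent chunks
      },
    },
  },
});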

Fine-grained access control: GCS buckets integrate with IAM at a granular level. File Search uses project-level permissions.

Multi-cloud flexibility: I was locked into Google Cloud either way, but custom GCS storage could theoretically be mirrored elsewhere. File Search is exclusively Google infrastructure.

Debug visibility: When something went wrong in my custom pipeline, I could inspect every step. File Search is a black box. If indexing fails, you get less diagnostic information.

For my use case (customer support docs for a small SaaS), none of these mattered. Your mileage may vary.

When to Use File Search vs. Custom RAG

File Search makes sense when:

  • You’re building on Google Cloud already
  • Your chunking needs are standard (token limits, overlap)
  • You want to ship faster and maintain less code
  • Your document volume fits within project limits (1GB-1TB depending on tier)

Build your own RAG pipeline when:

  • You need custom chunking algorithms (semantic, citation-aware, etc.)
  • You’re already invested in specific vector databases
  • You require multi-cloud architecture
  • You need fine-grained access control at the document level
  • Your scale exceeds File Search limits (20GB per store recommended)

Lessons Learned

I’m not sure how to feel about this. Part of me is relieved. The codebase gets simpler, I have less infrastructure to maintain, and the cost stays the same. That’s a clear win.

But another part of me remembers the weeks spent building that three-tier optimization system. The careful benchmarking of different chunking strategies. The satisfaction of seeing that 100% success rate in the cron logs.

That work wasn’t wasted. I learned a lot building it. But it was also over-engineering for the problem I actually had.

The hard part of software engineering isn’t writing code. It’s knowing when not to.

Google’s File Search API is good enough for most RAG use cases. That’s simultaneously disappointing and freeing. Disappointing because the interesting problem is solved. Freeing because I can focus on the parts of my product that actually differentiate it.

Sometimes the best architecture is the one you don’t have to build.


FAQ

Q: Can I migrate existing Vertex RAG pipelines to File Search?

Yes, but you’ll need to re-upload and re-index your documents. The underlying storage format is different. Budget time for the migration and testing, not just the code changes.

Q: Does File Search support the same file formats as the Files API?

File Search supports PDFs, Word docs, spreadsheets, code files, and numerous text formats. Check the official documentation for the complete list.

Q: What about the 48-hour file expiration in the regular Files API?

That’s the key difference. Documents uploaded to File Search stores persist indefinitely until you explicitly delete them. This makes them suitable for production RAG systems, not just prototypes.

Q: How many File Search stores can I create?

Current limit is 10 stores per project. Google recommends keeping individual stores under 20GB for optimal retrieval performance.

Q: Is the retrieval quality the same as custom Vertex RAG?

In my testing (via Google's ADK), yes. Both use similar embedding models and vector search techniques. The quality difference comes from your chunking strategy and document quality, not the infrastructure.