AI/LLM Systems

Venture Capital RAG System

Production RAG system for querying a VC firm's internal knowledge base with natural language

Company: Venture Capital Firm
Year: 2025
Status: Production
Query Speed: 2–4 s end-to-end response time for typical queries
Retrieval: <100 ms vector retrieval using pgvector indexing
Time Saved: research lookups reduced from 5–10 min to ~3 s (roughly 100× faster)
Processing: ~30 s/page automated document processing pipeline

Built a production RAG system that lets teams ask natural-language questions over internal investment documents, memos, and playbooks. It retrieves relevant passages via vector search and generates grounded answers with source references, improving internal knowledge access and reuse. The system runs on AWS with an asynchronous ingestion pipeline that processes new documents, extracts text, and generates embeddings for retrieval. It was designed for reliability and internal use rather than demo-style outputs, with an emphasis on traceable answers and consistent behavior.
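The retrieval step is, at its core, a nearest-neighbor search over chunk embeddings. A minimal in-memory sketch of that idea (in production the ranking is done by pgvector inside PostgreSQL; the toy corpus, 3-d vectors, and function names here are illustrative stand-ins):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunks, top_k=3):
    """Return the top_k chunks most similar to the query embedding.

    `chunks` is a list of (chunk_text, source_ref, embedding) tuples.
    In production these rows live in PostgreSQL and the nearest-neighbor
    ranking is delegated to a pgvector index instead of scanning in Python.
    """
    scored = [(cosine_similarity(query_vec, emb), text, src)
              for text, src, emb in chunks]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

# Toy corpus with hand-made 3-d "embeddings" (stand-ins for real model output).
corpus = [
    ("Term sheet standards for Series A deals", "memo-014.pdf", [0.9, 0.1, 0.0]),
    ("Portfolio hiring playbook", "playbook-2.pdf", [0.1, 0.9, 0.1]),
    ("Diligence checklist for fintech", "memo-031.pdf", [0.8, 0.2, 0.1]),
]
results = retrieve([1.0, 0.0, 0.0], corpus, top_k=2)
```

The top-ranked passages, together with their source references, are what the generation step is allowed to answer from.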

The hard part

Ensuring answer reliability was the main challenge. Investment teams need traceable responses, so the system was designed to prioritize retrieval grounding and source attribution over creative generation. This required careful prompt design, retrieval tuning, and document preprocessing so that answers consistently referenced real internal material.
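One concrete way to prioritize grounding is to assemble the prompt so the model can only answer from the retrieved passages and must cite them. A hedged sketch of such a prompt builder (the production template is not shown here; the wording and names below are illustrative):

```python
def build_grounded_prompt(question, passages):
    """Assemble a prompt that constrains the model to retrieved context.

    `passages` is a list of (text, source_ref) pairs from retrieval.
    Each passage gets a numbered tag so the answer can cite [1], [2], ...
    and each citation can be traced back to a real internal document.
    """
    context_lines = [
        f"[{i}] ({src}) {text}"
        for i, (text, src) in enumerate(passages, start=1)
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using ONLY the numbered passages below. "
        "Cite passage numbers like [1] after each claim. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What is our standard Series A board structure?",
    [("Standard Series A boards have five seats ...", "memo-014.pdf")],
)
```

Numbering the passages keeps attribution checkable: every claim in the answer can be mapped back to a source reference, and answers outside the retrieved material are explicitly disallowed.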

What I did

Designed and implemented the end-to-end RAG architecture. Built the ingestion pipeline for document processing, text extraction, chunking, and embedding. Implemented the retrieval and generation pipeline, backend APIs, and document indexing workflows. Deployed and managed the system on AWS and supported iteration based on real user queries and feedback.
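The chunking step in the ingestion pipeline can be illustrated as fixed-size windows with overlap, so that passages straddling a boundary remain retrievable from at least one chunk. A simplified sketch (character windows and the specific sizes are illustrative; production chunking was tuned against real queries):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into overlapping character windows.

    The overlap keeps sentences that cross a chunk boundary intact in
    at least one chunk. Production systems often chunk on token or
    sentence boundaries instead; character windows keep the sketch simple.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

# Example: 1200 characters -> 3 chunks of at most 500 chars each,
# with each adjacent pair sharing a 100-character overlap.
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample, chunk_size=500, overlap=100)
```

Each chunk is then embedded and written to PostgreSQL alongside its source reference, which is what makes the retrieval and attribution steps above possible.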

Tech

FastAPI
PostgreSQL + pgvector
LLM APIs for embeddings and generation
Next.js
AWS (S3, SQS, RDS, EC2)
OCR tooling for document digitization