AI/LLM Systems

Venture Capital RAG System

Production RAG system for querying a VC firm's internal knowledge base with natural language

Company: Venture Capital Firm
Year: 2025
Status: Production
Query Speed: 2–4 s end-to-end response time for typical queries
Retrieval: <100 ms vector retrieval using pgvector indexing
Time Saved: research lookups reduced from 5–10 min to ~3 s (roughly 100× faster)
Processing: ~30 s/page automated document processing pipeline

Built a production RAG system that lets teams ask natural-language questions over internal investment documents, memos, and playbooks. It retrieves relevant passages via vector search and generates grounded answers with source references, improving internal knowledge access and reuse. The system runs on AWS with an asynchronous ingestion pipeline that processes new documents, extracts text, and generates embeddings for retrieval. It was designed for reliability and internal use rather than demo-style outputs, with an emphasis on traceable answers and consistent behavior.
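The retrieval step is, at its core, a nearest-neighbor search over chunk embeddings. A minimal in-memory sketch of that idea (in production the ranking is done by pgvector inside PostgreSQL; the toy corpus, 3-d vectors, and function names here are illustrative stand-ins):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunks, top_k=3):
    """Return the top_k chunks most similar to the query embedding.

    `chunks` is a list of (chunk_text, source_ref, embedding) tuples.
    In production these rows live in PostgreSQL and the nearest-neighbor
    ranking is delegated to a pgvector index instead of scanning in Python.
    """
    scored = [(cosine_similarity(query_vec, emb), text, src)
              for text, src, emb in chunks]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:top_k]

# Toy corpus with hand-made 3-d "embeddings" (stand-ins for real model output).
corpus = [
    ("Term sheet standards for Series A deals", "memo-014.pdf", [0.9, 0.1, 0.0]),
    ("Portfolio hiring playbook", "playbook-2.pdf", [0.1, 0.9, 0.1]),
    ("Diligence checklist for fintech", "memo-031.pdf", [0.8, 0.2, 0.1]),
]
results = retrieve([1.0, 0.0, 0.0], corpus, top_k=2)
```

The top-ranked passages, together with their source references, are what the generation step is allowed to answer from.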

The hard part

Ensuring answer reliability was the main challenge. Investment teams need traceable responses, so the system was designed to prioritize retrieval grounding and source attribution over creative generation. This required careful prompt design, retrieval tuning, and document preprocessing so that answers consistently referenced real internal material.
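One concrete way to prioritize grounding is to assemble the prompt so the model can only answer from the retrieved passages and must cite them. A hedged sketch of such a prompt builder (the production template is not shown here; the wording and names below are illustrative):

```python
def build_grounded_prompt(question, passages):
    """Assemble a prompt that constrains the model to retrieved context.

    `passages` is a list of (text, source_ref) pairs from retrieval.
    Each passage gets a numbered tag so the answer can cite [1], [2], ...
    and each citation can be traced back to a real internal document.
    """
    context_lines = [
        f"[{i}] ({src}) {text}"
        for i, (text, src) in enumerate(passages, start=1)
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using ONLY the numbered passages below. "
        "Cite passage numbers like [1] after each claim. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What is our standard Series A board structure?",
    [("Standard Series A boards have five seats ...", "memo-014.pdf")],
)
```

Numbering the passages keeps attribution checkable: every claim in the answer can be mapped back to a source reference, and answers outside the retrieved material are explicitly disallowed.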

What I did

Designed and implemented the end-to-end RAG architecture. Built the ingestion pipeline for document processing, text extraction, chunking, and embedding. Implemented the retrieval and generation pipeline, backend APIs, and document indexing workflows. Deployed and managed the system on AWS and supported iteration based on real user queries and feedback.
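The chunking step in the ingestion pipeline can be illustrated as fixed-size windows with overlap, so that passages straddling a boundary remain retrievable from at least one chunk. A simplified sketch (character windows and the specific sizes are illustrative; production chunking was tuned against real queries):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into overlapping character windows.

    The overlap keeps sentences that cross a chunk boundary intact in
    at least one chunk. Production systems often chunk on token or
    sentence boundaries instead; character windows keep the sketch simple.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

# Example: 1200 characters -> 3 chunks of at most 500 chars each,
# with each adjacent pair sharing a 100-character overlap.
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample, chunk_size=500, overlap=100)
```

Each chunk is then embedded and written to PostgreSQL alongside its source reference, which is what makes the retrieval and attribution steps above possible.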

Tech

FastAPI
PostgreSQL + pgvector
LLM APIs for embeddings and generation
Next.js
AWS (S3, SQS, RDS, EC2)
OCR tooling for document digitization