A minimal, production‑style Retrieval Augmented Generation (RAG) application. Users can upload a document (PDF, TXT, Markdown) and ask questions about its content through a chat interface.
The project focuses on correctness, clarity, and real‑world engineering practices rather than UI polish or experimental optimizations.
- Document upload (PDF, TXT, Markdown)
- Text chunking and embedding
- Vector storage using PostgreSQL + pgvector
- Semantic retrieval of relevant chunks
- Question answering using Gemini (free tier)
- Simple chat‑style frontend
- Python
- FastAPI
- SQLAlchemy (async)
- PostgreSQL + pgvector
- asyncpg
- Gemini (google‑genai SDK)
- React (Vite)
- Fetch API
- Plain CSS
rag-app/
├── backend/
│ ├── app/
│ │ ├── api/ # Route definitions
│ │ ├── core/ # DB and config
│ │ ├── models/ # ORM models
│ │ ├── services/ # Chunking, embeddings, retrieval, LLM
│ │ └── main.py # FastAPI app entry
│ └── requirements.txt
│
├── frontend/
│ ├── src/
│ │ ├── api/ # Backend API calls
│ │ ├── components/ # UI components
│ │ └── styles.css
│ └── package.json
│
└── README.md
- Python 3.10+
- Node.js 18+
- PostgreSQL 14+
- pgvector extension enabled
- Gemini API key (free tier)
```bash
cd backend
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

Create a `.env` file or export variables:

```
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/rag_db
GEMINI_API_KEY=your_api_key_here
```

Ensure pgvector is enabled:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

Start the backend:

```bash
uvicorn app.main:app --reload
```

Backend runs at: http://localhost:8000
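As a minimal sketch, the backend might load these variables at startup with a small helper like the one below (the `get_settings` function is hypothetical, not part of the repository; only the variable names match the `.env` keys above):

```python
import os

def get_settings() -> dict:
    """Read required configuration from the environment, failing fast if missing."""
    settings = {}
    for key in ("DATABASE_URL", "GEMINI_API_KEY"):
        value = os.environ.get(key)
        if value is None:
            raise RuntimeError(f"Missing required environment variable: {key}")
        settings[key] = value
    return settings
```

Failing fast at startup surfaces misconfiguration immediately instead of at the first database call.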
```bash
cd frontend
npm install
npm run dev
```

Frontend runs at: http://localhost:5173
- Open the frontend in the browser
- Upload a PDF, TXT, or Markdown file
- Ask questions related to the document
- The system retrieves relevant content and generates an answer
- File uploaded via frontend
- Backend extracts text (PDF parser or raw text)
- Text is split into chunks
- Each chunk is embedded using Gemini embeddings
- Embeddings are stored in PostgreSQL (pgvector)
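The "simple chunking" step could be as small as a fixed-size splitter with overlap, along these lines (the chunk size and overlap values here are illustrative defaults, not the project's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks, overlapping neighbors slightly."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by less than chunk_size so adjacent chunks share context.
        start += chunk_size - overlap
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, which is why fixed-size chunking stays "predictable and easy to reason about."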
- User asks a question
- Question is embedded
- Vector similarity search retrieves relevant chunks
- Retrieved text is sent as context to the LLM
- LLM generates a grounded answer
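Conceptually, the vector similarity search ranks stored chunk embeddings by closeness to the question embedding. pgvector does this inside PostgreSQL; the pure-Python sketch below only illustrates the idea (function names are hypothetical):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Return indices of the k chunk vectors most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

In production the same ranking is expressed as an `ORDER BY embedding <=> :query` clause in SQL, so the database, not the application, does the heavy lifting.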
- Async SQLAlchemy + asyncpg: avoids blocking I/O and aligns with FastAPI’s async model
- pgvector: simple, production‑ready vector storage without extra infrastructure
- Simple chunking: predictable behavior, easy to reason about
- No chat history storage: keeps scope aligned with assessment requirements
- Minimal frontend: focuses on usability and clarity, not visual polish
The optional “Accuracy and Performance Considerations” section from the assessment was intentionally not implemented.
- CORS is enabled for local development
- No paid APIs or proprietary services are used
This project is provided for assessment and educational purposes.