A local-first Retrieval-Augmented Generation (RAG) application that lets you upload PDF documents and ask natural-language questions about them, powered by Groq, LangChain, and ChromaDB. Embeddings and the vector store run on your machine; answer generation calls the Groq API.
```
Your Question
      │
      ▼
Embed question into a vector
      │
      ▼
Search ChromaDB for similar chunks ←── Your PDFs (pre-indexed)
      │
      ▼
Build prompt: question + relevant chunks
      │
      ▼
Groq LLaMA 3 generates a grounded answer
      │
      ▼
Answer + source citations shown to you
```
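The flow above can be sketched end to end in plain Python. This is a stdlib-only illustration, not the app's actual code: the toy two-dimensional vectors stand in for real MiniLM embeddings, the linear scan stands in for ChromaDB's index, and the returned prompt is what would be handed to the Groq LLM.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(question_vec, index, top_k=2):
    """Rank stored chunks by similarity to the question vector (toy ChromaDB)."""
    ranked = sorted(index, key=lambda item: cosine(question_vec, item["vec"]), reverse=True)
    return ranked[:top_k]

def build_prompt(question, chunks):
    """Assemble the grounded prompt that would be sent to the LLM."""
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Toy index: in the real app these vectors come from all-MiniLM-L6-v2.
index = [
    {"vec": [1.0, 0.0], "text": "Invoices are due in 30 days.", "source": "policy.pdf p.2"},
    {"vec": [0.0, 1.0], "text": "Refunds require a receipt.", "source": "policy.pdf p.5"},
]

top = retrieve([0.9, 0.1], index, top_k=1)
print(build_prompt("When are invoices due?", top))
```

The key design point the sketch preserves: the model only sees retrieved chunks plus the question, which is what keeps answers grounded and citable.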
- Upload PDFs via a clean web UI — no terminal needed
- Automatic indexing — chunking, embedding, and storage triggered on upload
- Chat interface with persistent message history
- Source citations — every answer shows which file and page it came from
- Adjustable retrieval — control how many chunks are used per answer
| Component | Tool |
|---|---|
| Web UI | Streamlit |
| LLM (answer generation) | Groq — LLaMA 3.3 70B |
| Embeddings | HuggingFace all-MiniLM-L6-v2 |
| Vector database | ChromaDB |
| PDF loader | LangChain + PyPDF |
| Package manager | uv |
```
RagMind/
├── app.py            # Streamlit entry point — multi-page navigation
├── upload.py         # Page 1: Upload PDFs and trigger indexing
├── chat.py           # Page 2: Chat interface with source display
├── index.py          # Core indexing pipeline (load → chunk → embed → store)
├── query.py          # Core query pipeline (CLI version)
├── .env              # API keys and configuration
├── .gitignore        # Excludes chroma_db/, data/, .env
├── pyproject.toml    # uv project file with dependencies
├── data/             # Your uploaded PDF files (git ignored)
└── chroma_db/        # ChromaDB vector store (git ignored)
```
```
uv run streamlit run app.py
```

Open http://localhost:8501 in your browser.
Navigate to the Upload Docs page, drag and drop your PDF files, then click Save & Index Documents.
The app will:
- Save your PDFs to `./data/`
- Split them into overlapping chunks
- Generate embeddings locally (first run downloads ~90 MB model, then cached)
- Store everything in ChromaDB
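The "overlapping chunks" step (handled in the app by LangChain's text splitter) boils down to a sliding window: each chunk starts `chunk_size - overlap` characters after the previous one, so text cut at a boundary still appears whole in a neighboring chunk. A minimal stdlib-only sketch:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks whose tails repeat at the
    head of the next chunk, so no sentence is lost at a boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# 450 characters with size 200 / overlap 50 → windows at 0, 150, 300.
doc = "x" * 450
pieces = chunk_text(doc, chunk_size=200, overlap=50)
print([len(p) for p in pieces])  # → [200, 200, 150]
```

LangChain's `RecursiveCharacterTextSplitter` is smarter than this sketch (it prefers to break on paragraph and sentence boundaries), but the size/overlap trade-off is the same: larger overlap costs storage and embedding time, smaller overlap risks splitting an answer across two chunks.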
Navigate to the Ask Questions page and start chatting. Each answer includes:
- The generated response grounded in your documents
- Expandable Sources panel showing which file, page, and text excerpt was used
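The Sources panel is straightforward to assemble from the metadata stored with each chunk. A sketch of the idea (the `file`/`page`/`text` keys here are illustrative, not the app's exact schema):

```python
def format_sources(chunks, excerpt_len=80):
    """Render retrieved chunks as citation lines: file, page, short excerpt."""
    lines = []
    for c in chunks:
        excerpt = c["text"][:excerpt_len].rstrip()
        if len(c["text"]) > excerpt_len:
            excerpt += "..."
        lines.append(f'{c["file"]} (page {c["page"]}): "{excerpt}"')
    return "\n".join(lines)

chunks = [
    {"file": "report.pdf", "page": 3, "text": "Quarterly revenue grew 12% year over year."},
]
print(format_sources(chunks))
```

In the Streamlit app, each such line would sit inside an expander so citations stay out of the way until you want to verify an answer.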
| Option | Description |
|---|---|
| Show source chunks | Toggle source citations on/off |
| Chunks to retrieve | How many passages to pull per question (1–8) |
| Clear chat history | Reset the conversation |
If you prefer the terminal over the web UI:

```
# Index your PDFs first (add PDFs to ./data manually)
uv run python index.py

# Then ask questions interactively
uv run python query.py
```

All settings are in `.env`:
```
# Groq API
GROQ_API_KEY=gsk_...
GROQ_MODEL=llama-3.3-70b-versatile

# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2

# Retrieval
RETRIEVAL_TOP_K=4   # number of chunks retrieved per question
```

- LangChain — RAG framework
- Groq — ultra-fast LLM inference
- ChromaDB — local vector database
- Streamlit — web UI framework
- HuggingFace — open source embedding models