A production-ready Retrieval-Augmented Generation (RAG) medical chatbot that answers clinical questions by retrieving relevant context from a curated medical knowledge base, powered by Google Gemini and Pinecone vector search.
This project is a full-stack AI medical chatbot built using state-of-the-art LLM and vector database technologies. It ingests medical reference PDFs, indexes them into a Pinecone vector store, and uses Google Gemini 2.5 Flash to answer clinical questions with context-grounded, hallucination-resistant responses.
The system follows a clean RAG (Retrieval-Augmented Generation) architecture:
- Ingest: Medical PDFs are chunked, embedded, and stored in Pinecone
- Retrieve: User queries are matched against stored vectors using cosine similarity
- Generate: Gemini LLM synthesizes a concise, grounded answer using the retrieved context
- Serve: A Flask REST API and HTML chat UI provide a complete user interface
Built as a portfolio project to demonstrate expertise in LangChain, LLMs, vector databases, and production-grade Python web applications.
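The Ingest → Retrieve → Generate → Serve loop above can be sketched end-to-end with stand-in components. This is a toy illustration only: the bag-of-words "embedding" and stub LLM below replace the real all-MiniLM-L6-v2 model, Pinecone index, and Gemini call.

```python
from collections import Counter
import math

# Toy embedding: bag-of-words counts (stand-in for all-MiniLM-L6-v2).
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Ingest: chunks stored alongside their embeddings (stand-in for Pinecone).
chunks = [
    "Aspirin inhibits platelet aggregation.",
    "Metformin lowers hepatic glucose production.",
]
index = [(embed(c), c) for c in chunks]

# Retrieve: top-k chunks by cosine similarity (the real system uses k=3).
def retrieve(query: str, k: int = 3):
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Generate: a stub LLM that only answers from the retrieved context.
def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Based on the context:\n{context}"

print(answer("How does metformin work?"))
```

The real pipeline swaps each stub for a production component but keeps exactly this shape: embed the query, rank stored chunks, and feed the winners to the LLM as grounding context.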
```
Medical PDFs ──▶ load_pdf_file()           ← LangChain DirectoryLoader + PyPDFLoader
                       │
              filter_to_minimal_docs()     ← strips metadata noise
                       │
              text_split()                 ← RecursiveCharacterTextSplitter (500 tokens, 20 overlap)
                       │
              download_embedding()         ← sentence-transformers/all-MiniLM-L6-v2 (384d)
                       │
              PineconeVectorStore          ← stores + indexes embeddings
                       │
User Query ──▶ similarity_search (k=3) ──▶ create_retrieval_chain()
                       │
              ChatGoogleGenerativeAI       ← Gemini 2.5 Flash (temp=0.3)
                       │
              Flask API /chat              ← JSON response
                       │
              chat.html UI                 ← real-time chat interface
```
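The `text_split()` step can be approximated with a simple sliding-window chunker. This is only an illustration of the 500-token-window / 20-token-overlap idea; the actual `RecursiveCharacterTextSplitter` splits recursively on separators rather than at fixed offsets.

```python
def split_with_overlap(tokens, chunk_size=500, overlap=20):
    """Yield fixed-size windows that share `overlap` tokens with the previous one."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = [f"tok{i}" for i in range(1040)]
chunks = split_with_overlap(tokens)
# Adjacent chunks share their last/first 20 tokens, so a sentence
# straddling a boundary is never lost from both chunks.
```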
- PDF Knowledge Ingestion: bulk-load any medical reference PDFs into Pinecone
- Semantic Search: cosine-similarity retrieval (k=3 most relevant chunks)
- Gemini 2.5 Flash LLM: fast, accurate, grounded answers with source context
- Hallucination Guard: explicitly says "I don't know" when context is insufficient
- Flask REST API: lightweight `/chat` endpoint for easy integration
- Chat UI: clean HTML/JS chat interface
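The hallucination guard is prompt engineering: the system prompt (kept in `src/prompt.py`) tells the model to refuse when the retrieved context does not contain the answer. The wording below is a hypothetical reconstruction, not the repo's exact prompt:

```python
# Hypothetical sketch of the system prompt; the real text lives in
# src/prompt.py and may be worded differently.
SYSTEM_PROMPT = (
    "You are a medical assistant for question-answering tasks. "
    "Use only the retrieved context below to answer. "
    "If the answer is not in the context, say \"I don't know\". "
    "Keep the answer concise.\n\n"
    "Context:\n{context}"
)

def build_prompt(context_chunks):
    # Join the k retrieved chunks into the {context} slot.
    context = "\n\n".join(context_chunks) if context_chunks else "(no context retrieved)"
    return SYSTEM_PROMPT.format(context=context)
```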
```bash
git clone https://github.com/Ashwin14101/Medical-Chatbot-With-LLMs-LangChain-Pinecone-Flask-AWS.git
cd Medical-Chatbot-With-LLMs-LangChain-Pinecone-Flask-AWS
pip install -r requirement.txt
cp .env.example .env
```

Edit `.env`:

```
PINECONE_API_KEY=your_pinecone_api_key
GOOGLE_API_KEY=your_google_api_key
```

Get your keys:

- Pinecone: https://app.pinecone.io
- Google AI Studio: https://aistudio.google.com/app/apikey
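Assuming the app loads these keys with a dotenv-style helper (a common pattern; the repo's exact loading code may differ), each `KEY=value` line in `.env` simply becomes a process environment variable:

```python
import os

# Sketch of what a dotenv loader does (not python-dotenv itself):
# each non-comment KEY=value line becomes an environment variable.
def load_env_lines(lines):
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

load_env_lines([
    "PINECONE_API_KEY=your_pinecone_api_key",
    "GOOGLE_API_KEY=your_google_api_key",
])
```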
Place your PDF files in the `data/` directory.

```bash
python store_index.py
```

This will:

- Load and chunk your PDFs
- Generate embeddings with `all-MiniLM-L6-v2`
- Create a Pinecone index named `medical-bot`
- Upsert all vectors

```bash
python app.py
```

Open http://localhost:8080 in your browser.
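The `/chat` endpoint boils down to: parse the user's message from the request, run it through the RAG chain, and return JSON. A framework-free sketch of that handler logic (the chain is a stub here, and the `msg`/`answer` field names are assumptions about the repo's API):

```python
import json

def rag_chain_stub(question: str) -> str:
    # Stand-in for the LangChain retrieval chain + Gemini call.
    return f"(grounded answer to: {question})"

def handle_chat(request_body: str) -> str:
    """Mimic the Flask /chat view: JSON request in, JSON response out."""
    payload = json.loads(request_body)
    question = payload.get("msg", "").strip()
    if not question:
        return json.dumps({"error": "empty message"})
    return json.dumps({"answer": rag_chain_stub(question)})
```

In the real app the same logic sits inside a Flask route, with `request.get_json()` supplying the payload and `jsonify()` building the response.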
```
Medical-Chatbot-With-LLMs-LangChain-Pinecone-Flask-AWS/
├── app.py              # Original Flask app (basic version)
├── app1.py             # Updated Flask app with JSON API + improved comments
├── store_index.py      # One-time PDF ingestion + Pinecone indexing script
├── src/
│   ├── helper.py       # PDF loader, text splitter, embedding model
│   └── prompt.py       # System prompt for the medical assistant
├── templates/
│   └── chat.html       # Frontend chat UI
├── Static/             # CSS / JS assets
├── data/               # Place your medical PDFs here (gitignored)
├── research/
│   └── trials.ipynb    # Jupyter notebook for experimentation
├── requirement.txt     # Python dependencies
├── setup.py            # Package setup
├── template.sh         # Project scaffolding script
├── .env.example        # Environment variable template
└── README.md
```
| Layer | Technology |
|---|---|
| Backend | Flask 3.x |
| LLM | Google Gemini 2.5 Flash via `langchain-google-genai` |
| Vector DB | Pinecone (Serverless, AWS us-east-1, cosine, dim=384) |
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` (HuggingFace) |
| RAG Framework | LangChain (`create_retrieval_chain` + `create_stuff_documents_chain`) |
| PDF Parsing | LangChain `PyPDFLoader` + `DirectoryLoader` |
| Parameter | Value | Notes |
|---|---|---|
| Chunk size | 500 tokens | Balances context vs. retrieval precision |
| Chunk overlap | 20 tokens | Prevents information loss at boundaries |
| Retrieval k | 3 | Top-3 most similar chunks per query |
| LLM temperature | 0.3 | Low temperature for factual, deterministic answers |
| Embedding dim | 384 | Matches all-MiniLM-L6-v2 output size |
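These settings could be gathered into one config object; the bundle below is illustrative (the repo may hard-code the values instead). The cosine helper shows the similarity metric the Pinecone index is configured with, and why the index dimension must match the embedding model's output size:

```python
import math

# Illustrative config bundle; the repo may hard-code these values instead.
RAG_CONFIG = {
    "chunk_size": 500,
    "chunk_overlap": 20,
    "retrieval_k": 3,
    "llm_temperature": 0.3,
    "embedding_dim": 384,  # must match all-MiniLM-L6-v2's output size
}

def cosine_similarity(a, b):
    """Cosine similarity, the metric the Pinecone index uses."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A vector of any other length would be rejected by the index, so the
# embedding model and the index dimension have to agree.
v = [0.1] * RAG_CONFIG["embedding_dim"]
```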
Pull requests are welcome! Please open an issue first for major changes.
Ashwin Kotha
- GitHub: @Ashwin14101
- Project: Medical-Chatbot-With-LLMs-LangChain-Pinecone-Flask-AWS
MIT © Ashwin14101