---
title: Personal ChatBot
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Krishna's Persona Chat Bot using Multi RAG network
---

# 🧠 Krishna's Personal AI Chatbot

A memory-grounded, retrieval-augmented AI assistant built with LangChain, FAISS, BM25, and Llama 3, personalized to Krishna Vamsi Dhulipalla’s career, projects, and technical profile.

> ⚡️ Ask me anything about Krishna: skills, experience, goals, or even what tools he used at Virginia Tech.

---

## 📌 Features

- ✅ **Hybrid Retrieval**: Combines dense vector search (FAISS) + keyword search (BM25) for precise, high-recall chunk selection
- 🤖 **LLM-Powered Pipelines**: Uses OpenAI GPT-4o and NVIDIA NIMs (e.g. LLaMA-3, Mixtral) for rewriting, validation, and final answer generation
- 🧠 **Memory Module**: Stores user preferences, recent topics, and inferred tone using a structured `KnowledgeBase` schema
- 🛠️ **Custom Architecture**:
  - Query → Rewriting → Hybrid Retriever → Scope Validator → LLM Answer
  - Fallback humor model (Mixtral) for out-of-scope queries
- 🧩 **Document Grounding**: Powered by Krishna’s actual markdown files like `profile.md`, `goals.md`, and `chatbot_architecture.md`
- 📊 **Enriched Vector Store**: Chunks include LLM-generated summaries and synthetic queries for better search performance
- 🎛️ **Gradio Frontend**: Responsive, markdown-formatted interface for natural, real-time interaction
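
The Memory Module bullet names a structured `KnowledgeBase` schema without showing it. As a minimal sketch of what such a record could look like, the dataclass below is an assumption: the field names (`user_preferences`, `recent_topics`, `inferred_tone`) and the `remember_topic` helper are hypothetical and only mirror the three kinds of state listed above, not the actual class in `app.py`.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical per-session memory record; field names are illustrative."""

    user_preferences: dict[str, str] = field(default_factory=dict)
    recent_topics: list[str] = field(default_factory=list)
    inferred_tone: str = "neutral"

    def remember_topic(self, topic: str, max_topics: int = 5) -> None:
        # Move a repeated topic to the end, then keep only the newest entries.
        if topic in self.recent_topics:
            self.recent_topics.remove(topic)
        self.recent_topics.append(topic)
        del self.recent_topics[:-max_topics]


memory = KnowledgeBase()
memory.user_preferences["answer_length"] = "short"
for topic in ["skills", "Virginia Tech", "skills"]:
    memory.remember_topic(topic)

print(memory.recent_topics)  # ['Virginia Tech', 'skills']
```

Keeping the memory in one typed record makes it easy to serialize into a prompt or update in a background task after each turn.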

---

## 🏗️ Architecture

```text
User Query
   ↓
[LLM1] → Rephrase into 3 diverse subqueries
   ↓
Hybrid Retrieval (BM25 + FAISS)
   ↓
[LLM2] → Classify: In-scope or Out-of-scope
   ↓
   ├─ In-scope → Top-k Chunks → GPT-4o
   └─ Out-of-scope → Mixtral (funny fallback)
   ↓
Final Answer + Async Memory Update
```
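
The control flow in the diagram can be sketched in plain Python. This is an illustration under stated assumptions, not the code in `app.py`: `answer_query` and every `stub_*` function are hypothetical stand-ins for the real LLM and retriever calls, `DOCS` is made-up sample text, and the asynchronous memory update is left out.

```python
from typing import Callable

# Made-up sample chunks, standing in for the enriched vector store.
DOCS = [
    "Krishna studied at Virginia Tech.",
    "Krishna works with Python and SQL.",
]


def answer_query(
    query: str,
    rewrite: Callable[[str], list[str]],
    retrieve: Callable[[list[str]], list[str]],
    is_in_scope: Callable[[str, list[str]], bool],
    answer: Callable[[str, list[str]], str],
    fallback: Callable[[str], str],
) -> str:
    """Rewrite, retrieve, classify, then route to the answer or fallback step."""
    subqueries = rewrite(query)
    chunks = retrieve(subqueries)
    if is_in_scope(query, chunks):
        return answer(query, chunks)
    return fallback(query)


def _words(text: str) -> set[str]:
    return {w.strip("?.,").lower() for w in text.split()}


def stub_rewrite(query: str) -> list[str]:
    # A real rewriter would ask an LLM for three diverse subqueries.
    return [query, query.lower(), query.rstrip("?")]


def stub_retrieve(subqueries: list[str]) -> list[str]:
    # Keyword overlap stands in for hybrid BM25 + FAISS retrieval.
    wanted = set().union(*(_words(s) for s in subqueries))
    return [doc for doc in DOCS if wanted & _words(doc)]


def run(query: str) -> str:
    return answer_query(
        query,
        rewrite=stub_rewrite,
        retrieve=stub_retrieve,
        is_in_scope=lambda q, chunks: bool(chunks),
        answer=lambda q, chunks: "Grounded in: " + chunks[0],
        fallback=lambda q: "That is outside what I know about Krishna.",
    )


print(run("Which Python tools?"))
print(run("What is the weather?"))
```

Passing each stage in as a callable keeps the routing logic testable without any API keys.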

---

## 📂 Project Structure

```
.
├── app.py                     # Main Gradio app and pipeline logic
├── Vector_storing.py          # Chunking, LLM-based enrichment, and FAISS store creation
├── requirements.txt           # Python package dependencies
├── faiss_store/               # Saved FAISS vector index
├── all_chunks.json            # JSON of enriched document chunks
├── personal_data/             # Source markdown files (currently excluded)
└── README.md
```

---

## 🧠 Knowledge Sources

All answers are grounded in curated markdown files:

| File Name                 | Description                                    |
| ------------------------- | ---------------------------------------------- |
| `profile.md`              | Krishna’s full technical profile and education |
| `goals.md`                | Short- and long-term personal goals            |
| `chatbot_architecture.md` | System-level breakdown of this AI assistant    |
| `personal_interests.md`   | Hobbies, cultural identity, food preferences   |
| `conversations.md`        | Sample queries and expected response tone      |
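
Before these files can be searched they have to be split into chunks. The sketch below shows one simple way to do that, splitting on markdown headings; it is an assumption, not the logic in `Vector_storing.py`. The function name `chunk_markdown`, the record keys, and the sample text are all hypothetical, and the `summary` and `synthetic_queries` fields are left empty where an LLM enrichment step would fill them in.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_markdown(text: str, source: str) -> list[dict]:
    """Split markdown into one chunk per heading, skipping empty sections."""
    sections: list[tuple[str, list[str]]] = []
    current: tuple[str, list[str]] = ("", [])
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append(current)
            current = (match.group(1).strip(), [])
        else:
            current[1].append(line)
    sections.append(current)

    chunks = []
    for heading, lines in sections:
        body = "\n".join(lines).strip()
        if body:
            chunks.append(
                {
                    "source": source,
                    "heading": heading,
                    "text": body,
                    "summary": None,          # placeholder for an LLM summary
                    "synthetic_queries": [],  # placeholder for generated queries
                }
            )
    return chunks


# Made-up sample text, not the contents of the real file.
SAMPLE = """# Title

## First section
Placeholder text for the first section.

## Second section
Placeholder text for the second section.
"""

chunks = chunk_markdown(SAMPLE, "goals.md")
for chunk in chunks:
    print(chunk["source"], "|", chunk["heading"], "|", chunk["text"])
```

This naive splitter treats any line starting with `#` as a heading, so a file with `#` comments inside fenced code blocks would need extra handling.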

---

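## 🧪 How It Works

1. **User input** is rewritten into three diverse subqueries (LLM1)
2. **Retriever** fetches relevant chunks for the subqueries using BM25 and FAISS
3. **Classifier LLM** (LLM2) decides whether the query is in scope, i.e. whether the results are relevant to Krishna
4. **GPT-4o** generates the final answer from the top-k chunks for in-scope queries; out-of-scope queries go to the Mixtral fallback instead
5. **Memory is updated** asynchronously after every turn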
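
Step 2 has to merge a BM25 ranking and a FAISS ranking into one list. This README does not say how the two are combined, so the sketch below uses reciprocal rank fusion (RRF) purely as an assumed example of a common merging technique. The function name, the chunk IDs, and the constant `k=60` are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60, top_k: int = 3
) -> list[str]:
    """Fuse several ranked lists of chunk IDs; higher fused score ranks first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk ID for determinism.
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ordered[:top_k]


bm25_ranking = ["c2", "c1", "c4"]   # keyword search results (made up)
faiss_ranking = ["c1", "c3", "c2"]  # dense search results (made up)

print(reciprocal_rank_fusion([bm25_ranking, faiss_ranking]))
# ['c1', 'c2', 'c3']
```

RRF only needs ranks, not raw scores, which sidesteps the fact that BM25 scores and vector distances are on different scales. With several subqueries, each subquery's rankings can be appended to the same `rankings` list.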

---

## 💬 Example Queries

- What programming languages does Krishna know?
- Tell me about Krishna’s chatbot architecture
- Can this chatbot explain Krishna's work at Virginia Tech?
- What tools has Krishna used for data engineering?

---

## 🚀 Setup & Usage

```bash
# 1. Clone the repo
git clone https://github.com/krishna-creator/krishna-personal-chatbot.git
cd krishna-personal-chatbot

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set your API keys (OpenAI, NVIDIA)
export OPENAI_API_KEY=...
export NVIDIA_API_KEY=...

# 4. Launch the chatbot
python app.py
```
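
A missing API key usually only surfaces when the first model call fails. As a small optional sketch, the check below reports which of the two environment variables from step 3 are unset. The `missing_keys` helper is hypothetical and is not part of `app.py`.

```python
import os
from collections.abc import Mapping
from typing import Optional

REQUIRED_KEYS = ("OPENAI_API_KEY", "NVIDIA_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
print("Missing keys:", ", ".join(missing) if missing else "none")
```

Running this before `python app.py` gives a clear message instead of an authentication error in the middle of a chat turn.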

---

## 🔮 Model Stack

| Purpose            | Model Name               | Provider |
| ------------------ | ------------------------ | -------- |
| Query Rewriting    | `phi-3-mini-4k-instruct` | NVIDIA   |
| Scope Classifier   | `llama-3-70b-instruct`   | NVIDIA   |
| Answer Generator   | `gpt-4o`                 | OpenAI   |
| Fallback Humor LLM | `mixtral-8x22b-instruct` | NVIDIA   |
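
One way to keep these assignments in a single place is a small lookup table in code. The sketch below mirrors the table above; the `ModelSpec` type, the dictionary keys, and the `model_for` helper are hypothetical names, and the model strings are copied from the table rather than verified against any provider's catalog.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    name: str
    provider: str


# Keys are illustrative role names; values follow the Model Stack table.
MODEL_STACK = {
    "query_rewriting": ModelSpec("phi-3-mini-4k-instruct", "NVIDIA"),
    "scope_classifier": ModelSpec("llama-3-70b-instruct", "NVIDIA"),
    "answer_generator": ModelSpec("gpt-4o", "OpenAI"),
    "fallback_humor": ModelSpec("mixtral-8x22b-instruct", "NVIDIA"),
}


def model_for(purpose: str) -> ModelSpec:
    """Look up the model for a pipeline role, with a readable error."""
    try:
        return MODEL_STACK[purpose]
    except KeyError:
        known = ", ".join(sorted(MODEL_STACK))
        raise ValueError(f"Unknown purpose {purpose!r}; expected one of: {known}") from None


spec = model_for("answer_generator")
print(spec.name, "via", spec.provider)  # gpt-4o via OpenAI
```

Swapping a model then means editing one entry instead of hunting for string literals across the pipeline.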

---

## 📌 Acknowledgments

- Built as part of Krishna's exploration into **LLM orchestration and agentic RAG**
- Inspired by LangChain, SentenceTransformers, and the NVIDIA RAG Agents Course

---

## 📜 License

MIT License © Krishna Vamsi Dhulipalla