---
title: UB Virtual Student Assistant (UB-VSA)
emoji: 🎓
colorFrom: pink
colorTo: blue
sdk: docker
pinned: false
short_description: AI assistant for University at Buffalo students.
---

# UB Virtual Student Assistant (UB-VSA)  
**RAG-powered chatbot for UB admissions, visa & campus life questions**


UB-VSA is a **retrieval-augmented conversational agent** that gives accurate, source-grounded answers about the University at Buffalo, with special focus on the needs of international students (F-1/OPT, I-20, course selection, housing, etc.).
It combines:

| Layer | Tech |
|-------|------|
| **Retriever** | FAISS dense vectors (`BAAI/bge-small-en-v1.5`) + keyword hybrid search |
| **Reranker** | `cross-encoder/ms-marco-MiniLM-L-6-v2` |
| **Fine-tuning** | LoRA / QLoRA adapters on `microsoft/phi-1_5` with RAFT |
| **API** | FastAPI (backend) |
| **UI** | Flask (+ vanilla JS) single-page chat |

---

## ✨ Features
* **Context-aware chat** – tracks the last few turns for coherent follow-ups.  
* **Grounded citations** – every answer lists clickable sources.  
* **Lightweight adapters** – fine-tuned with LoRA/QLoRA; base model stays frozen.  
* **One-click deploy** – Dockerfile + `start.sh` spin up scraper, vector-store, API & UI.  
* **Scalable vector DB** – FAISS index stored in `data/embeddings/`.  

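As a rough illustration of the context-aware chat and grounded-citation features, the sketch below keeps a fixed window of recent turns and numbers the retrieved passages so an answer can cite them. The `ChatHistory` class, the `max_turns` default, and the `text`/`url` passage fields are hypothetical names chosen for this example; the project's actual prompt format may differ.

```python
from collections import deque


class ChatHistory:
    """Keep only the most recent turns so follow-up questions stay in context."""

    def __init__(self, max_turns: int = 3):
        # max_turns is an assumed window size; deque discards the oldest turn itself.
        self.turns = deque(maxlen=max_turns)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def build_prompt(self, question: str, passages: list[dict]) -> str:
        """Combine recent turns, numbered sources, and the new question."""
        history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)
        # Number each passage so the answer can refer to it as [1], [2], ...
        sources = "\n".join(
            f"[{i}] {p['text']} ({p['url']})" for i, p in enumerate(passages, start=1)
        )
        return f"{history}\n\nSources:\n{sources}\n\nUser: {question}\nAssistant:"
```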
---

## 🖥️ Online Demo
> 👉 **Try it live:** https://huggingface.co/spaces/TeamSAS/UB_VSA

The demo runs on the free Hugging Face hardware tier, so the first response may take ~30 s while the container wakes up.

---

## 🏗️ Project Structure
```
buffalo_rag/
 ├── api/             # FastAPI routes & schemas
 ├── embeddings/      # Chunker + sentence‑transformer encoder
 ├── frontend/        # Flask templates & static JS/CSS
 ├── model/           # BuffaloRAG orchestration
 ├── scraper/         # Playwright / BeautifulSoup crawler
 └── vector_store/    # FAISS DB + hybrid search logic
data/
 └── embeddings/      # .pkl embeddings + faiss_index.pkl (generated)
Dockerfile
start.sh
main.py               # CLI pipeline entry‑point
```
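
To make the "hybrid search" idea in `vector_store/` concrete, here is a minimal pure-Python sketch that blends a dense cosine score with a keyword-overlap score. It is not the project's implementation: the real retriever uses FAISS and `BAAI/bge-small-en-v1.5` embeddings, whereas this example uses toy vectors, and the function names, the `alpha` weight, and the chunk fields (`text`, `vector`) are assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the chunk text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_search(query, query_vec, chunks, alpha=0.7, k=2):
    """Rank chunks by alpha * dense score + (1 - alpha) * keyword score."""
    scored = [
        (
            alpha * cosine(query_vec, c["vector"])
            + (1 - alpha) * keyword_score(query, c["text"]),
            c,
        )
        for c in chunks
    ]
    # Highest combined score first; keep the top k chunks.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]
```

In a fuller pipeline, the top-k chunks returned here would then be passed to the cross-encoder reranker before being handed to the language model.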

## Quick Start (Local)

```bash
git clone https://huggingface.co/spaces/TeamSAS/UB_VSA
cd UB_VSA

# 1. Python deps
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2. Hugging Face API token
export HUGGINGFACEHUB_API_TOKEN=hf_your_token_here

# 3. Build embeddings (only first time, ~5 min)
python main.py --build

# 4. Launch backend + frontend
python main.py --run
```
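
If a launch fails, a missing token is a likely first thing to rule out. The helper below is a hypothetical pre-flight check, not part of the repository; it only assumes that the token is read from the `HUGGINGFACEHUB_API_TOKEN` environment variable exported above.

```python
import os


def require_token(env=os.environ, name: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Return the token, or raise a clear error if it is missing or blank."""
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; export it before running main.py")
    return token
```

Calling `require_token()` before `python main.py --run` would surface a missing token immediately, rather than partway through startup.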