---
title: RAGnosis
emoji: 🧠
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 5.36.2
app_file: app.py
pinned: false
license: mit
short_description: Clinical Query Answering with RAG + MIMIC-IV Notes.
---

# 🧠 RAGnosis – Clinical Reasoning via Retrieval-Augmented Generation

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/)
[![Hugging Face](https://img.shields.io/badge/HuggingFace-RAGnosis-blue?logo=huggingface)](https://huggingface.co/spaces/asadsandhu/RAGnosis)
[![GitHub Repo](https://img.shields.io/badge/GitHub-asadsandhu/RAG--Diagnostic--Assistant-black?logo=github)](https://github.com/asadsandhu/RAG-Diagnostic-Assistant)

> ⚕️ A CPU-friendly RAG assistant for explainable clinical diagnosis, built using annotated notes from the [MIMIC-IV-Ext-DiReCT](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) dataset.

---

## 🖼️ Live Demo

Try it on **Hugging Face Spaces** (CPU-only, roughly 500 seconds per query) 👉

🔗 https://huggingface.co/spaces/asadsandhu/RAGnosis

<p align="center">
  <img src="assets/demo.png" alt="Demo" width="750">
</p>

---

## ⚙️ Technical Stack

| Component      | Details                                                                 |
|----------------|-------------------------------------------------------------------------|
| 🧠 Model        | [`BioMistral/BioMistral-7B`](https://huggingface.co/BioMistral/BioMistral-7B) |
| 🏥 Dataset      | [`MIMIC-IV-Ext-DiReCT`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) |
| 🔍 Retrieval    | FAISS + SentenceTransformers (`all-MiniLM-L6-v2`)                      |
| 💬 Interface    | Gradio (runs on Hugging Face Spaces)                                   |
| ⚙️ Backend      | PyTorch + Transformers (CPU inference)                                 |

---

## 🩺 Key Features

- 🔎 Top-k clinical note retrieval from structured diagnostic chains
- 🧠 Diagnosis generation using `BioMistral-7B` without a GPU
- 🗂️ Medical reasoning backed by MIMIC-IV-Ext-DiReCT
- 💬 Natural-language query input and interpretable LLM output
- ✅ Optimized for Hugging Face's free CPU tier (16 GB RAM)

---

## 🧪 Example Prompt

> *patient is experiencing shortness of breath*

💬 **Model response:**  
> Shortness of breath is a common symptom that can be caused by a variety of respiratory conditions. The differential diagnosis for shortness of breath includes asthma, chronic obstructive pulmonary disease (COPD), congestive heart failure, pneumonia, and pneumothorax. In order to determine the cause of the shortness of breath, it is important to consider the patient's medical history, physical examination findings, and diagnostic testing results. For example, if the patient has a history of asthma and is experiencing wheezing and a prolonged expiratory phase on examination, this would suggest asthma as the cause of the shortness of breath. On the other hand, if the patient has a history of congestive heart failure and is experiencing orthopnea, crackles on auscultation, and a history of edema, this would suggest congestive heart failure as the cause of the shortness of breath.

---

## 🔧 Under the Hood

### 🧠 Step 1: Retrieval
- Sentence embeddings (`all-MiniLM-L6-v2`)
- FAISS indexing → [`faiss_index.bin`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/faiss_index.bin)
- Source: [`retrieval_corpus.csv`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/retrieval_corpus.csv)
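
The retrieval step can be sketched roughly as follows. This is an illustration, not the code in `app.py`: the function names `retrieve` and `load_resources`, the `text` column name, and the assumption that CSV rows are stored in the same order as the index ids are all hypothetical. The encoder and index are passed in as arguments so the search logic can be tried without loading the real model.

```python
from typing import Callable, Sequence


def retrieve(query: str, encode: Callable, index, corpus: Sequence[str], k: int = 3) -> list[str]:
    """Return up to k corpus entries closest to the query.

    encode: maps a string to a (1, d) float32 array.
    index:  any object with a FAISS-style search(x, k) -> (distances, ids).
    """
    _distances, ids = index.search(encode(query), k)
    # FAISS pads the id list with -1 when fewer than k neighbours are found.
    return [corpus[i] for i in ids[0] if i != -1]


def load_resources(index_path="faiss_index.bin",
                   corpus_path="retrieval_corpus.csv",
                   text_column="text"):  # column name is an assumption
    """Load the encoder, index, and corpus (heavy imports kept local)."""
    import csv

    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    index = faiss.read_index(index_path)
    with open(corpus_path, newline="", encoding="utf-8") as f:
        # Assumption: one chunk per row, in the same order as the index ids.
        corpus = [row[text_column] for row in csv.DictReader(f)]

    def encode(query: str):
        return model.encode([query], convert_to_numpy=True).astype("float32")

    return encode, index, corpus
```

With the two repository files in the working directory, `retrieve("patient is experiencing shortness of breath", *load_resources())` would return the three nearest chunks, provided the column-name and row-order assumptions hold.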

### 🧱 Step 2: Prompt Construction
- Clinical query + retrieved chunks form an instruction prompt
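
As a minimal sketch of this step, the function below joins the retrieved chunks into a numbered context block and appends the query. The function name `build_prompt` and the template wording are hypothetical; the actual instruction text used by the app is not documented here.

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the clinical query into one prompt string."""
    # Number each chunk so the model's answer can be traced back to a source.
    context = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    # Hypothetical template: instruction, context, question, answer cue.
    return (
        "You are a clinical assistant. Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query.strip()}\n"
        "Answer:"
    )
```

Ending the prompt with `Answer:` is a common cue for base or lightly tuned models to continue with the response rather than restate the question.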

### 🧬 Step 3: Generation
- Uses `BioMistral/BioMistral-7B`
- Runs on CPU on the hosted Space (expect roughly 500 seconds per query)
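
A rough sketch of the generation step with the Transformers library is shown below. The function name `generate_answer`, the `max_new_tokens` default, and greedy decoding are assumptions for illustration. Memory-saving options such as reduced precision are omitted, and the settings used in `app.py` may differ.

```python
def generate_answer(prompt: str,
                    model_name: str = "BioMistral/BioMistral-7B",
                    max_new_tokens: int = 256) -> str:
    """Generate a completion for the prompt and return only the new text."""
    # Heavy imports are kept local so importing this module stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Greedy decoding (do_sample=False) keeps the output deterministic.
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens so only the generated answer is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

In a long-running app the tokenizer and model would normally be loaded once at start-up rather than on every call. They are loaded inside the function here only to keep the sketch self-contained.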

---

## ⚠️ Note on Performance

| Feature      | Hugging Face (CPU)        | Colab / Local (GPU)                  |
|--------------|----------------------------|--------------------------------------|
| Model Used   | `BioMistral-7B`            | `Nous-Hermes-2-Mistral-7B-DPO`       |
| Speed        | ~500 seconds/query 🐢      | ~10 seconds/query ⚡                 |
| Accuracy     | ✅ Good                     | ✅ Great                             |
| Requirements | 1 CPU (free tier)          | CUDA GPU (Colab/Kaggle/local)        |

> Want faster responses? Use the [GPU version on GitHub](https://github.com/asadsandhu/RAG-Diagnostic-Assistant) with `app.py`.
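
The table above pairs each environment with a different model. One way an app could make that choice at start-up is sketched below. The helper names are hypothetical, and the `NousResearch/` prefix on the GPU model id is an assumption based on the model name in the table. How `app.py` actually detects the GPU is not shown here.

```python
CPU_MODEL = "BioMistral/BioMistral-7B"
# Assumed Hub id for the GPU model named in the table.
GPU_MODEL = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"


def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device (False if torch is absent)."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_model(has_gpu: bool) -> tuple[str, str]:
    """Return (model_name, device) for the given hardware."""
    return (GPU_MODEL, "cuda") if has_gpu else (CPU_MODEL, "cpu")
```

Calling `pick_model(cuda_available())` then yields the model id and device string to pass to the loading code.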

---

## ⚙️ Run It Locally (GPU Recommended)

```bash
# 1. Clone the repository
git clone https://github.com/asadsandhu/RAG-Diagnostic-Assistant.git
cd RAG-Diagnostic-Assistant

# 2. Install dependencies
pip install -r requirements.txt

# 3. Run the app (auto-detects GPU)
python app.py
```

Required files (already in repo):

* ✅ [`retrieval_corpus.csv`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/retrieval_corpus.csv)
* ✅ [`faiss_index.bin`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/faiss_index.bin)

---

## 📁 Project Structure

```
RAG-Diagnostic-Assistant/
├── app.py
├── faiss_index.bin
├── retrieval_corpus.csv
├── requirements.txt
├── assets/
│   └── demo.png
└── README.md
```

---

## 👤 Author

Built with ❤️ by [Asad Ali](https://www.linkedin.com/in/asadsandhu0/)

* 🧠 Medium: [@asadsandhu](https://medium.com/@asadsandhu)
* 💻 GitHub: [@asadsandhu](https://github.com/asadsandhu)
* 🤗 Hugging Face: [@asadsandhu](https://huggingface.co/asadsandhu)

---

## 📜 License

MIT License. See [LICENSE](https://opensource.org/license/mit)

---

## 🙏 Acknowledgments

* MIMIC-IV-Ext-DiReCT – Annotated diagnostic dataset
* Hugging Face Transformers + Gradio
* SentenceTransformers + FAISS
* BioMistral + NousResearch

---

> ⚠️ **Disclaimer**: This is a research prototype and not intended for clinical decision-making.