asadsandhu committed on
Commit ecf4549 · 1 Parent(s): 41a9c31

Finalized.

Files changed (3)
  1. README.md +34 -34
  2. assets/pp.py +0 -0
  3. requirements.txt +1 -3
README.md CHANGED
@@ -1,24 +1,24 @@
1
  ---
2
- title: RAGnosis
3
- emoji: πŸ‘
4
- colorFrom: red
5
- colorTo: indigo
6
- sdk: gradio
7
- sdk_version: 5.35.0
8
- app_file: app.py
9
  pinned: false
10
- license: mit
11
- short_description: Clinical Query Answering with RAG + MIMIC-IV Notes.
12
  ---
13
 
14
- # 🩺 RAGnosis – Clinical Reasoning via Retrieval-Augmented Generation
15
 
16
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
17
  [![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/)
18
  [![Hugging Face](https://img.shields.io/badge/HuggingFace-RAGnosis-blue?logo=huggingface)](https://huggingface.co/spaces/asadsandhu/RAGnosis)
19
  [![GitHub Repo](https://img.shields.io/badge/GitHub-asadsandhu/RAG--Diagnostic--Assistant-black?logo=github)](https://github.com/asadsandhu/RAG-Diagnostic-Assistant)
20
 
21
- > ⚕️ A fully offline-capable, Gradio-powered RAG assistant trained on **annotated clinical notes** from the [MIMIC-IV-Ext-DiReCT](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) dataset to perform explainable diagnostic reasoning.
22
 
23
  ---
24
 
@@ -37,21 +37,21 @@ Try it live on **Hugging Face Spaces** 👉
37
 
38
  | Layer | Details |
39
  |--------------|-------------------------------------------------------------------------|
40
- | 🧠 Model | [`Nous-Hermes-2-Mistral-7B-DPO`](https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO) |
41
  | 🏥 Dataset | [`MIMIC-IV-Ext-DiReCT`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) |
42
  | 🔍 Retriever | FAISS + SentenceTransformers (`all-MiniLM-L6-v2`) |
43
  | 💻 Frontend | Gradio (Hugging Face Spaces) |
44
- | 🧠 Backend | PyTorch + Transformers + BitsAndBytes |
45
 
46
  ---
47
 
48
  ## 🚀 Features
49
 
50
- - 🔎 Top-k document retrieval from real annotated clinical notes
51
- - 📋 Reasoning based on structured diagnostic chains
52
- - 🧠 GPT-style generation from LLM (Mistral 7B) without internet dependency
53
- - 🧾 Clean Gradio interface for natural medical queries
54
- - 🧠 Answers explained like a clinical reasoning expert
55
 
56
  ---
57
 
@@ -59,25 +59,24 @@ Try it live on **Hugging Face Spaces** πŸ‘‰
59
 
60
  > *Patient presents with fatigue, orthopnea, and lower extremity edema.*
61
 
62
- 💬 **Model response:**
63
- > Based on the patient's symptoms and context, the most likely diagnosis is **congestive heart failure (CHF)**...
64
 
65
  ---
66
 
67
  ## 🛠 How It Works
68
 
69
- ### ✅ Step 1: Preprocessing
70
- - Extract chains from `samples/` and `diagnostic_kg/`
71
- - Build retrievable clinical observations + diagnoses
 
72
 
73
- ### ✅ Step 2: Retrieval (FAISS)
74
- - Embed notes using `MiniLM-L6-v2`
75
- - Save as FAISS index → [`faiss_index.bin`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/faiss_index.bin)
76
- - Paired with → [`retrieval_corpus.csv`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/retrieval_corpus.csv)
77
 
78
- ### ✅ Step 3: Generation
79
- - Format prompt in `[INST]` syntax
80
- - Generate diagnosis using `Nous-Hermes-2-Mistral-7B-DPO`
81
 
82
  ---
83
 
@@ -133,11 +132,12 @@ This project is under the [MIT License](LICENSE).
133
 
134
  ## 🙏 Acknowledgments
135
 
136
- * MIMIC-IV-Ext-DiReCT: Annotated diagnostic data
137
- * Hugging Face Transformers + Gradio
138
  * Facebook Research – FAISS
139
- * Nous Research – Instruction-tuned Mistral model
 
140
 
141
  ---
142
 
143
- > ⚠️ *Disclaimer: This project is for research/demo use only. Not intended for clinical decision-making.*
 
1
  ---
2
+ title: "RAGnosis"
3
+ emoji: "🧠"
4
+ colorFrom: "red"
5
+ colorTo: "indigo"
6
+ sdk: "gradio"
7
+ sdk_version: "5.35.0"
8
+ app_file: "app.py"
9
  pinned: false
10
+ license: "mit"
11
+ short_description: "Clinical Query Answering with Retrieval-Augmented Generation (RAG) and MIMIC-IV Notes."
12
  ---
13
 
14
+ # 🧠 RAGnosis – Clinical Reasoning via Retrieval-Augmented Generation
15
 
16
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
17
  [![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)](https://www.python.org/)
18
  [![Hugging Face](https://img.shields.io/badge/HuggingFace-RAGnosis-blue?logo=huggingface)](https://huggingface.co/spaces/asadsandhu/RAGnosis)
19
  [![GitHub Repo](https://img.shields.io/badge/GitHub-asadsandhu/RAG--Diagnostic--Assistant-black?logo=github)](https://github.com/asadsandhu/RAG-Diagnostic-Assistant)
20
 
21
+ > ⚕️ A CPU-ready, Gradio-powered RAG assistant for explainable **clinical diagnosis** using annotated notes from the [MIMIC-IV-Ext-DiReCT](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) dataset.
22
 
23
  ---
24
 
 
37
 
38
  | Layer | Details |
39
  |--------------|-------------------------------------------------------------------------|
40
+ | 🧠 Model | [`BioMistral/BioMistral-7B`](https://huggingface.co/BioMistral/BioMistral-7B) |
41
  | 🏥 Dataset | [`MIMIC-IV-Ext-DiReCT`](https://github.com/asadsandhu/RAG-Diagnostic-Assistant/blob/main/mimic-iv-ext-direct-1.0.0.zip) |
42
  | 🔍 Retriever | FAISS + SentenceTransformers (`all-MiniLM-L6-v2`) |
43
  | 💻 Frontend | Gradio (Hugging Face Spaces) |
44
+ | 🧠 Backend | PyTorch + Transformers (no quantization) |
45
 
46
  ---
47
 
48
  ## 🚀 Features
49
 
50
+ - 🔍 Top-k retrieval from real clinical notes and diagnostic pathways
51
+ - 📋 Structured reasoning with evidence from retrieved facts
52
+ - 🧠 Generation powered by domain-specific BioMistral-7B LLM
53
+ - 💬 Natural question answering with clear clinical explanations
54
+ - ⚙️ Hugging Face Spaces-friendly: runs on CPU within 16GB RAM
55
 
56
  ---
57
 
 
59
 
60
  > *Patient presents with fatigue, orthopnea, and lower extremity edema.*
61
 
62
+ 💬 **Model response:**
63
+ > Based on the patient's symptoms and retrieved clinical facts, the most likely diagnosis is **congestive heart failure (CHF)**...
64
 
65
  ---
66
 
67
  ## 🛠 How It Works
68
 
69
+ ### ✅ Step 1: Retrieval (FAISS)
70
+ - Sentence embeddings generated using `all-MiniLM-L6-v2`
71
+ - Indexed with FAISS (`faiss_index.bin`)
72
+ - Source corpus: `retrieval_corpus.csv`
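The retrieval step listed above amounts to a nearest-neighbour search over sentence embeddings. The following is a minimal sketch of that idea and is not taken from the repository: toy 3-dimensional vectors stand in for `all-MiniLM-L6-v2` embeddings, a brute-force cosine-similarity scan stands in for the FAISS index lookup, and the names `cosine`, `top_k`, and the sample corpus are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, corpus_vecs, k=5):
    """Return the indices of the k corpus vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Placeholder corpus and hand-written vectors (assumptions for illustration only;
# a real run would embed each row of retrieval_corpus.csv with a sentence encoder).
corpus = [
    "placeholder note about cardiac symptoms",
    "placeholder note about respiratory symptoms",
    "placeholder note about endocrine symptoms",
]
vectors = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]

query_vector = [0.8, 0.2, 0.1]
hits = top_k(query_vector, vectors, k=2)
retrieved = [corpus[i] for i in hits]
```

An exhaustive scan like this is only practical for tiny corpora; a prebuilt index such as the `faiss_index.bin` mentioned in the README avoids re-embedding the corpus on every start.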
73
 
74
+ ### ✅ Step 2: Prompt Construction
75
+ - Query + top-5 chunks formatted into a clinical instruction prompt
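The README states only that the query and the top-5 chunks are formatted into a clinical instruction prompt; it does not show the template. The sketch below illustrates one possible shape under that assumption. The function name `build_prompt`, the instruction wording, and the placeholder chunks are all hypothetical and may differ from what `app.py` actually does.

```python
def build_prompt(query, chunks, k=5):
    """Combine a clinical query with at most k retrieved chunks into one prompt string."""
    # Number the retrieved chunks so the model can refer to them as evidence.
    context = "\n".join(
        f"{i}. {chunk.strip()}" for i, chunk in enumerate(chunks[:k], start=1)
    )
    # Hypothetical instruction wording; the real template is not shown in the diff.
    return (
        "You are a clinical reasoning assistant.\n"
        "Use only the retrieved facts below to answer.\n\n"
        f"Retrieved facts:\n{context}\n\n"
        f"Question: {query.strip()}\n"
        "Answer with the most likely diagnosis and a brief explanation:"
    )


# Placeholder chunks (assumption); in the app these would come from the retriever.
chunks = [f"placeholder fact {n}" for n in range(1, 8)]
prompt = build_prompt(
    "Patient presents with fatigue, orthopnea, and lower extremity edema.",
    chunks,
)
```

Capping the context at `k` chunks keeps the prompt length bounded, which matters when generation runs on CPU.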
 
 
76
 
77
+ ### ✅ Step 3: Generation (LLM)
78
+ - Prompt fed to `BioMistral/BioMistral-7B`
79
+ - Diagnosis + explanation generated using `generate()` (no GPU needed)
80
 
81
  ---
82
 
 
132
 
133
  ## 🙏 Acknowledgments
134
 
135
+ * MIMIC-IV-Ext-DiReCT: Annotated diagnostic corpus
136
+ * Hugging Face Transformers & SentenceTransformers
137
  * Facebook Research – FAISS
138
+ * Gradio for UI
139
+ * BioMistral for domain-aligned LLM
140
 
141
  ---
142
 
143
+ > ⚠️ *Disclaimer: This project is for academic demonstration only. It is not approved for clinical use.*
assets/pp.py ADDED
File without changes
requirements.txt CHANGED
@@ -4,6 +4,4 @@ faiss-cpu
4
  torch
5
  gradio
6
  accelerate
7
- sentencepiece
8
- bitsandbytes
9
- blobfile
 
4
  torch
5
  gradio
6
  accelerate
7
+ sentencepiece