Update README.md
README.md
CHANGED
@@ -10,120 +10,102 @@ pinned: false
 short_description: Research
 ---

-MedGenesis AI
-🐳 Docker-Based Deployment
-MedGenesis AI uses a Dockerfile for guaranteed reproducibility and robust dependency management.
-
-Run Locally:
 bash
-🧑‍💻 Author
-Oluwafemi Idiakhoa
-
-License
-MIT License
-
-❤️ Acknowledgments
-Hugging Face Spaces
-
-PubMed, arXiv, OpenFDA, UMLS
-
-OpenAI GPT-4o
-
-The open-source biomedical AI community
-If you're using Docker, all dependencies (including Jinja2, pyvis) are installed as specified in the Dockerfile and requirements.txt; no extra setup needed.
-If deploying with Hugging Face SDK mode, ensure your requirements.txt matches your dependencies.
 short_description: Research
 ---

+MedGenesis AI
+MedGenesis AI is a biomedical literature discovery workbench that unifies live data from PubMed, arXiv, MyGene.info, ClinicalTrials.gov v2, DisGeNET, openFDA, Open Targets, DrugCentral, UMLS and more, then lets you explore the evidence in a rich Streamlit interface powered by OpenAI or Gemini LLMs.
+
+
+┌────────────────────────────────────────────┐
+│  Streamlit UI (app.py)                     │
+│  • Results / Genes / Trials / Graph tabs   │
+│  • PDF / CSV export & follow-up Q&A        │
+└──────────────┬─────────────────────────────┘
+               │ async calls
+┌──────────────▼─────────────────────────────┐
+│  Orchestrator (mcp/orchestrator.py)        │
+│  • pulls PubMed, arXiv                     │
+│  • keyword extraction (spaCy)              │
+│  • fans-out to MyGene, CT.gov v2, UMLS…    │
+│  • merges & summarises with LLM            │
+└──────────────┬─────────────────────────────┘
+               │ helpers (mcp/*.py)
+               ▼
+┌────────────────────────────────────────────┐
+│  External APIs + local TSV (DrugBank)      │
+└────────────────────────────────────────────┘
+
+
+
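The fan-out and merge step in the diagram above can be sketched with `asyncio.gather`. This is a minimal illustration, not the code in mcp/orchestrator.py: the `fetch_*` coroutines, the `orchestrate` function, and the shape of the merged result are all hypothetical, and the stubs return canned data instead of calling real APIs.

```python
import asyncio

# Hypothetical stand-ins for the helpers under mcp/; the real helpers would
# call the external APIs. Each stub here just returns canned data.
async def fetch_pubmed(query: str) -> list[dict]:
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"source": "pubmed", "title": f"Paper about {query}"}]

async def fetch_arxiv(query: str) -> list[dict]:
    await asyncio.sleep(0)
    return [{"source": "arxiv", "title": f"Preprint about {query}"}]

async def fetch_trials(query: str) -> list[dict]:
    await asyncio.sleep(0)
    raise RuntimeError("upstream API unavailable")  # simulate a failing source

async def orchestrate(query: str) -> dict:
    """Fan out to every source concurrently and merge whatever succeeds."""
    sources = {"pubmed": fetch_pubmed, "arxiv": fetch_arxiv, "trials": fetch_trials}
    results = await asyncio.gather(
        *(fn(query) for fn in sources.values()), return_exceptions=True
    )
    merged = {"query": query, "records": [], "failed": []}
    for name, result in zip(sources, results):
        if isinstance(result, Exception):
            merged["failed"].append(name)  # keep going if one source is down
        else:
            merged["records"].extend(result)
    return merged

if __name__ == "__main__":
    print(asyncio.run(orchestrate("CRISPR glioblastoma therapy")))
```

Passing `return_exceptions=True` is one way to keep a single failing source from aborting the whole search; whether the project handles failures this way is an assumption.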
+Features
+| Domain | Source / API | What you get |
+| --- | --- | --- |
+| Literature | PubMed + arXiv | titles, abstracts, authors, year |
+| Gene info | MyGene.info + NCBI Gene | symbol, name, GO, ClinVar, MeSH definitions |
+| Trials | ClinicalTrials.gov v2 | NCT ID, phase, status, start date |
+| Disease-gene | DisGeNET | top associations & scores |
+| Drug safety | openFDA, DrugCentral | adverse events, approvals, MoA |
+| Graph edges | Open Targets GraphQL | gene-disease-drug links (+ OT score) |
+| Ontology | UMLS, HPO, Wikidata | concept CUI, phenotype look-ups |
+
+Quick start
 bash
+# clone the repo
+git clone https://github.com/your-org/medgenesis.git
+cd medgenesis
+
+# create a virtual environment, install dependencies, and run locally
+python -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+python -m spacy download en_core_web_sm
+streamlit run app.py
+The last command starts a Streamlit server for app.py on localhost:8501.
+Enter a biomedical question (e.g. "CRISPR glioblastoma therapy") and press Run Search.
+
+🐳 Docker / Hugging Face Space
+The included Dockerfile is CPU-only and downloads the spaCy model at build time:

+bash
+docker build -t medgenesis .
+docker run -p 7860:7860 -e OPENAI_API_KEY=sk-... medgenesis
+HF Spaces: push the repo, set the environment secrets below, and Spaces will pick up the Dockerfile.
+
+Environment variables
+| Variable | Description |
+| --- | --- |
+| OPENAI_API_KEY | OpenAI account key (GPT-4o, GPT-4o-mini, …) |
+| GEMINI_KEY | Google Generative AI key (Gemini 1.5 Flash) |
+| UMLS_KEY | UMLS Licensing key (ticket auth) |
+| DISGENET_KEY | DisGeNET Bearer token (optional) |
+| PUB_KEY | NCBI E-utils key (optional, boosts quota) |
+| BIO_KEY | NCBI E-utils key for Gene/MeSH (optional) |
+
+Set them in .env, your shell, or HF Secrets.
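As a rough illustration of how an app might read these variables at startup, here is a minimal sketch. The `load_settings` function is hypothetical, and the policy it applies (at least one LLM key required, everything else optional) is an assumption for the example, not something the README states.

```python
import os

# Variable names come from the table above; which ones are required is an
# assumption made for this sketch.
LLM_KEYS = ("OPENAI_API_KEY", "GEMINI_KEY")
OPTIONAL_KEYS = ("UMLS_KEY", "DISGENET_KEY", "PUB_KEY", "BIO_KEY")

def load_settings(env=None) -> dict:
    """Collect API keys from the environment, requiring at least one LLM key."""
    env = os.environ if env is None else env
    llm_keys = {k: env[k] for k in LLM_KEYS if env.get(k)}
    if not llm_keys:
        raise RuntimeError("Set OPENAI_API_KEY or GEMINI_KEY")
    # Missing or empty optional keys are normalised to None.
    optional = {k: env.get(k) or None for k in OPTIONAL_KEYS}
    return {**llm_keys, **optional}
```

Accepting an explicit `env` mapping keeps the function easy to exercise in tests without touching the real process environment.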
+
+Local data
+mcp/data/drugbank_open_structured_drug_links.tsv: DrugBank Open Data
+Download the file from the DrugBank Open-Data page and place it at that path.
+
+The file is lazy-loaded and cached; the app still works without it.
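One common way to get "lazy-loaded and cached" behaviour is `functools.lru_cache` around a loader that tolerates a missing file. The sketch below assumes that approach; the `load_drug_links` function is hypothetical and the project's actual loader may differ.

```python
import csv
from functools import lru_cache
from pathlib import Path

# Path taken from the README; the loader itself is a hypothetical sketch.
DRUGBANK_TSV = Path("mcp/data/drugbank_open_structured_drug_links.tsv")

@lru_cache(maxsize=None)
def load_drug_links(path: Path = DRUGBANK_TSV) -> tuple:
    """Read the TSV on first use; return an empty tuple if the file is absent."""
    if not path.exists():
        return ()  # the app keeps working without the optional file
    with path.open(newline="", encoding="utf-8") as fh:
        # One dict per row, keyed by the TSV header columns.
        return tuple(csv.DictReader(fh, delimiter="\t"))
```

Nothing is read until the first call, and later calls with the same path return the cached tuple. Note that the empty result is cached too, so if the file is added while the process is running, `load_drug_links.cache_clear()` is needed before it is picked up.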
+
+🧪 Tests
+bash
+pytest tests/
+Unit tests mock external APIs and verify parsing, caching and orchestrator merges.
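To show what mocking an external API in a unit test can look like, here is a small self-contained sketch using `unittest.mock`. The `fetch_trial_ids` helper, the example.org URL, and the payload shape are placeholders invented for the example; they are not taken from the project's tests/ directory.

```python
import json
from unittest.mock import MagicMock
from urllib.parse import quote
from urllib.request import urlopen

def fetch_trial_ids(term: str, opener=urlopen) -> list[str]:
    """Hypothetical helper: return NCT IDs from a JSON search endpoint."""
    # example.org and the payload shape are placeholders, not a real API.
    with opener(f"https://example.org/trials?q={quote(term)}") as resp:
        payload = json.load(resp)
    return [study["nct_id"] for study in payload["studies"]]

def test_fetch_trial_ids_parses_payload():
    # Fake response object standing in for what urlopen() would yield.
    response = MagicMock()
    response.read.return_value = b'{"studies": [{"nct_id": "NCT00000001"}]}'
    opener = MagicMock()
    opener.return_value.__enter__.return_value = response
    assert fetch_trial_ids("brain tumor", opener=opener) == ["NCT00000001"]
    opener.assert_called_once_with("https://example.org/trials?q=brain%20tumor")
```

Injecting the opener as a parameter avoids patching module globals, so the test never touches the network and pytest can collect it like any other test function.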

+🛠️ Contributing
+Fork & create a feature branch.

+Follow Conventional Commits for PR titles.

+Run pre-commit install to auto-format with black & ruff.

+Submit a PR; GitHub Actions will run lint + tests.

+License
+Apache 2.0, free for research and commercial use.
+API terms of each external provider still apply.

+Happy discovering!