---
title: MCP Res
emoji: π
colorFrom: red
colorTo: gray
sdk: docker
sdk_version: 1.46.0
app_file: app.py
pinned: false
short_description: Research
---
MedGenesis AI
MedGenesis AI is a biomedical literature discovery workbench that unifies live data from PubMed, arXiv, MyGene.info, ClinicalTrials.gov v2, DisGeNET, openFDA, Open Targets, DrugCentral, UMLS and more, then lets you explore the evidence in a rich Streamlit interface powered by OpenAI or Gemini LLMs.
┌────────────────────────────────────────────┐
│  Streamlit UI (app.py)                     │
│  • Results / Genes / Trials / Graph tabs   │
│  • PDF / CSV export & follow-up Q&A        │
└──────────────┬─────────────────────────────┘
               │ async calls
┌──────────────▼─────────────────────────────┐
│  Orchestrator (mcp/orchestrator.py)        │
│  • pulls PubMed, arXiv                     │
│  • keyword extraction (spaCy)              │
│  • fans-out to MyGene, CT.gov v2, UMLS…    │
│  • merges & summarises with LLM            │
└──────────────┬─────────────────────────────┘
               │ helpers (mcp/*.py)
               ▼
┌────────────────────────────────────────────┐
│  External APIs + local TSV (DrugBank)      │
└────────────────────────────────────────────┘
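The orchestrator flow in the diagram could look roughly like the sketch below. It is not the code in mcp/orchestrator.py: the helper names (`fetch_pubmed`, `fetch_arxiv`, `fetch_gene`, `extract_keywords`) are hypothetical stand-ins that return canned data, and the spaCy step is replaced by a trivial word split so the example runs offline. The LLM summary step is omitted.

```python
import asyncio

# Hypothetical stand-ins for the helpers in mcp/*.py. The real helpers call
# external APIs and may use different names, arguments and return shapes.
async def fetch_pubmed(query: str) -> list[dict]:
    await asyncio.sleep(0)  # placeholder for a network round trip
    return [{"source": "pubmed", "title": query}]

async def fetch_arxiv(query: str) -> list[dict]:
    await asyncio.sleep(0)
    return [{"source": "arxiv", "title": query}]

async def fetch_gene(keyword: str) -> dict:
    await asyncio.sleep(0)
    return {"keyword": keyword, "symbol": keyword.upper()}

def extract_keywords(papers: list[dict]) -> list[str]:
    # Stand-in for the spaCy step: unique title words, sorted for stability.
    return sorted({word for paper in papers for word in paper["title"].split()})

async def orchestrate(query: str) -> dict:
    # Step 1: pull both literature sources concurrently.
    pubmed, arxiv = await asyncio.gather(fetch_pubmed(query), fetch_arxiv(query))
    papers = pubmed + arxiv
    # Step 2: derive keywords, then fan out one lookup per keyword.
    keywords = extract_keywords(papers)
    results = await asyncio.gather(
        *(fetch_gene(k) for k in keywords), return_exceptions=True
    )
    # Step 3: merge, dropping lookups that raised instead of failing the run.
    genes = [r for r in results if not isinstance(r, Exception)]
    return {"papers": papers, "keywords": keywords, "genes": genes}

if __name__ == "__main__":
    print(asyncio.run(orchestrate("CRISPR glioblastoma")))
```

Passing `return_exceptions=True` to the fan-out is one way to keep a single failing provider from discarding the other results; whether the project handles failures this way is an assumption.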
Features

| Domain | Source / API | What you get |
| --- | --- | --- |
| Literature | PubMed + arXiv | titles, abstracts, authors, year |
| Gene info | MyGene.info + NCBI | Gene symbol, name, GO, ClinVar, MeSH definitions |
| Trials | ClinicalTrials.gov v2 | NCT ID, phase, status, start date |
| Disease → gene | DisGeNET | top associations & scores |
| Drug safety | openFDA, DrugCentral | adverse events, approvals, MoA |
| Graph edges | Open Targets GraphQL | gene–disease–drug links (+ OT score) |
| Ontology | UMLS, HPO, Wikidata | concept CUI, phenotype look-ups |
Quick start
# clone repo
git clone https://github.com/your-org/medgenesis.git
cd medgenesis
# set up a virtual environment & run locally
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm
streamlit run app.py
app.py starts a Streamlit server on localhost:8501.
Enter a biomedical question (e.g. "CRISPR glioblastoma therapy") and press Run Search.
Docker / Hugging Face Space
The included Dockerfile is CPU-only and downloads the spaCy model at build time:
docker build -t medgenesis .
docker run -p 7860:7860 -e OPENAI_API_KEY=sk-... medgenesis
HF Spaces: push the repo, set the environment secrets below, and Spaces will pick up the Dockerfile.
Environment variables

| Variable | Description |
| --- | --- |
| OPENAI_API_KEY | OpenAI account key (GPT-4o, GPT-4o-mini, …) |
| GEMINI_KEY | Google Generative AI key (Gemini 1.5 Flash) |
| UMLS_KEY | UMLS licensing key (ticket auth) |
| DISGENET_KEY | DisGeNET Bearer token (optional) |
| PUB_KEY | NCBI E-utils key (optional, boosts quota) |
| BIO_KEY | NCBI E-utils key for Gene/MeSH (optional) |
Set them in .env, your shell, or HF Secrets.
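As an illustration of how these variables might be read at startup, here is a minimal sketch. The `load_settings` function is hypothetical and not taken from the project, and the rule that at least one of the two LLM keys must be present is an assumption; the README only marks `DISGENET_KEY`, `PUB_KEY` and `BIO_KEY` as optional.

```python
import os
from typing import Mapping, Optional

LLM_KEYS = ("OPENAI_API_KEY", "GEMINI_KEY")
OTHER_KEYS = ("UMLS_KEY", "DISGENET_KEY", "PUB_KEY", "BIO_KEY")

def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical loader: collect the documented variables into a dict."""
    env = os.environ if env is None else env
    # Missing or empty variables become None so callers can skip that provider.
    settings = {name: env.get(name) or None for name in LLM_KEYS + OTHER_KEYS}
    # Assumption: the app needs at least one LLM backend to summarise results.
    if not any(settings[name] for name in LLM_KEYS):
        raise RuntimeError("Set OPENAI_API_KEY or GEMINI_KEY")
    return settings

if __name__ == "__main__":
    configured = load_settings({"OPENAI_API_KEY": "sk-test"})
    print(sorted(name for name, value in configured.items() if value))
```

Accepting the environment as a parameter keeps the function easy to test without touching the real process environment.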
Local data
mcp/data/drugbank_open_structured_drug_links.tsv: DrugBank Open Data
Download from the DrugBank Open-Data page and place it here.
The file is lazy-loaded and cached; the app still works without it.
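One way to get "lazy-loaded, cached, and optional" behaviour is sketched below. This is an assumption about the approach, not the project's code: `load_drug_links` is a hypothetical name, and no column names are assumed because the TSV header is not described here.

```python
import csv
from functools import lru_cache
from pathlib import Path

# Path taken from the README; everything else in this sketch is illustrative.
DRUGBANK_TSV = "mcp/data/drugbank_open_structured_drug_links.tsv"

@lru_cache(maxsize=None)
def load_drug_links(path: str = DRUGBANK_TSV) -> tuple:
    """Read the TSV on first use; later calls with the same path hit the cache."""
    file = Path(path)
    if not file.exists():
        # Missing file: return no rows so the rest of the app keeps working.
        return ()
    with file.open(newline="", encoding="utf-8") as handle:
        # Each row becomes a dict keyed by whatever header the file provides.
        return tuple(csv.DictReader(handle, delimiter="\t"))

if __name__ == "__main__":
    rows = load_drug_links()
    print(f"{len(rows)} DrugBank rows loaded")
```

Nothing is read at import time, so startup cost is only paid when a drug lookup is first requested.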
Tests
pytest tests/
Unit tests mock external APIs and verify parsing, caching and orchestrator merges.
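For readers new to this style of test, the sketch below shows the general pattern of mocking an external API with the standard library. The `fetch_gene_symbol` helper is hypothetical and is not one of the project's helpers; the MyGene.info URL is used only as a plausible target and is never contacted because `urlopen` is patched.

```python
import io
import json
import urllib.request
from unittest.mock import patch

def fetch_gene_symbol(gene_id: str) -> str:
    """Hypothetical helper: fetch a gene record and return its symbol."""
    url = f"https://mygene.info/v3/gene/{gene_id}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)["symbol"]

def test_fetch_gene_symbol_parses_symbol():
    # BytesIO supports both read() and the context-manager protocol,
    # so it can stand in for the HTTP response object.
    fake_response = io.BytesIO(b'{"symbol": "CDK2"}')
    with patch("urllib.request.urlopen", return_value=fake_response) as mocked:
        assert fetch_gene_symbol("1017") == "CDK2"
    # The helper should have requested exactly one URL.
    mocked.assert_called_once_with("https://mygene.info/v3/gene/1017")
```

pytest would collect `test_fetch_gene_symbol_parses_symbol` automatically from a file under tests/; no network access is needed.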
Contributing
Fork & create a feature branch.
Follow Conventional Commits for PR titles.
Run pre-commit install to auto-format with black & ruff.
Submit a PR; GitHub Actions will run lint + tests.
License
Apache 2.0: free for research and commercial use.
API terms of each external provider still apply.
Happy discovering!