Update docs.md
docs.md
CHANGED
@@ -24,12 +24,13 @@
|
|
24 |
|
25 |
<h2>Background</h2>
|
26 |
<p>Recent advances in <strong>Large Language Models (LLMs)</strong> have demonstrated transformative potential in <strong>healthcare</strong>, yet concerns remain around their reliability and clinical validity across diverse clinical tasks, specialties, and languages. To support timely and trustworthy evaluation, and building on our <a href="https://ai.nejm.org/doi/full/10.1056/AIra2400012">systematic review</a> of global clinical text resources, we introduce <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a>, <strong>a multilingual benchmark that comprises 87 real-world clinical text tasks spanning nine languages and more than one million samples</strong>. Furthermore, we construct this leaderboard of LLMs in clinical text understanding by systematically evaluating <strong>52 state-of-the-art LLMs</strong> (as of 2025/04/28).</p>
|
|
|
27 |
|
28 |
|
29 |
-
<div style="display: flex; align-items: center; justify-content: center; width: 100%; height:
|
30 |
<img
|
31 |
-
src="https://cdn-uploads.huggingface.co/production/uploads/
|
32 |
-
alt="
|
33 |
style="max-width: 80%; max-height: 100%; object-fit: contain;"
|
34 |
/>
|
35 |
</div>
|
@@ -44,7 +45,7 @@
|
|
44 |
</ul>
|
45 |
<p>In addition, BRIDGE offers multiple <strong>model filters</strong> and <strong>task filters</strong> to enable users to explore LLM performance across <strong>different clinical contexts</strong>, empowering researchers and clinicians to make informed decisions and track model advancements over time.</p>
|
46 |
|
47 |
-
<div style="display: flex; align-items: center; justify-content: center; width: 100%; height:
|
48 |
<img
|
49 |
src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/xpyabfXWqacZD-ThQ5guU.jpeg"
|
50 |
alt="HMS"
|
@@ -87,6 +88,11 @@ Importantly, all 87 datasets have been verified to be either fully open-access o
|
|
87 |
</ul>
|
88 |
We will review and evaluate your submission and update the leaderboard accordingly.
|
89 |
|
90 |
|
91 |
<h2>Contributing</h2>
|
92 |
<p>We welcome and greatly value contributions and collaborations from the community!
|
@@ -94,23 +100,27 @@ If you have clinical text datasets that you would like to share for broader expl
|
|
94 |
<p>We are committed to expanding BRIDGE while strictly adhering to appropriate data use agreements and ethical guidelines. Let's work together to advance the responsible application of LLMs in medicine!</p>
|
95 |
|
96 |
|
97 |
-
<h2>Updates</h2>
|
98 |
-
<ul>
|
99 |
-
<li>2025/04/28: BRIDGE Leaderboard V1.0.0 is now live!</li>
|
100 |
-
<li>2025/04/28: Our paper <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a> is now available on arXiv!</li>
|
101 |
-
</ul>
|
102 |
|
103 |
-
<h2>Contact Information</h2>
|
104 |
-
<p>If you have any questions about BRIDGE or the leaderboard, feel free to reach out!</p>
|
105 |
-
<ul>
|
106 |
-
<li><strong>Leaderboard Managers</strong>: Jiageng Wu ([email protected]), Kevin Xie ([email protected])</li>
|
107 |
-
<li><strong>Benchmark Managers</strong>: Jiageng Wu ([email protected]), Bowen Gu ([email protected])</li>
|
108 |
-
<li><strong>Program Lead</strong>: Jie Yang ([email protected])</li>
|
109 |
-
</ul>
|
110 |
|
111 |
-
|
112 |
-
|
113 |
-
|
114 |
title={BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text},
|
115 |
author={Wu, Jiageng and Gu, Bowen and Zhou, Ren and Xie, Kevin and Snyder, Doug and Jiang, Yixing and Carducci, Valentina and Wyss, Richard and Desai, Rishi J and Alsentzer, Emily and Celi, Leo Anthony and Rodman, Adam and Schneeweiss, Sebastian and Chen, Jonathan H. and Romero-Brufau, Santiago and Lin, Kueiyu Joshua and Yang, Jie},
|
116 |
year={2025},
|
@@ -129,4 +139,6 @@ If you have clinical text datasets that you would like to share for broader expl
|
|
129 |
year={2024},
|
130 |
publisher={Massachusetts Medical Society}
|
131 |
}
|
132 |
-
</code></pre>
|
|
|
|
|
|
24 |
|
25 |
<h2>Background</h2>
|
26 |
<p>Recent advances in <strong>Large Language Models (LLMs)</strong> have demonstrated transformative potential in <strong>healthcare</strong>, yet concerns remain around their reliability and clinical validity across diverse clinical tasks, specialties, and languages. To support timely and trustworthy evaluation, and building on our <a href="https://ai.nejm.org/doi/full/10.1056/AIra2400012">systematic review</a> of global clinical text resources, we introduce <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a>, <strong>a multilingual benchmark that comprises 87 real-world clinical text tasks spanning nine languages and more than one million samples</strong>. Furthermore, we construct this leaderboard of LLMs in clinical text understanding by systematically evaluating <strong>52 state-of-the-art LLMs</strong> (as of 2025/04/28).</p>
|
27 |
+
This project is led and maintained by the team of <a href="https://ylab.top/">Prof. Jie Yang</a> and <a href="https://www.drugepi.org/team/joshua-kueiyu-lin">Prof. Kueiyu Joshua Lin</a> at Harvard Medical School and Brigham and Women's Hospital.
|
28 |
|
29 |
|
30 |
+
<div style="display: flex; align-items: center; justify-content: center; width: 100%; height: auto;">
|
31 |
<img
|
32 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/633c70c4ccce04161f841c30/OLN3J8_Yq8dx_LrgjYSsC.png"
|
33 |
+
alt="dataset"
|
34 |
style="max-width: 80%; max-height: 100%; object-fit: contain;"
|
35 |
/>
|
36 |
</div>
|
|
|
45 |
</ul>
|
46 |
<p>In addition, BRIDGE offers multiple <strong>model filters</strong> and <strong>task filters</strong> to enable users to explore LLM performance across <strong>different clinical contexts</strong>, empowering researchers and clinicians to make informed decisions and track model advancements over time.</p>
|
47 |
|
48 |
+
<div style="display: flex; align-items: center; justify-content: center; width: 100%; height: auto;">
|
49 |
<img
|
50 |
src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/xpyabfXWqacZD-ThQ5guU.jpeg"
|
51 |
alt="HMS"
|
|
|
88 |
</ul>
|
89 |
We will review and evaluate your submission and update the leaderboard accordingly.
|
90 |
|
91 |
+
<h2>Updates</h2>
|
92 |
+
<ul>
|
93 |
+
<li>2025/04/28: BRIDGE Leaderboard V1.0.0 is now live!</li>
|
94 |
+
<li>2025/04/28: Our paper <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a> is now available on arXiv!</li>
|
95 |
+
</ul>
|
96 |
|
97 |
<h2>Contributing</h2>
|
98 |
<p>We welcome and greatly value contributions and collaborations from the community!
|
|
|
100 |
<p>We are committed to expanding BRIDGE while strictly adhering to appropriate data use agreements and ethical guidelines. Let's work together to advance the responsible application of LLMs in medicine!</p>
|
101 |
|
102 |
|
103 |
|
104 |
|
105 |
+
## Donation
|
106 |
+
|
107 |
+
BRIDGE is a non-profit, researcher-led benchmark that requires substantial resources (e.g., high-performance GPUs, a dedicated team) to sustain. To support open and impactful academic research that advances clinical care, we welcome your contributions. Please contact Prof. Jie Yang at <[email protected]> to discuss donation opportunities.</p>
|
108 |
+
|
109 |
+
## Contact Information
|
110 |
+
|
111 |
+
If you have any questions about BRIDGE or the leaderboard, feel free to reach out!
|
112 |
+
- **Leaderboard Managers**: Jiageng Wu (<[email protected]>), Kevin Xie (<[email protected]>), Bowen Gu (<[email protected]>)
|
113 |
+
- **Benchmark Managers**: Jiageng Wu (<[email protected]>), Bowen Gu (<[email protected]>)
|
114 |
+
- **Project Lead**: Jie Yang (<[email protected]>)
|
115 |
+
|
116 |
+
|
117 |
+
|
118 |
+
|
119 |
+
## Citation
|
120 |
+
|
121 |
+
If you find this leaderboard useful for your research and applications, please cite the following papers:
|
122 |
+
<pre style="white-space: pre-wrap; overflow-wrap: anywhere;">
|
123 |
+
<code>@article{BRIDGE-benchmark,
|
124 |
title={BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text},
|
125 |
author={Wu, Jiageng and Gu, Bowen and Zhou, Ren and Xie, Kevin and Snyder, Doug and Jiang, Yixing and Carducci, Valentina and Wyss, Richard and Desai, Rishi J and Alsentzer, Emily and Celi, Leo Anthony and Rodman, Adam and Schneeweiss, Sebastian and Chen, Jonathan H. and Romero-Brufau, Santiago and Lin, Kueiyu Joshua and Yang, Jie},
|
126 |
year={2025},
|
|
|
139 |
year={2024},
|
140 |
publisher={Massachusetts Medical Society}
|
141 |
}
|
142 |
+
</code></pre>
|
143 |
+
|
144 |
+
If you use the datasets in BRIDGE, please also cite the original papers for those datasets, which are listed in our BRIDGE paper.
|