jiagengwu committed on
Commit ea9676f · verified · 1 Parent(s): dbd134d

Update docs.md

Files changed (1)
  1. docs.md +32 -20
docs.md CHANGED
@@ -24,12 +24,13 @@
24
 
25
  <h2>📜 Background</h2>
26
  <p>Recent advances in <strong>Large Language Models (LLMs)</strong> have demonstrated transformative potential in <strong>healthcare</strong>, yet concerns remain around their reliability and clinical validity across diverse clinical tasks, specialties, and languages. To support timely and trustworthy evaluation, building upon our <a href="https://ai.nejm.org/doi/full/10.1056/AIra2400012">systematic review</a> of global clinical text resources, we introduce <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a>, <strong>a multilingual benchmark that comprises 87 real-world clinical text tasks spanning nine languages and more than one million samples</strong>. Furthermore, we built this leaderboard of LLMs for clinical text understanding by systematically evaluating <strong>52 state-of-the-art LLMs</strong> (as of 2025/04/28).</p>
 
27
 
28
 
29
- <div style="display: flex; align-items: center; justify-content: center; width: 100%; height: 500px;">
30
  <img
31
- src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/j5tJ9xh3t6U1JlqGbKbrj.png"
32
- alt="HMS"
33
  style="max-width: 80%; max-height: 100%; object-fit: contain;"
34
  />
35
  </div>
@@ -44,7 +45,7 @@
44
  </ul>
45
  <p>In addition, BRIDGE offers multiple <strong>model filters</strong> and <strong>task filters</strong> to enable users to explore LLM performance across <strong>different clinical contexts</strong>, empowering researchers and clinicians to make informed decisions and track model advancements over time.</p>
46
 
47
- <div style="display: flex; align-items: center; justify-content: center; width: 100%; height: 450px;">
48
  <img
49
  src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/xpyabfXWqacZD-ThQ5guU.jpeg"
50
  alt="HMS"
@@ -87,6 +88,11 @@ Importantly, all 87 datasets have been verified to be either fully open-access o
87
  </ul>
88
  We will review and evaluate your submission and update the leaderboard accordingly.
89
 
90
 
91
  <h2>🤝 Contributing</h2>
92
  <p>We welcome and greatly value contributions and collaborations from the community!
@@ -94,23 +100,27 @@ If you have clinical text datasets that you would like to share for broader expl
94
  <p>We are committed to expanding BRIDGE while strictly adhering to appropriate data use agreements and ethical guidelines. Let's work together to advance the responsible application of LLMs in medicine!</p>
95
 
96
 
97
- <h2>📢 Updates</h2>
98
- <ul>
99
- <li>🗓️ 2025/04/28: BRIDGE Leaderboard V1.0.0 is now live!</li>
100
- <li>🗓️ 2025/04/28: Our paper <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a> is now available on arXiv!</li>
101
- </ul>
102
 
103
- <h2>📬 Contact Information</h2>
104
- <p>If you have any questions about BRIDGE or the leaderboard, feel free to reach out!</p>
105
- <ul>
106
- <li><strong>Leaderboard Managers</strong>: Jiageng Wu ([email protected]), Kevin Xie ([email protected])</li>
107
- <li><strong>Benchmark Managers</strong>: Jiageng Wu ([email protected]), Bowen Gu ([email protected])</li>
108
- <li><strong>Program Lead</strong>: Jie Yang ([email protected])</li>
109
- </ul>
110
 
111
- <h2>📚 Citation</h2>
112
- <p>If you find this leaderboard useful for your research and applications, please cite the following papers:</p>
113
- <pre><code>@article{BRIDGE-benchmark,
114
  title={BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text},
115
  author={Wu, Jiageng and Gu, Bowen and Zhou, Ren and Xie, Kevin and Snyder, Doug and Jiang, Yixing and Carducci, Valentina and Wyss, Richard and Desai, Rishi J and Alsentzer, Emily and Celi, Leo Anthony and Rodman, Adam and Schneeweiss, Sebastian and Chen, Jonathan H. and Romero-Brufau, Santiago and Lin, Kueiyu Joshua and Yang, Jie},
116
  year={2025},
@@ -129,4 +139,6 @@ If you have clinical text datasets that you would like to share for broader expl
129
  year={2024},
130
  publisher={Massachusetts Medical Society}
131
  }
132
- </code></pre>
 
 
 
24
 
25
  <h2>📜 Background</h2>
26
  <p>Recent advances in <strong>Large Language Models (LLMs)</strong> have demonstrated transformative potential in <strong>healthcare</strong>, yet concerns remain around their reliability and clinical validity across diverse clinical tasks, specialties, and languages. To support timely and trustworthy evaluation, building upon our <a href="https://ai.nejm.org/doi/full/10.1056/AIra2400012">systematic review</a> of global clinical text resources, we introduce <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a>, <strong>a multilingual benchmark that comprises 87 real-world clinical text tasks spanning nine languages and more than one million samples</strong>. Furthermore, we built this leaderboard of LLMs for clinical text understanding by systematically evaluating <strong>52 state-of-the-art LLMs</strong> (as of 2025/04/28).</p>
27
+ This project is led and maintained by the team of <a href="https://ylab.top/">Prof. Jie Yang</a> and <a href="https://www.drugepi.org/team/joshua-kueiyu-lin">Prof. Kueiyu Joshua Lin</a> at Harvard Medical School and Brigham and Women's Hospital.
28
 
29
 
30
+ <div style="display: flex; align-items: center; justify-content: center; width: 100%; height: auto;">
31
  <img
32
+ src="https://cdn-uploads.huggingface.co/production/uploads/633c70c4ccce04161f841c30/OLN3J8_Yq8dx_LrgjYSsC.png"
33
+ alt="dataset"
34
  style="max-width: 80%; max-height: 100%; object-fit: contain;"
35
  />
36
  </div>
 
45
  </ul>
46
  <p>In addition, BRIDGE offers multiple <strong>model filters</strong> and <strong>task filters</strong> to enable users to explore LLM performance across <strong>different clinical contexts</strong>, empowering researchers and clinicians to make informed decisions and track model advancements over time.</p>
47
 
48
+ <div style="display: flex; align-items: center; justify-content: center; width: 100%; height: auto;">
49
  <img
50
  src="https://cdn-uploads.huggingface.co/production/uploads/67a040fb6934f9aa1c866f99/xpyabfXWqacZD-ThQ5guU.jpeg"
51
  alt="HMS"
 
88
  </ul>
89
  We will review and evaluate your submission and update the leaderboard accordingly.
90
 
91
+ <h2>📢 Updates</h2>
92
+ <ul>
93
+ <li>🗓️ 2025/04/28: BRIDGE Leaderboard V1.0.0 is now live!</li>
94
+ <li>🗓️ 2025/04/28: Our paper <a href="https://arxiv.org/abs/2504.19467">BRIDGE</a> is now available on arXiv!</li>
95
+ </ul>
96
 
97
  <h2>🤝 Contributing</h2>
98
  <p>We welcome and greatly value contributions and collaborations from the community!
 
100
  <p>We are committed to expanding BRIDGE while strictly adhering to appropriate data use agreements and ethical guidelines. Let's work together to advance the responsible application of LLMs in medicine!</p>
101
 
102
 
103
 
104
 
105
+ ## 🚀 Donation
106
+
107
+ BRIDGE is a non-profit, researcher-led benchmark that requires substantial resources (e.g., high-performance GPUs, a dedicated team) to sustain. To support open and impactful academic research that advances clinical care, we welcome your contributions. Please contact Prof. Jie Yang at <[email protected]> to discuss donation opportunities.
108
+
109
+ ## 📬 Contact Information
110
+
111
+ If you have any questions about BRIDGE or the leaderboard, feel free to reach out!
112
+ - **Leaderboard Managers**: Jiageng Wu (<[email protected]>), Kevin Xie (<[email protected]>), Bowen Gu (<[email protected]>)
113
+ - **Benchmark Managers**: Jiageng Wu (<[email protected]>), Bowen Gu (<[email protected]>)
114
+ - **Project Lead**: Jie Yang (<[email protected]>)
115
+
116
+
117
+
118
+
119
+ ## 📚 Citation
120
+
121
+ If you find this leaderboard useful for your research and applications, please cite the following papers:
122
+ <pre style="white-space: pre-wrap; overflow-wrap: anywhere;">
123
+ <code>@article{BRIDGE-benchmark,
124
  title={BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text},
125
  author={Wu, Jiageng and Gu, Bowen and Zhou, Ren and Xie, Kevin and Snyder, Doug and Jiang, Yixing and Carducci, Valentina and Wyss, Richard and Desai, Rishi J and Alsentzer, Emily and Celi, Leo Anthony and Rodman, Adam and Schneeweiss, Sebastian and Chen, Jonathan H. and Romero-Brufau, Santiago and Lin, Kueiyu Joshua and Yang, Jie},
126
  year={2025},
 
139
  year={2024},
140
  publisher={Massachusetts Medical Society}
141
  }
142
+ </code></pre>
143
+
144
+ If you use the datasets in BRIDGE, please also cite the original papers for those datasets, which are listed in our BRIDGE paper.