---
license: apache-2.0
---

# Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning

> [!Note]
> Please refer to our [repository](https://github.com/ncbi-nlp/cell-o1) and [paper](https://www.arxiv.org/abs/2506.02911) for more details.

## 🧠 Overview

Cell type annotation is a key task in analyzing the heterogeneity of single-cell RNA sequencing data. Although recent foundation models automate this process, they typically annotate cells independently, without considering batch-level cellular context or providing explanatory reasoning. In contrast, human experts often annotate distinct cell types for different cell clusters based on their domain knowledge.

To mimic this expert behavior, we introduce ***CellPuzzles***, a benchmark that requires unique cell-type assignments across the cells of a batch. Existing LLMs struggle with this task: the best baseline, OpenAI's o1, reaches only 19.0% batch accuracy. To address this, we present ***Cell-o1***, a reasoning-enhanced language model trained with supervised fine-tuning (SFT) on distilled expert traces, followed by reinforcement learning (RL) with batch-level rewards. ***Cell-o1*** outperforms all baselines on both cell-level and batch-level metrics, and exhibits emergent behaviors such as self-reflection and curriculum reasoning, offering insights into its interpretability and generalization.
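
To make the two evaluation granularities concrete, here is a minimal sketch of how cell-level and batch-level accuracy could be computed. The helper names and toy batches are illustrative assumptions, not from the paper, which defines the official metrics.

```python
# Assumed definitions: cell-level accuracy scores each cell independently;
# batch-level accuracy credits a batch only when every cell in it receives
# its correct (unique) type.

def cell_accuracy(pred_batches, gold_batches):
    pairs = [(p, g)
             for pb, gb in zip(pred_batches, gold_batches)
             for p, g in zip(pb, gb)]
    return sum(p == g for p, g in pairs) / len(pairs)

def batch_accuracy(pred_batches, gold_batches):
    return sum(pb == gb for pb, gb in zip(pred_batches, gold_batches)) / len(pred_batches)

gold = [["T cell", "B cell"], ["NK cell", "Monocyte"]]
pred = [["T cell", "B cell"], ["Monocyte", "NK cell"]]  # second batch swapped

print(cell_accuracy(pred, gold))   # 0.5 -> 2 of 4 cells correct
print(batch_accuracy(pred, gold))  # 0.5 -> 1 of 2 batches fully correct
```

Under these definitions, a model can score well per cell while still failing most batches, which is why batch accuracy is the harder target.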

## 🚀 How to Run Inference

The following example shows how to use `ncbi/Cell-o1` with structured input for reasoning-based cell type annotation.
The model expects both a system message and a user prompt containing multiple cells and candidate cell types.
The example ends by extracting and printing the model’s reply:

```python
# 5. Print the model’s reply (<think> + <answer>)
assistant_reply = response[-1]["content"] if isinstance(response, list) else response
print(assistant_reply)
```
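
Since only the final step is shown above, the sketch below reconstructs the whole flow as a self-contained starting point. It assumes the standard `transformers` text-generation pipeline with chat-style messages; the prompt wording, placeholder genes, candidate types, and generation parameters are illustrative assumptions rather than the official script, which lives in the [repository](https://github.com/ncbi-nlp/cell-o1).

```python
from transformers import pipeline

# 1. Load the model as a chat-capable text-generation pipeline (assumed setup)
generator = pipeline(
    "text-generation",
    model="ncbi/Cell-o1",
    torch_dtype="auto",
    device_map="auto",
)

# 2. System message describing the batch-annotation task (illustrative wording)
system_msg = (
    "You are an expert in single-cell biology. Given a batch of cells and a list "
    "of candidate cell types, assign each cell exactly one distinct type from the "
    "candidates. Reason inside <think> tags and give the final mapping inside "
    "<answer> tags."
)

# 3. User message with multiple cells and the candidate types (placeholder data)
user_msg = (
    "Cell A top genes: CD3D, CD3E, IL7R\n"
    "Cell B top genes: CD19, MS4A1, CD79A\n"
    "Candidate cell types: T cell, B cell"
)
messages = [
    {"role": "system", "content": system_msg},
    {"role": "user", "content": user_msg},
]

# 4. Generate; with chat input, generated_text holds the whole conversation
response = generator(messages, max_new_tokens=1024)[0]["generated_text"]

# 5. Print the model’s reply (<think> + <answer>)
assistant_reply = response[-1]["content"] if isinstance(response, list) else response
print(assistant_reply)
```

With chat-style input, the pipeline returns the full conversation under `generated_text`, which is why step 5 checks `isinstance(response, list)` before taking the last message's `content`.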

## 🔖 Citation

If you use our repository, please cite the following related paper:

```bibtex
@article{fang2025cello1,
  title={Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning},
  author={Fang, Yin and Jin, Qiao and Xiong, Guangzhi and Jin, Bowen and Zhong, Xianrui and Ouyang, Siru and Zhang, Aidong and Han, Jiawei and Lu, Zhiyong},
  journal={arXiv preprint arXiv:2506.02911},
  year={2025}
}
```

## 🫱🏻‍🫲 Acknowledgements

This research was supported by the Division of Intramural Research (DIR) of the National Library of Medicine (NLM), National Institutes of Health.