ncbi/Cell-o1

Safetensors · qwen2 · biology · bioinformatics · single-cell
Fangyinfff committed · commit 8f8789f · verified · parent 830ebb4

Update README.md

Files changed (1): README.md (+30 -2)
README.md CHANGED
@@ -2,8 +2,18 @@
 license: apache-2.0
 ---
 
-## 🔬 How to Run Inference
+# Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning
+
+> [!Note]
+> Please refer to our [repository](https://github.com/ncbi-nlp/cell-o1) and [paper](https://www.arxiv.org/abs/2506.02911) for more details.
+
+## 🧠 Overview
+Cell type annotation is a key task in analyzing the heterogeneity of single-cell RNA sequencing data. Although recent foundation models automate this process, they typically annotate cells independently, without considering batch-level cellular context or providing explanatory reasoning. In contrast, human experts often annotate distinct cell types for different cell clusters based on their domain knowledge.
+To mimic this expert behavior, we introduce ***CellPuzzles***, a benchmark that requires unique cell-type assignments across batches of cells. Existing LLMs struggle with this task: the best baseline, OpenAI's o1, reaches only 19.0% batch accuracy. To address this, we present ***Cell-o1***, a reasoning-enhanced language model trained via supervised fine-tuning (SFT) on distilled expert traces, followed by reinforcement learning (RL) with batch-level rewards. ***Cell-o1*** outperforms all baselines on both cell-level and batch-level metrics, and exhibits emergent behaviors such as self-reflection and curriculum reasoning, offering insights into its interpretability and generalization.
+
+## 🚀 How to Run Inference
 
 The following example shows how to use `ncbi/Cell-o1` with structured input for reasoning-based cell type annotation.
 The model expects both a system message and a user prompt containing multiple cells and candidate cell types.
@@ -56,4 +66,22 @@ response = generator(
 # 5. Print the model's reply (<think> + <answer>)
 assistant_reply = response[-1]["content"] if isinstance(response, list) else response
 print(assistant_reply)
-```
+```
+
+## 🔖 Citation
+
+If you use our repository, please cite the following paper:
+
+```
+@article{fang2025cello1,
+  title={Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning},
+  author={Fang, Yin and Jin, Qiao and Xiong, Guangzhi and Jin, Bowen and Zhong, Xianrui and Ouyang, Siru and Zhang, Aidong and Han, Jiawei and Lu, Zhiyong},
+  journal={arXiv preprint arXiv:2506.02911},
+  year={2025}
+}
+```
+
+## 🫱🏻‍🫲 Acknowledgements
+
+This research was supported by the Division of Intramural Research (DIR) of the National Library of Medicine (NLM), National Institutes of Health.
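The overview's batch-level setup is the crux: reward attaches to the whole batch, and each candidate type must be used exactly once. As a minimal sketch of that idea (an illustration only; the actual reward used to train Cell-o1 is defined in the linked repository), an all-or-nothing batch reward and the corresponding batch-accuracy metric might look like:

```python
# Illustrative batch-level reward in the spirit of the overview; NOT the
# exact reward used to train Cell-o1 (see the repository for that).

def batch_reward(predicted: list[str], gold: list[str]) -> float:
    """All-or-nothing credit for one batch of cells.

    A valid CellPuzzles answer must assign each candidate type to exactly
    one cell, so duplicated predictions score zero.
    """
    if len(predicted) != len(gold):
        return 0.0
    if len(set(predicted)) != len(predicted):  # a type was reused within the batch
        return 0.0
    return 1.0 if predicted == gold else 0.0


def batch_accuracy(batches: list[tuple[list[str], list[str]]]) -> float:
    """Fraction of batches solved exactly (the 'batch accuracy' metric)."""
    if not batches:
        return 0.0
    return sum(batch_reward(pred, gold) for pred, gold in batches) / len(batches)


# Two toy batches: the first is fully correct, the second swaps two cells.
print(batch_accuracy([
    (["T cell", "B cell"], ["T cell", "B cell"]),
    (["B cell", "T cell"], ["T cell", "B cell"]),
]))  # 0.5
```

Under this all-or-nothing scheme a single mislabeled cell zeroes the batch, which is why batch accuracy is a much harder target than per-cell accuracy.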
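The diff shows only the tail of the updated inference example, so a self-contained sketch of the described flow may help: a system message plus a user prompt listing several cells and the candidate types, run through the `transformers` text-generation pipeline. The prompt wording, marker genes, and generation settings below are illustrative assumptions, not the README's verbatim example:

```python
# Sketch of the documented flow: structured batch prompt in,
# <think> + <answer> reply out. Prompt text, genes, and generation
# settings are assumptions for illustration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ncbi/Cell-o1",
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are an expert in single-cell biology. Assign each cell exactly "
            "one cell type from the candidate list, using every candidate once. "
            "Reason inside <think> tags and give the final mapping inside "
            "<answer> tags."
        ),
    },
    {
        "role": "user",
        "content": (
            "Candidate cell types: T cell, B cell, NK cell\n"
            "Cell 1 top genes: CD3D, CD3E, TRAC\n"
            "Cell 2 top genes: MS4A1, CD79A, CD79B\n"
            "Cell 3 top genes: NKG7, GNLY, KLRD1"
        ),
    },
]

output = generator(messages, max_new_tokens=2048)
response = output[0]["generated_text"]  # chat history ending with the assistant turn
assistant_reply = response[-1]["content"] if isinstance(response, list) else response
print(assistant_reply)
```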
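Step 5's comment indicates the reply carries a `<think>` reasoning trace followed by an `<answer>` block. Assuming that tag layout, a small helper (hypothetical, not part of the model card) can separate the two:

```python
import re

def split_reply(reply: str) -> tuple[str, str]:
    """Split a Cell-o1 reply into (reasoning, answer).

    Assumes the <think>...</think> / <answer>...</answer> layout that the
    inference example's final step refers to.
    """
    think = re.search(r"<think>(.*?)</think>", reply, flags=re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", reply, flags=re.DOTALL)
    reasoning = think.group(1).strip() if think else ""
    final = answer.group(1).strip() if answer else reply.strip()
    return reasoning, final


demo = (
    "<think>CD3D and CD3E mark T cells; MS4A1 marks B cells; NKG7 marks NK cells.</think>"
    "<answer>Cell 1: T cell, Cell 2: B cell, Cell 3: NK cell</answer>"
)
reasoning, final_answer = split_reply(demo)
print(final_answer)  # Cell 1: T cell, Cell 2: B cell, Cell 3: NK cell
```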