Improve model card: Add pipeline tag, library name, and project page
#2 by nielsr (HF Staff)

README.md CHANGED
@@ -1,21 +1,139 @@
---
base_model:
- Qwen/Qwen2.5-32B-Instruct
datasets:
- liuwenhan/reasonrank_data_sft
- liuwenhan/reasonrank_data_rl
- liuwenhan/reasonrank_data_13k
language:
- en
license: mit
pipeline_tag: text-ranking
library_name: transformers
tags:
- reranking
- reasoning
- qwen
---

# ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

## Introduction
This is the model trained in our paper: **ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability** ([📝arXiv](https://arxiv.org/abs/2508.07050)).

Large Language Model (LLM)-based listwise ranking has shown superior performance on many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning at test time helps improve listwise ranking performance. ReasonRank addresses the scarcity of reasoning-intensive training data by proposing an automated data synthesis framework for such data. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage post-training approach: a cold-start supervised fine-tuning (SFT) stage for reasoning-pattern learning, followed by a reinforcement learning (RL) stage that further enhances ranking ability.

Please refer to our [🧩GitHub repository](https://github.com/8421BCD/ReasonRank) for detailed usage instructions and code.

Project page: [https://brightbenchmark.github.io/](https://brightbenchmark.github.io/)
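
The SFT and RL training sets listed in the metadata above are hosted on the Hugging Face Hub. As a minimal sketch (the split and column names below are assumptions; check the dataset cards for the exact schema), they can be inspected with the `datasets` library:

```python
from datasets import load_dataset

# Illustrative only: load the released ReasonRank training data and inspect it.
# Split and column names are not guaranteed; verify them on the dataset cards.
sft_data = load_dataset("liuwenhan/reasonrank_data_sft")
rl_data = load_dataset("liuwenhan/reasonrank_data_rl")

print(sft_data)   # available splits and their sizes
print(rl_data)
first_split = next(iter(sft_data.values()))
print(first_split.column_names)  # inspect the schema before building a training pipeline
```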
## Model Performance
<p align="center">
<img width="90%" alt="image" src="https://8421bcd.oss-cn-beijing.aliyuncs.com/img/image-20250810163757771.png" />
</p>

## Sample Usage

You can use this model with the `transformers` library; a basic inference example follows. Note that the exact prompt construction for ReasonRank is critical for performance and should ideally follow the `create_prompt` function in the original [GitHub repository's `rerank/rank_listwise_os_llm.py` file](https://github.com/8421BCD/ReasonRank/blob/main/rerank/rank_listwise_os_llm.py). The example below provides a simplified structure for demonstration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

# Load the model and tokenizer
model_id = "liuwenhan/reasonrank-32B"  # Assuming this is the model being documented
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # or torch.float16 depending on your GPU and needs
    device_map="auto",
    trust_remote_code=True,  # not strictly required for Qwen2.5 on recent transformers, but harmless
).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Example query and passages
query = "What is the capital of France?"
passages = [
    "Paris is the capital and most populous city of France.",
    "London is the capital of England and the United Kingdom.",
    "The Eiffel Tower is a famous landmark in Paris.",
    "France is a country in Western Europe.",
]

# Construct the input messages for Qwen's chat template.
# For ReasonRank's specific prompt structure, refer to the original GitHub repository's
# `rerank/rank_listwise_os_llm.py` file and its `add_prefix_prompt`/`add_post_prompt` functions.
# This example uses a general Qwen-like structure for demonstration.
system_prompt = "You are a helpful and intelligent assistant."
user_prefix = f"For the query: '{query}', please rank the following passages from most relevant to least relevant.\n"
passage_list_str = "\n".join([f"[{i+1}] {p}" for i, p in enumerate(passages)])
user_suffix = "\nNow, please generate the reasoning process and the ranked list of passages."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"{user_prefix}{passage_list_str}{user_suffix}"},
]

# Apply the chat template to get the final prompt string
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize the input
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the response.
# Reuse the model's generation_config if available, otherwise create one.
generation_config = model.generation_config if model.generation_config else GenerationConfig()
generation_config.max_new_tokens = 512  # increase if the reasoning chain gets cut off
generation_config.do_sample = False  # greedy decoding; sampling parameters are ignored in this mode

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        generation_config=generation_config,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(f"Query: {query}\nResponse:\n{response}")

# Expected (simplified) output might look like:
# Response:
# Reasoning: The query asks for the capital of France. Passage [1] directly states "Paris is the capital and most populous city of France."
# This makes it the most relevant. Other passages are less direct or irrelevant.
# Ranked List:
# 1. [1] Paris is the capital and most populous city of France.
# 2. [3] The Eiffel Tower is a famous landmark in Paris.
# 3. [4] France is a country in Western Europe.
# 4. [2] London is the capital of England and the United Kingdom.
```
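
The response contains free-form reasoning followed by the ranked list, so it still needs to be parsed back into an ordering over the input passages. The helper below is an illustrative sketch, not part of the official ReasonRank code: it assumes the ranked list references passages as `[1]`, `[2]`, ... (as in the expected output above) and reuses the `response` and `passages` variables from the previous snippet.

```python
import re

def extract_ranking(response: str, num_passages: int) -> list:
    """Recover a permutation of 1-based passage indices from the model output.

    Hypothetical helper for this simplified prompt format; the official parsing
    logic lives in the ReasonRank GitHub repository.
    """
    # If the response has an explicit ranked-list section, parse only that part
    # so passage mentions inside the reasoning text are not picked up first.
    marker = "Ranked List:"
    if marker in response:
        response = response.split(marker, 1)[1]

    order = []
    for match in re.findall(r"\[(\d+)\]", response):
        idx = int(match)
        if 1 <= idx <= num_passages and idx not in order:
            order.append(idx)

    # Fall back to the original order for anything the model did not mention.
    order += [i for i in range(1, num_passages + 1) if i not in order]
    return order

ranking = extract_ranking(response, len(passages))
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}. [{idx}] {passages[idx - 1]}")
```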

## Citation

If you find this work helpful, please cite our paper:

```bibtex
@misc{liu2025reasonrankempoweringpassageranking,
      title={ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability},
      author={Wenhan Liu and Xinyu Ma and Weiwei Sun and Yutao Zhu and Yuchen Li and Dawei Yin and Zhicheng Dou},
      year={2025},
      eprint={2508.07050},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2508.07050},
}
```

## License

This project is released under the [MIT License](https://opensource.org/licenses/MIT).

## Acknowledgement

The inference code and training implementation build upon [RankLLM](https://github.com/castorini/rank_llm), [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) and [verl](https://github.com/volcengine/verl). Our work is based on the [Qwen2.5](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model series, and we sincerely thank the Qwen team for their outstanding contributions to the open-source community.