import gradio as gr


with gr.Blocks(css="""
#my-img img {
    width: 70% !important;
    display: block;
    margin-left: auto;
    margin-right: auto;
}
#small img {
    width: 40% !important;
    display: block;
    margin-left: auto;
    margin-right: auto;
}
""") as demo:
    gr.HTML("""
<div align="center" style="padding: 10px;">
  <a href="#part1" style="margin-right: 20px; font-size: 18px;">🔹 Part 1: Elastic Reasoning</a>
  <a href="#part2" style="font-size: 18px;">🔹 Part 2: Fractured CoT</a>
</div>
    """)
    gr.HTML("""
<div align="center">
<h1 id="top">Efficient Reasoning</h1>
<p>
  This demo is structured in two parts, each showcasing a recent advancement in scalable and efficient reasoning with large language models:
  <br><br>
  <b>Part 1:</b> <i>Elastic Reasoning</i> focuses on budget-aware generation by explicitly separating thinking and solution stages.
  <br>
  <b>Part 2:</b> <i>Fractured Chain-of-Thought</i> explores sampling efficiency by fragmenting the reasoning process along multiple dimensions.
</p>
<br>
</div>
    """)
    gr.HTML("""
<div align="center">
<h3 id="part1">Part 1: Elastic Reasoning 🌟</h3>
<br>
</div>
    """)

    gr.HTML("""
<div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
  <a href="https://arxiv.org/pdf/2505.05315">
    <img src="https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" />
  </a>
  <a href="https://huggingface.co/collections/Salesforce/elastic-reasoning-682b4bba108d6ea0a8bab275">
    <img src="https://img.shields.io/badge/E1-fcd022?style=for-the-badge&logo=huggingface&logoColor=000" />
  </a>
  <a href="https://github.com/SalesforceAIResearch/Elastic-Reasoning">
    <img src="https://img.shields.io/badge/Elastic_Reasoning-000000?style=for-the-badge&logo=github&logoColor=white" />
  </a>
</div>
    """)
    gr.Markdown(
    """

## Introduction
We propose **Elastic Reasoning**, a novel framework for scalable chain of thoughts
that explicitly separates reasoning into two phases, `thinking` and `solution`, with
independently allocated budgets. At test time, Elastic Reasoning prioritizes the
completeness of solution segments, significantly improving reliability under tight
resource constraints. To train models that are robust to truncated thinking, we
introduce a lightweight `budget-constrained rollout` strategy, integrated into GRPO,
which teaches the model to reason adaptively when the thinking process is cut
short and generalizes effectively to unseen budget constraints without additional
training.
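
The two-budget inference described above can be sketched in plain Python. This is a minimal illustration, not the released E1 implementation: `toy_model`, the `</think>` end-of-thinking marker, the `<eos>` token, and the budget values are all hypothetical stand-ins. The point it shows is that the thinking phase stops at its own budget, the marker is appended either way, and the solution phase still receives its full, separate budget.

```python
from typing import Callable, List

THINK_END = "</think>"  # assumed end-of-thinking marker, for illustration only


def elastic_generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    think_budget: int,
    solution_budget: int,
    eos: str = "<eos>",
) -> List[str]:
    # Phase 1: think for at most think_budget tokens.
    tokens = list(prompt)
    for _ in range(think_budget):
        tok = next_token(tokens)
        if tok == THINK_END:
            break
        tokens.append(tok)
    # Close the thinking segment whether it ended naturally or was truncated.
    tokens.append(THINK_END)
    # Phase 2: the solution has its own budget, untouched by phase 1.
    for _ in range(solution_budget):
        tok = next_token(tokens)
        if tok == eos:
            break
        tokens.append(tok)
    return tokens[len(prompt):]


def toy_model(tokens: List[str]) -> str:
    # Hypothetical stand-in for a language model: emits "t" while thinking,
    # then three "s" tokens followed by end-of-sequence.
    if THINK_END not in tokens:
        return "t"
    n_solution = len(tokens) - tokens.index(THINK_END) - 1
    return "s" if n_solution < 3 else "<eos>"


out = elastic_generate(toy_model, ["q"], think_budget=5, solution_budget=10)
print(out)
```

Shrinking `think_budget` in this toy setup shortens only the thinking segment; the solution tokens are still produced in full.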
    """)
    gr.Image("figs/framework.png", label="Framework", show_label=False, elem_id="my-img")

    gr.Markdown(
    """
**Main Takeaways**
1. ✂️ Thinking + Solution are explicitly separated with independent budgets — boosting reliability under tight compute constraints.
2. 🧠 Budget-Constrained Rollout: We train models to handle truncated reasoning using GRPO.
3. 📈 Flexible scalability: Robust performance across diverse inference budgets on reasoning benchmarks like AIME and LiveCodeBench.
4. ⚙️ Better performance with fewer tokens: Our trained model generates outputs that are 30% shorter while maintaining (or even improving) accuracy.
    """)
    with gr.Row():
        gr.Image("figs/aime.png", label="Framework", show_label=False, elem_id="small")
        gr.Image("figs/livecode.png", label="Framework", show_label=False, elem_id="small")
    gr.Image("figs/codetable.png", label="Framework", show_label=False, elem_id="my-img")
    gr.HTML("""
<div align="center">
<h3 id="part2">Part 2: Fractured Chain-of-Thought 🌟</h3>
<br>
</div>
    """)
    gr.HTML("""
<div style="display: flex; justify-content: center; gap: 8px; flex-wrap: wrap;">
  <a href="https://arxiv.org/pdf/2505.12992">
    <img src="https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" />
  </a>
  <a href="https://github.com/BaohaoLiao/frac-cot">
    <img src="https://img.shields.io/badge/frac--cot-000000?style=for-the-badge&logo=github&logoColor=white" />
  </a>
</div>
    """)
    gr.Image("figs/frac_cot.gif", label="Framework", show_label=False, elem_id="my-img")

    gr.Markdown(
    """

## Introduction
Building upon the same core insight as **Elastic Reasoning**—that correct answers can often be derived without waiting for a full chain-of-thought (CoT)—**Fractured Sampling** shifts focus to the **sampling strategy** of reasoning.

Instead of relying on complete, uninterrupted reasoning sequences, Fractured Sampling **breaks the CoT along the temporal dimension**, exploring whether it's possible to "get the right answer without thinking all the way through."

This method introduces sampling control along three key dimensions:

- **Solution Diversity (m)**: sampling multiple final outputs from a single reasoning trace.
- **Trajectory Diversity (n)**: sampling multiple independent reasoning traces with different seeds (vanilla CoT sampling).
- **Reasoning Depth Diversity (H)**: sampling at different intermediate stages of a single reasoning trace.

Among these, the novel **reasoning depth `H`** plays a critical role: by sampling outputs at different depths of partially completed reasoning chains, the model creates multiple sets of "fragmented thoughts + solutions," which are then jointly evaluated to select the most trustworthy outcome.
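
The three dimensions can be pictured as nested loops. The sketch below is a toy illustration under stated assumptions: `sample_trace` and `sample_answer` are hypothetical placeholders for model calls, the evenly spaced truncation points are an assumption, and majority voting is just one simple way to aggregate (the text above only says the samples are jointly evaluated).

```python
import random
from collections import Counter
from typing import List


def sample_trace(rng: random.Random, length: int = 8) -> List[int]:
    # Hypothetical stand-in for one full reasoning trace (a list of token ids).
    return [rng.randint(0, 9) for _ in range(length)]


def sample_answer(rng: random.Random, prefix: List[int]) -> int:
    # Hypothetical stand-in for decoding a solution from a truncated trace.
    return (sum(prefix) + rng.randint(0, 1)) % 3


def fractured_sampling(n: int, m: int, H: int, seed: int = 0):
    rng = random.Random(seed)
    answers = []
    for _ in range(n):                      # trajectory diversity (n)
        trace = sample_trace(rng)
        for h in range(1, H + 1):           # reasoning depth diversity (H)
            cut = len(trace) * h // H       # evenly spaced cut points (assumed)
            for _ in range(m):              # solution diversity (m)
                answers.append(sample_answer(rng, trace[:cut]))
    # Majority vote over all n * H * m answers (assumed aggregation rule).
    winner, _ = Counter(answers).most_common(1)[0]
    return winner, answers


winner, answers = fractured_sampling(n=2, m=3, H=4)
print(len(answers), winner)
```

Each call collects `n * H * m` candidate answers, so the three knobs multiply rather than add.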
    """)
    gr.Image("figs/frac-frame.png", label="Framework", show_label=False, elem_id="my-img")
    gr.Markdown(
    """
### 🔍 Scaling Analysis of *n*, *m*, and *H* in DeepSeek-R1 Models

A detailed test-time scaling analysis on the DeepSeek-R1 series reveals the individual impact of the three sampling dimensions: `n` (number of reasoning paths), `m` (number of answers per path), and `H` (depth-wise reasoning samples).

Across multiple reasoning benchmarks, the results show that increasing **`H` — sampling across reasoning depths — yields the highest cost-effectiveness**. That is, sampling more intermediate answers along the depth of a single reasoning path leads to **greater accuracy improvements with fewer additional tokens**, compared to simply increasing the number of paths (`n`) or answers (`m`).
    """)
    gr.Image("figs/single.png", label="Framework", show_label=False, elem_id="my-img")
    gr.Markdown(
    """
### 🔄 Joint Sampling of *n*, *m*, and *H* for Enhanced Accuracy

In practical scenarios, the sampling dimensions `n`, `m`, and `H` can be **jointly optimized** rather than tuned in isolation. By **dynamically allocating the sampling budget across these dimensions**, the model can significantly enhance its reasoning accuracy.

This joint sampling strategy leverages the complementary strengths of each dimension—diversity (`n`), redundancy (`m`), and depth-awareness (`H`)—to achieve robust performance under a fixed token budget.
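
As a rough illustration of splitting a fixed budget, the sketch below enumerates `(n, m, H)` settings whose estimated token cost fits. The cost model (one full thinking trace per trajectory, plus one solution per depth and per answer) and the token lengths are assumptions made for this example, not figures from the paper. Choosing among the feasible settings would still require measured accuracy.

```python
from itertools import product
from typing import List, Tuple


def token_cost(n: int, m: int, H: int, think_len: int, sol_len: int) -> int:
    # Assumed cost model: each trajectory pays for one thinking trace,
    # then for m solutions at each of H depths.
    return n * (think_len + H * m * sol_len)


def feasible_configs(
    budget: int, think_len: int, sol_len: int, max_dim: int = 4
) -> List[Tuple[int, int, int]]:
    # Enumerate every (n, m, H) in 1..max_dim and keep those within budget.
    dims = range(1, max_dim + 1)
    return [
        (n, m, H)
        for n, m, H in product(dims, repeat=3)
        if token_cost(n, m, H, think_len, sol_len) <= budget
    ]


configs = feasible_configs(budget=3000, think_len=1000, sol_len=100)
print(len(configs), configs[:3])
```

Under this assumed cost model, extra depth samples or answers cost only short solutions, while each extra trajectory also costs a full thinking trace.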
    """)
    gr.Image("figs/combine.png", label="Framework", show_label=False, elem_id="my-img")
    gr.Markdown(
    """
## Citation


```bibtex
@article{xu2025scalable,
  title={Scalable Chain of Thoughts via Elastic Reasoning},
  author={Xu, Yuhui and Dong, Hanze and Wang, Lei and Sahoo, Doyen and Li, Junnan and Xiong, Caiming},
  journal={arXiv preprint arXiv:2505.05315},
  year={2025}
}

@misc{liao2025fracturedchainofthoughtreasoning,
      title={Fractured Chain-of-Thought Reasoning}, 
      author={Baohao Liao and Hanze Dong and Yuhui Xu and Doyen Sahoo and Christof Monz and Junnan Li and Caiming Xiong},
      year={2025},
      eprint={2505.12992},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.12992}, 
}
```
    """)
    gr.HTML("""
<div align="center" style="margin-top: 30px;">
  <a href="#top" style="font-size: 16px;">⬆️ Back to Top</a>
</div>
    """)


if __name__ == "__main__":
    demo.launch()