first commit
README.md CHANGED

@@ -1,8 +1,8 @@
 ---
 title: Post Training Techniques Guide
 emoji: π
-colorFrom:
-colorTo:
+colorFrom: purple
+colorTo: yellow
 sdk: docker
 app_port: 8501
 tags:
@@ -12,9 +12,38 @@ short_description: A visual guide to post-training techniques for LLMs
 license: mit
 ---
 
-# 
-
-If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
-forums](https://discuss.streamlit.io).
+# 🧠 Beyond Pretraining: A Visual Guide to Post-Training Techniques for LLMs
+
+This deck summarizes the key trade-offs between different post-training strategies for large language models, including:
+
+- Supervised Fine-Tuning (SFT)
+- Preference Optimization (DPO, APO, GRPO)
+- Reinforcement Learning (PPO)
+
+It also introduces a reward spectrum from rule-based to subjective feedback, and compares how real-world models like **SmolLM3**, **Tulu 2/3**, and **DeepSeek-R1** implement these strategies.
+
+> This is a companion resource to my ReTool rollout implementation and blog post.
+>
+> [Medium blog post](https://medium.com/@jenwei0312/beyond-generate-a-deep-dive-into-stateful-multi-turn-llm-rollouts-for-tool-use-336b00c99ac0)
+> 💻 [ReTool Hugging Face Space](https://huggingface.co/spaces/bird-of-paradise/ReTool-Implementation)
+
+---
+
+### Download the Slides
+[PDF version](https://huggingface.co/spaces/bird-of-paradise/post-training-techniques-guide/blob/main/src/Post%20Training%20Techniques.pdf)
+
+---
+
+### Reuse & Attribution
+
+This deck is free to share in talks, posts, or documentation, **with attribution**.
+
+Please credit:
+**Jen Wei: [Hugging Face 🤗](https://huggingface.co/bird-of-paradise) | [X/Twitter](https://x.com/JenniferWe17599)**
+Optional citation: *"Beyond Pretraining: Post-Training Techniques for LLMs (2025)"*
+
+Licensed under the MIT License.
+
+---
+
+*Made with 🧠 by Jen Wei, July 2025*
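
Of the techniques the README lists, DPO is the easiest to capture in a few lines. The sketch below is illustrative only and is not taken from the deck or the linked ReTool implementation; the function and argument names are made up, and it assumes the summed per-response log-probabilities under the policy and a frozen reference model are already available.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for a single preference pair (illustrative sketch).

    Each argument is the summed token log-probability of the chosen or
    rejected response under the trainable policy or the frozen reference
    model.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # response over the rejected one, relative to the reference model.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), written in the equivalent log1p form.
    return math.log1p(math.exp(-margin))

# The loss falls as the policy ranks the chosen response above the rejected one.
well_ranked = dpo_loss(-12.0, -20.0, -14.0, -19.0)  # positive margin
mis_ranked = dpo_loss(-20.0, -12.0, -14.0, -19.0)   # negative margin
print(f"{well_ranked:.4f} < {mis_ranked:.4f}")
```

The `beta` temperature (0.1 here, a common default) scales how strongly the policy is pushed away from the reference model; this trade-off between preference fit and staying close to the reference is one of the comparisons the deck draws between DPO-style methods and PPO.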