🧠 Introducing Ellora Recipe #6: Execution-Aware World Model for Qwen3-4B-Thinking
Teaching LLMs to understand not just what code does, but HOW it executes at runtime!
Inspired by Meta's CWM (Code World Model) research, this LoRA adapter adds execution awareness to Qwen3-4B-Thinking-2507. The model learns to predict variable states, trace program execution step-by-step, and debug code by understanding runtime behavior.
🔍 Key Innovation:
We combine Qwen3's native thinking capabilities with real Python execution traces captured via sys.settrace(). The model is trained using GRPO with a custom reward function that scores execution prediction accuracy.
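Here's a minimal sketch of how per-line variable states can be captured with sys.settrace() (the recipe's actual tracer lives in the notebook; the function and variable names here are illustrative):

```python
import sys

def capture_trace(func, *args):
    """Run func(*args), recording (line number, locals) at each executed line."""
    trace = []

    def tracer(frame, event, arg):
        # Snapshot local variables on every 'line' event inside the target function
        if event == "line" and frame.f_code is func.__code__:
            trace.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer  # returning the tracer keeps line-level tracing active

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always detach the tracer
    return result, trace

def example(n):
    total = 0
    for i in range(n):
        total += i
    return total

result, trace = capture_trace(example, 3)
for lineno, local_vars in trace:
    print(lineno, local_vars)  # line number plus the variable snapshot at that point
```

Pairing each executed line with its locals snapshot gives the ground-truth states the model is trained to predict.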
📊 Training Approach:
- Hybrid Magpie-style code generation
- Real execution tracing for ground truth
- Self-supervised learning (no manual annotations!); see the reward sketch after this list
- 298 training samples with execution traces
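Because the traces provide ground truth for free, the GRPO reward can be computed by direct comparison of predicted and actual variable states. A minimal sketch of such a state-matching reward (the scoring scheme and names are assumptions; the recipe's actual reward function is in the notebook):

```python
def execution_reward(predicted_states, true_states):
    """Score a predicted execution trace against the ground-truth trace.

    Both arguments map line numbers to {variable: value} dicts; the reward is
    the mean fraction of correctly predicted variables per traced line.
    (Illustrative scheme; the recipe's actual reward may differ.)
    """
    if not true_states:
        return 0.0
    per_line_scores = []
    for lineno, true_vars in true_states.items():
        pred_vars = predicted_states.get(lineno, {})
        if true_vars:
            matches = sum(pred_vars.get(k) == v for k, v in true_vars.items())
            per_line_scores.append(matches / len(true_vars))
        else:
            per_line_scores.append(1.0 if not pred_vars else 0.0)
    return sum(per_line_scores) / len(per_line_scores)

# Example: 1 of 2 variables right on line 3, both right on line 4 -> 0.75
reward = execution_reward(
    {3: {"total": 0, "i": 99}, 4: {"total": 0, "i": 0}},
    {3: {"total": 0, "i": 0}, 4: {"total": 0, "i": 0}},
)
print(reward)  # 0.75
```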
✨ What it does:
- Predicts variable states at each line of code
- Explains execution flow with thinking tags
- Helps debug by understanding runtime behavior
- Works as a "neural debugger"; see the usage sketch below
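To try it, the adapter can be loaded on top of the base model with PEFT. A minimal usage sketch (the base model HF id, prompt, and generation settings are assumptions):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B-Thinking-2507"  # assumed base model id
adapter_id = "codelion/Qwen3-4B-execution-world-model-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

code = "x = [1, 2, 3]\ny = sum(x)\ny *= 2"
messages = [{"role": "user",
             "content": f"Trace this code line by line and give the variable states:\n{code}"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens (thinking + trace prediction)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```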
🎯 Results:
- 20% overall accuracy on execution prediction
- 33.3% mean state accuracy
- Trained on Qwen3-4B-Thinking (262K context, 4B params)
🔗 Links:
Model: codelion/Qwen3-4B-execution-world-model-lora
Dataset: codelion/execution-world-model-dataset
GitHub Recipe: https://github.com/codelion/ellora
Notebook: https://github.com/codelion/ellora/blob/main/Ellora_Recipe_6_Execution_World_Model_Thinking_LoRA.ipynb
Part of the Ellora project - standardized LoRA recipes for enhancing LLM capabilities. All recipes use self-supervised data generation and work with existing infrastructure (PEFT, LoRAX, vLLM).
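For serving, the same adapter should plug into vLLM's LoRA support. A minimal sketch (assumed base model id; prompt and sampling settings are illustrative):

```python
from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Download the adapter weights locally; LoRARequest expects a local path
lora_path = snapshot_download("codelion/Qwen3-4B-execution-world-model-lora")

llm = LLM(model="Qwen/Qwen3-4B-Thinking-2507", enable_lora=True)  # assumed base id
outputs = llm.generate(
    ["Trace this code line by line:\nx = 2\nx **= 3"],
    SamplingParams(temperature=0.6, max_tokens=512),
    lora_request=LoRARequest("exec_wm", 1, lora_path),
)
print(outputs[0].outputs[0].text)
```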
#LLM #LoRA #CodeGeneration #WorldModel #Qwen #AI #MachineLearning