apple/DiffuCoder-7B-cpGRPO

	---
	license: unknown
	base_model:
	- apple/DiffuCoder-7B-Instruct
	tags:
	- code
	- text-diffusion-model
	- diffusion large language model
	---

	### DiffuCoder-7B-cpGRPO

	The DiffuCoder-7B-cpGRPO variant further refines DiffuCoder-7B-Instruct with reinforcement learning via coupled-GRPO.

	Training recipe:

	- Initialized from DiffuCoder-7B-Instruct, then post-trained with coupled-GRPO on 21K code samples for 1 epoch.
	- coupled-GRPO significantly improves DiffuCoder's performance on code generation benchmarks (+4.4% on EvalPlus) and reduces reliance on autoregressive (AR) bias during decoding.
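
This card does not spell out how coupled-GRPO works, so the following is only a rough sketch of two ideas behind it: the group-relative advantage used by GRPO-style methods, and the complementary mask pairs that the paper's abstract describes for coupled sampling. The function names, the `mask_ratio` parameter, and the standard-deviation normalization are illustrative assumptions, not the authors' training code.

```python
import random
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: normalize each reward against its own group.

    Assumption: mean/std normalization as in the generic GRPO formulation;
    the exact variant used for this model is described in the paper.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def complementary_masks(num_tokens, mask_ratio, rng):
    """Build two masks over completion tokens that cover every position once.

    A position masked in mask_a is left visible in mask_b and vice versa,
    so each token is scored under exactly one of the two noised views.
    """
    k = round(num_tokens * mask_ratio)
    masked = set(rng.sample(range(num_tokens), k))
    mask_a = [i in masked for i in range(num_tokens)]
    mask_b = [not m for m in mask_a]
    return mask_a, mask_b


rng = random.Random(0)
# Hypothetical rewards for four completions sampled from one prompt.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
mask_a, mask_b = complementary_masks(num_tokens=8, mask_ratio=0.25, rng=rng)
```

Completions that beat their group's mean reward get a positive advantage, and the paired masks guarantee that every completion token contributes to the likelihood estimate in one of the two passes.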


	#### More details and usage examples:

	- Paper: [DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation](https://arxiv.org/abs/2506.20639)

	- GitHub: https://github.com/apple/ml-diffucoder
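
For orientation, here is a minimal loading sketch. It assumes the repository's remote code exposes Dream's `diffusion_generate` method (the card only says that Dream's generation utils are reused), and the sampling values below are illustrative placeholders rather than recommended settings. The helper names `load`, `generation_kwargs`, and `generate` are hypothetical. Refer to the GitHub repository for the prompt template and the authors' own examples.

```python
MODEL_ID = "apple/DiffuCoder-7B-cpGRPO"


def load(model_id=MODEL_ID, device="cuda"):
    """Load tokenizer and model; trust_remote_code pulls in the custom model class."""
    # Heavy imports are local so this module can be imported without torch.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    return tokenizer, model.to(device).eval()


def generation_kwargs(max_new_tokens=256, tokens_per_step=1):
    """Illustrative decoding settings (assumed, following Dream's interface)."""
    return {
        "max_new_tokens": max_new_tokens,
        # Fewer diffusion steps means more tokens are unmasked per step.
        "steps": max_new_tokens // tokens_per_step,
        "temperature": 0.4,
        "top_p": 0.95,
        "alg": "entropy",
        "alg_temp": 0.0,
        "return_dict_in_generate": True,
    }


def generate(prompt, tokenizer, model, **overrides):
    """Run diffusion decoding on a plain-text prompt and return the new text."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    kwargs = {**generation_kwargs(), **overrides}
    with torch.no_grad():
        out = model.diffusion_generate(
            inputs.input_ids, attention_mask=inputs.attention_mask, **kwargs
        )
    # Drop the prompt tokens and decode only the generated continuation.
    new_tokens = out.sequences[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call would look like `tokenizer, model = load()` followed by `generate("Write a function that reverses a string.", tokenizer, model)`. This requires a GPU with enough memory for a 7B model in bfloat16.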

	#### Acknowledgement
	To power this Hugging Face model release, we reuse [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B)'s modeling architecture and generation utilities.