Maozhou Ge

Gmc2 · GHGmc2

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 1 month ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 94

liked a model about 1 month ago

deepseek-ai/DeepSeek-V3.2

Text Generation • 685B • Updated Dec 1, 2025 • 114k • 1.06k

upvoted a collection about 2 months ago

LLaDA 2.0

7 items • Updated 9 days ago • 39

upvoted an article about 2 months ago

Finetune Stable Diffusion Models with DDPO via TRL

Article • Sep 29, 2023 • 19

liked a model about 2 months ago

moonshotai/Kimi-K2-Thinking

Text Generation • Updated Nov 8, 2025 • 371k • 1.59k

liked a Space 2 months ago

The Smol Training Playbook

The secrets to building world-class LLMs

upvoted a collection 2 months ago

Qwen2.5

Qwen2.5 language models, pretrained and instruction-tuned, in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 2 days ago • 672

upvoted an article 2 months ago

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

Article • Aug 18, 2025 • 31

liked a dataset 2 months ago

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated May 8, 2025 • 3.91M • 5.65k • 628

upvoted a collection 2 months ago

InternVL3.5-Core

This collection includes only the InternVL3.5 checkpoints that have completed the full training pipeline (i.e., Pretraining, SFT, MPO, Cascade RL). • 30 items • Updated Sep 28, 2025 • 12

upvoted 2 collections 3 months ago

Nemotron-Pre-Training-Datasets

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 10 days ago • 85

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with Model Optimizer. • 46 items • Updated 10 days ago • 68

liked a dataset 3 months ago

lmms-lab/multimodal-open-r1-8k-verified

Viewer • Updated Jan 27, 2025 • 7.69k • 4.49k • 68

upvoted an article 3 months ago

Fixing Gradient Accumulation

Article • Oct 16, 2024 • 63

liked a model 3 months ago

google/siglip2-so400m-patch14-384

Zero-Shot Image Classification • 1B • Updated Feb 21, 2025 • 326k • 70

liked a dataset 3 months ago

Salesforce/Webscale-RL

Viewer • Updated Oct 14, 2025 • 1.11M • 734 • 81

upvoted a paper 3 months ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 18

liked a model 3 months ago

deepseek-ai/DeepSeek-V3.2-Exp-Base

Text Generation • 685B • Updated Oct 9, 2025 • 944 • 54

upvoted a collection 3 months ago

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 511

upvoted a paper 3 months ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16, 2025 • 117