
Fan Zhou

koalazf99
https://koalazf99.github.io/
  • FaZhou_998
  • koalazf99

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

new activity 3 days ago
nvidia/Nemotron-CrossThink: When will the models be accessible?
new activity 5 days ago
LLM360/MegaMath: Megamath-code parquets do not contain text column
updated a dataset 6 days ago
nanoverl/aime2025_repeated_8x

Organizations

  • Spaces-explorers
  • XLang NLP Lab
  • SII - GAIR
  • OpenLemur
  • LLM360
  • GAIR-ProX
  • code-world-model
  • Chinese LLMs on Hugging Face
  • Sailor2
  • HKUSTNLP-VLM
  • Sea AI Lab-Sailor
  • Data Is Better Together Contributor
  • finemath
  • Computer Intelligence Project
  • nanoverl
  • OctoThinker

Collections 3

🐙 OctoThinker
Revisiting Mid-Training In the Era of RL Scaling
  • OctoThinker/OctoThinker-8B-Hybrid-Base

    Updated 20 days ago • 1 • 2
  • OctoThinker/OctoThinker-8B-Short-Base

    Updated 20 days ago • 1
  • OctoThinker/OctoThinker-8B-Long-Base

    Updated 20 days ago • 33
  • OctoThinker/OctoThinker-3B-Short-Zero

    Updated 20 days ago • 2
💎 MegaMath
An Open Math Pre-training Dataset with 370B Tokens.
  • MegaMath: Pushing the Limits of Open Math Corpora

    Paper • 2504.02807 • Published Apr 3 • 30
  • LLM360/MegaMath

    Viewer • Updated Apr 9 • 217M • 33k • 89
  • LLM360/MegaMath-Llama-3.2-3B

    Text Generation • Updated 28 days ago • 27 • 4
  • LLM360/MegaMath-Llama-3.2-1B

    Text Generation • Updated 28 days ago • 54 • 1

Papers 7

arxiv:2504.02807
arxiv:2502.12982
arxiv:2412.17451
arxiv:2409.17115

Models 0

None public yet

Datasets 0

None public yet