Hugging Face
Adam Yanxiao Zhao
sdpkjc
2 followers · 9 following
https://sdpkjc.com
sdpkjc_adam
sdpkjc
yanxiao-zhao
AI & ML interests
Reinforcement Learning
Recent Activity
updated a dataset 12 days ago: sdpkjc/SATQuest-RFT-3k
updated a dataset 12 days ago: sdpkjc/SATQuest
updated a dataset 13 days ago: sdpkjc/SATQuest-RFT-3k
sdpkjc's models (100), sorted by recently updated
sdpkjc/Qwen2.5-0.5B-SFT-24quiz-checkpoint-800 • Text Generation • 0.5B • Updated May 22 • 7
sdpkjc/Qwen2.5-0.5B-SFT-24quiz-checkpoint-300 • Text Generation • 0.5B • Updated May 22 • 6
sdpkjc/Qwen2.5-1.5B-Instruct-FT-DPO • Text Generation • 0.1B • Updated Jan 22 • 3
sdpkjc/SmolLM2-FT-DPO • Text Generation • 0.1B • Updated Jan 22 • 4
sdpkjc/SmolLM2-FT-MyDataset • Text Generation • 0.1B • Updated Jan 21 • 4
sdpkjc/Ant-v4-ppo_fix_continuous_action-seed5 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Ant-v4-ppo_fix_continuous_action-seed4 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Ant-v4-ppo_fix_continuous_action-seed3 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Ant-v4-ppo_fix_continuous_action-seed2 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Ant-v4-ppo_fix_continuous_action-seed1 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Humanoid-v4-ppo_fix_continuous_action-seed5 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Humanoid-v4-ppo_fix_continuous_action-seed4 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Humanoid-v4-ppo_fix_continuous_action-seed3 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Humanoid-v4-ppo_fix_continuous_action-seed2 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Humanoid-v4-ppo_fix_continuous_action-seed1 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Walker2d-v4-ppo_fix_continuous_action-seed2 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Walker2d-v4-ppo_fix_continuous_action-seed4 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Walker2d-v4-ppo_fix_continuous_action-seed5 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Walker2d-v4-ppo_fix_continuous_action-seed3 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Walker2d-v4-ppo_fix_continuous_action-seed1 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Swimmer-v4-ppo_fix_continuous_action-seed2 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/HalfCheetah-v4-ppo_fix_continuous_action-seed1 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/HalfCheetah-v4-ppo_fix_continuous_action-seed4 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/HalfCheetah-v4-ppo_fix_continuous_action-seed5 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/HalfCheetah-v4-ppo_fix_continuous_action-seed3 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Swimmer-v4-ppo_fix_continuous_action-seed3 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Swimmer-v4-ppo_fix_continuous_action-seed5 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Swimmer-v4-ppo_fix_continuous_action-seed4 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/HalfCheetah-v4-ppo_fix_continuous_action-seed2 • Reinforcement Learning • Updated Jan 20, 2024
sdpkjc/Swimmer-v4-ppo_fix_continuous_action-seed1 • Reinforcement Learning • Updated Jan 20, 2024