Stanford Autonomous Agent Lab
university
https://www.autonomousagents.stanford.edu/
AI & ML interests
None defined yet.
Recent Activity
sangttruong authored a paper 6 days ago
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code
sangttruong authored a paper 6 days ago
Reliable and Efficient Amortized Model-based Evaluation
danielfein updated a collection 8 days ago
LitBench
Team members: 8
SAA-Lab's datasets: 115
SAA-Lab/LitBench-Test • Viewer • Updated 8 days ago • 2.38k • 179
SAA-Lab/LitBench-Test-Release • Viewer • Updated 10 days ago • 2.38k • 29
SAA-Lab/LitBench-Test-IDs-Complete-Final • Viewer • Updated 10 days ago • 2.48k • 36
SAA-Lab/LitBench-Test-IDs-Complete • Viewer • Updated 11 days ago • 2.48k • 67
SAA-Lab/LitBench-Test-Enhanced • Viewer • Updated 11 days ago • 2.48k • 61
SAA-Lab/LitBench-Test-IDs • Viewer • Updated 11 days ago • 2.48k • 81
SAA-Lab/SLPHelmBenchmarkOutput • Preview • Updated 13 days ago • 104
SAA-Lab/SLPHelmUltraSuite • Viewer • Updated May 16 • 7.54k • 244
SAA-Lab/LitBench-Rationales • Viewer • Updated May 16 • 43.7k • 41
SAA-Lab/LitBench-Train • Viewer • Updated May 16 • 43.8k • 49
SAA-Lab/human-exp-1 • Viewer • Updated May 15 • 40 • 7
SAA-Lab/SLPHelmDataset • Viewer • Updated May 15 • 19.4k • 513
SAA-Lab/wp_non_length_corrected • Viewer • Updated May 14 • 65.5k • 10
SAA-Lab/SLPHelmManualLabels • Viewer • Updated May 14 • 926 • 62
SAA-Lab/wp_shp • Preview • Updated May 14 • 9
SAA-Lab/wp_naive • Viewer • Updated May 14 • 395k • 10
SAA-Lab/test_jan25-cwv-genrm_qwen1.5b-ckptNone • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test_jan25-cwv-genrm_qwen3b-ckptNone • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test_jan25-cwv-genrm_qwen7b-ckptNone • Viewer • Updated May 13 • 155 • 7
SAA-Lab/test_jan25-cwv-genrm_llama1b-ckptNone • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test_jan25-cwv-genrm_llama3b-ckptNone • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test_jan25-cwv-genrm_llama8b-ckptNone • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test_jan25-cwv-genrm_cot_qwen1.5b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 8
SAA-Lab/test_jan25-cwv-genrm_cot_qwen3b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 5
SAA-Lab/test_jan25-cwv-genrm_cot_qwen7b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 5
SAA-Lab/test_jan25-cwv-genrm_cot_llama1b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 5
SAA-Lab/test_jan25-cwv-genrm_cot_llama3b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 5
SAA-Lab/test_jan25-cwv-genrm_cot_llama8b-ckptglobal_step_324 • Viewer • Updated May 13 • 155 • 6
SAA-Lab/test-jan24-cwv-genrm_cot_qwen1.5b-ckptglobal_step_324 • Viewer • Updated May 13 • 796 • 5
SAA-Lab/test-jan24-cwv-genrm_cot_qwen3b-ckptglobal_step_324 • Viewer • Updated May 13 • 796 • 5