liyucheng committed
Commit 259df53 · 1 Parent(s): cb0f4bc

Update README.md
Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -13,7 +13,7 @@ pinned: false
 
 # "Uncheatable" LLMs Evaluation - LatestEval
 
-Humans receive new test questions every exam, but LLMs? They've been evaluated with the same benchmarks for too long. Why not assess LLMs with fresh test just like we test our students? In this project, we introduce LatestEval, which automatically constructs language model benchmarks using the latest materials (e.g., arXiv, BBC, Wikipedia, etc.) to prevent "cheating" and data contamination.
+Humans receive new test questions every exam, but LLMs? They've been evaluated with the same benchmarks for too long. Why not assess LLMs with fresh test just like we test our students? In this project, we introduce LatestEval, which automatically constructs language model benchmarks using the latest materials (e.g., arXiv, BBC, GitHub, etc.) to prevent "cheating" and data contamination.
 
 **News!!**
 