WildEval

non-profit
wild_eval

AI & ML interests

None defined yet.

Recent Activity

yuntian-deng  authored a paper 14 days ago
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
yuntian-deng  authored a paper 14 days ago
WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries
yuntian-deng  authored a paper 14 days ago
The Leaderboard Illusion

Team members: Bill Yuchen Lin, Yuntian Deng, Abhilasha Ravichander, Valentina Pyatkin, Khyathi Raghavi Chandu, Faeze Brahman, Ronan Le Bras, Dongfu Jiang, Chengsong Huang

spaces 1

pinned
Running
7

Zebra Logic Bench

🦓

Explore and evaluate Zebra Logic models

Apr 11

models 0

None public yet

datasets 9

WildEval/ZebraLogic

Viewer • Updated Feb 4 • 4.26k • 662 • 5

WildEval/G-PlanET

Viewer • Updated Aug 1, 2024 • 1.42k • 16

WildEval/ZeroEval

Viewer • Updated Jul 23, 2024 • 4.61k • 1.11k

WildEval/WildBench-V2

Viewer • Updated May 22, 2024 • 2.05k • 31

WildEval/WildBench-Results-v2-internal

Viewer • Updated May 21, 2024 • 30k • 61

WildEval/WildBench-Results-V2

Viewer • Updated May 20, 2024 • 10.2k • 37

WildEval/WildBench-v2-dev

Viewer • Updated Apr 19, 2024 • 5.99k • 61

WildEval/WildBench-dev

Updated Apr 19, 2024 • 5 • 1

WildEval/NaturalChats

Viewer • Updated Apr 18, 2024 • 641k • 16