Ai2

non-profit

Verified

https://allenai.org/

allen_ai

allenai

AI & ML interests

Building breakthrough AI to solve the world's biggest problems.

Recent Activity

ashish333 authored a paper 16 days ago

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

ashish333 authored a paper 16 days ago

Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering

ashish333 authored a paper 16 days ago

From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project

Articles

Introducing the Open Chain of Thought Leaderboard

allenai's datasets (265)

allenai/IFBench_multi-turn_responses_example

Viewer • Updated Jul 3 • 1.39k • 37

allenai/IFBench_test

Viewer • Updated Jul 3 • 294 • 1.05k • 2

allenai/big-reasoning-traces

Viewer • Updated Jun 30 • 677k • 271 • 7

allenai/omega-transformative

Viewer • Updated Jun 29 • 7.2k • 203 • 6

allenai/omega-compositional

Viewer • Updated Jun 24 • 14.3k • 237 • 1

allenai/omega-explorative

Viewer • Updated Jun 24 • 52.2k • 585 • 5

allenai/IF_sft_data_verified

Viewer • Updated Jun 23 • 31.8k • 304 • 4

allenai/IF_multi_constraints_upto5_no_lang

Viewer • Updated Jun 22 • 95.4k • 23 • 2

allenai/DataDecide-ppl-results

Viewer • Updated Jun 17 • 22.7k • 32 • 2

allenai/ruler_data

Updated Jun 11 • 17

allenai/PRISM

Viewer • Updated Jun 7 • 412k • 152 • 4

allenai/SimpleToM-rich

Viewer • Updated Jun 7 • 4.59k • 5 • 1

allenai/reward-bench-2

Viewer • Updated Jun 4 • 1.87k • 3.89k • 22

allenai/sciriff-yesno

Viewer • Updated Jun 3 • 2.24k • 134

allenai/blog-images

Viewer • Updated Jun 2 • 2 • 33.7k

allenai/WildChat-4M-Full

Updated May 30 • 2

allenai/WildChat-4M

Updated May 30 • 3 • 2

allenai/qasper-yesno

Viewer • Updated May 29 • 649 • 125

allenai/olmOCR-pes2o-0225

Viewer • Updated May 16 • 7.87M • 379 • 4

allenai/discoverybench

Viewer • Updated May 10 • 264 • 402 • 12

allenai/reward-bench-results

Updated May 7 • 4.78k • 3

allenai/DataDecide-data-recipes

Updated May 6 • 5.06k • 8

allenai/olmo-2-0425-1b-preference-mix

Viewer • Updated Apr 30 • 378k • 79 • 4

allenai/DataDecide-eval-results

Viewer • Updated Apr 16 • 1.41M • 104 • 4

allenai/sqa_reranking_eval

Viewer • Updated Apr 15 • 2.43k • 19 • 2

allenai/tulu-3-do-anything-now-eval

Viewer • Updated Apr 11 • 300 • 121 • 1

allenai/tulu-3-harmbench-eval

Viewer • Updated Apr 11 • 320 • 57

allenai/tulu-3-trustllm-jailbreaktrigger-eval

Viewer • Updated Apr 11 • 400 • 100

allenai/super

Viewer • Updated Mar 21 • 801 • 783 • 4

allenai/lmarena-100k-long-sample-prompts

Viewer • Updated Mar 19 • 386 • 11 • 1