Update index.html
index.html CHANGED +3 -0
@@ -108,6 +108,9 @@
 </head>
 <body>
 <h1>LLM Benchmark overview</h1>
+<div>Overview of Benchmarks for LLM Evaluation
+
+As the development and evaluation of large language models (LLMs) continue to evolve, I conducted an overview of the principal benchmarks commonly found in research papers. My goal is to create a clear and comprehensive resource that summarizes what is being tested in LLMs, with concrete examples, key metrics, and direct links to related papers and repositories. This document serves as a centralized matrix that will be continuously updated with insights from future papers I review.</div>
 <div class="filter">
 <label for="metricFilter">Filter by Evaluated task:</label>
 <select id="metricFilter">