crumb/shrink-v1

@@ -3,9 +3,12 @@ datasets:
 - cerebras/SlimPajama-627B
 language:
 - en
 ---
-OpenLLM Leaderboard score: 34.77
 |    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
 |-------------|-------|------|-----:|--------|-----:|---|-----:|

 - cerebras/SlimPajama-627B
 language:
 - en
+tags:
+- llama
 ---
+A roughly 200M-parameter model (I think the parameter count in the graphic here is wrong, but the benchmark values are correct) with the token embedding and language modelling head of Llama2-70b attached, using linear transformations to map from Llama2-70b's 8192-dimensional space down to this model's 1024-dimensional space.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6079949388160e14e4e2e499/PhqViTuOrE7s65WyVRpNX.png)
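To get a feel for the sizes involved, here is a rough back-of-the-envelope sketch in plain Python. It only uses the 8192d and 1024d widths stated above plus the Llama 2 tokenizer's 32,000-token vocabulary. Everything else is an assumption for illustration: the helper `matrix_params` is hypothetical, the projection is treated as a single bias-free linear map, and nothing here says whether the "200m-ish" figure includes or excludes the borrowed matrices.

```python
# Rough parameter arithmetic for the pieces borrowed from Llama2-70b.
# Assumptions (not confirmed by the model card): one bias-free linear
# projection, and an untied embedding / LM head as in Llama 2.

VOCAB_SIZE = 32_000   # Llama 2 tokenizer vocabulary size
TEACHER_DIM = 8_192   # Llama2-70b hidden size, as stated in the card
STUDENT_DIM = 1_024   # this model's hidden size, as stated in the card


def matrix_params(rows: int, cols: int) -> int:
    """Number of weights in a dense rows x cols matrix (no bias)."""
    return rows * cols


# Token embedding table: one 8192d vector per vocabulary entry.
embedding_params = matrix_params(VOCAB_SIZE, TEACHER_DIM)

# Language modelling head: maps an 8192d vector to vocabulary logits.
lm_head_params = matrix_params(TEACHER_DIM, VOCAB_SIZE)

# One linear transformation from the 8192d space down to 1024d.
projection_params = matrix_params(TEACHER_DIM, STUDENT_DIM)

# For comparison: an embedding table trained natively at 1024d.
native_embedding_params = matrix_params(VOCAB_SIZE, STUDENT_DIM)

if __name__ == "__main__":
    for name, count in [
        ("Llama2-70b token embedding", embedding_params),
        ("Llama2-70b LM head", lm_head_params),
        ("8192d -> 1024d projection", projection_params),
        ("native 1024d embedding (comparison)", native_embedding_params),
    ]:
        print(f"{name:40s} {count:>13,d}")
```

Under these assumptions the borrowed embedding and head are each larger than a 200M-parameter body, which may be one reason parameter counts for this kind of model are easy to report inconsistently.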
 |    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
 |-------------|-------|------|-----:|--------|-----:|---|-----:|