Update README.md
README.md
CHANGED
@@ -24,6 +24,18 @@ without relying on the cloud.

 For the MMLU evaluation, we use a 0-shot CoT setting.

+## Speed
+| Model | Memory (GiB) | i9 14900 | 1+13 8gen4 | rk3588 (16G) | rk3576 | Raspberry PI 5 | RDK X5 | rk3566 |
+|---|---|---|---|---|---|---|---|---|
+| SmallThinker 4B + sparse ffn + sparse lm_head | 2.24 | 108.17 | 78.99 | 39.76 | 15.10 | 28.77 | 7.23 | 6.33 |
+| SmallThinker 4B + sparse ffn + sparse lm_head + limited memory | limit 1G | 29.99 | 20.91 | 15.04 | 2.60 | 0.75 | 0.67 | 0.74 |
+| Qwen3 0.6B | 0.6 | 148.56 | 94.91 | 45.93 | 15.29 | 27.44 | 13.32 | 9.76 |
+| Qwen3 1.7B | 1.3 | 62.24 | 41.00 | 20.29 | 6.09 | 11.08 | 6.35 | 4.15 |
+| Qwen3 1.7B + limited memory | limit 1G | 2.66 | 1.09 | 1.00 | 0.47 | - | - | 0.11 |
+| Gemma3n E2B | 1G, theoretically | 36.88 | 27.06 | 12.50 | 3.80 | 6.66 | 3.46 | 2.45 |
+
+Note: i9 14900 and 1+13 8gen4 use 4 threads; the other devices use the thread count that achieves the maximum speed.
+
 ## Model Card

 <div align="center">