Update README.md
README.md (CHANGED)
@@ -22,9 +22,10 @@ Here are some of the optimized configurations we have added:
 
 
 ## Performance
-The ONNX models are tested on:
 
-ONNX enables you to run your models on-device across CPU, GPU, NPU. With ONNX you can run your models on any machine across all silica Qualcomm, AMD, Intel, Nvidia
+ONNX enables you to run your models on-device across CPU, GPU, and NPU. With ONNX, you can run your models on any machine across all silicon (Qualcomm, AMD, Intel, Nvidia, etc.).
+
+See the table below for some key benchmarks for Windows GPU and CPU devices that the ONNX models were tested on.
 | **Model** | **Precision** | **Device Type** | **Execution Provider** | **Device** | **Token Generation Throughput** | **Speed up vs base model** |
 | :------------: | :------------: | :------------: | :------------: | :------------: | :------------: | :------------: |
 | deepseek-ai_DeepSeek-R1-Distill-Qwen-1.5B | ONNX | fp16 | CUDA | RTX 4090 | 197.195 | 4X |
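The "run on any machine" claim in the added paragraph comes down to ONNX Runtime execution providers (CUDA for Nvidia GPUs, CPU as the universal fallback, vendor-specific providers for other silicon). As a minimal sketch not taken from this README — it assumes the `onnxruntime` Python package, and falls back to the documented provider names if that package is absent — picking a provider list might look like:

```python
try:
    import onnxruntime as ort  # assumption: the standard ONNX Runtime Python package
    available = ort.get_available_providers()
except ImportError:
    # onnxruntime is not installed here; fall back to the two provider
    # names relevant to the benchmark table (CUDA GPU, plus CPU fallback).
    available = ["CUDAExecutionProvider", "CPUExecutionProvider"]

# Prefer the CUDA provider when present; keep CPU last as the fallback.
providers = [p for p in ("CUDAExecutionProvider", "CPUExecutionProvider")
             if p in available]
print(providers)
```

A session would then be created with `ort.InferenceSession("model.onnx", providers=providers)`; the `model.onnx` filename here is hypothetical, standing in for the exported ONNX model.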