Update README.md
README.md
CHANGED
@@ -34,7 +34,7 @@ The benchmark results demonstrate a level of performance that significantly surp
 
 | Model | HumanEval Pass@1 Score | Note |
 | :---------------------------------- | :--------------------: | :--------------------- |
-| **moelanoby/phi3-M3-V2 (This Model)** | **98.17%** | **Commercial License** |
+| **moelanoby/phi3-M3-V2 (This Model)** | **95.12%/98.17%/98.56%** | **Commercial License** and they are ordered with 0,1,2 self corrections |
 | GPT-4.5 / "Orion" | `~96.00%` | Projected (Late 2025) |
 | Gemini 2.5 Pro | `~95.00%` | Projected (Late 2025) |
 | Claude 4 | `~94.00%` | Projected (Late 2025) |
@@ -121,7 +121,9 @@ except AttributeError:
 
 # (Example generation code would follow here)
 ```
-
+## HUGE NOTES
+- downside: the model might grow more incoherent and less accurate as you add more self corrections
+- recommendations: you could use 1,2,3 self corrections if needed and 2 self corrections is the most recommended
 ## Acknowledgements
 
 - The base of this model utilizes the **Phi-3** architecture developed by Microsoft.
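
The changed table row reports three HumanEval Pass@1 scores, one each for 0, 1, and 2 self corrections. As a rough aid for reading those numbers, the sketch below converts a Pass@1 percentage into the number of solved problems it implies. It assumes the standard 164-problem HumanEval set and a single sample per problem; the README does not state either. The helper names `implied_solved` and `pass_at_1` are hypothetical and are not part of the model's code.

```python
# Assumption: the standard HumanEval benchmark with 164 problems,
# one generated sample per problem (so Pass@1 = solved / total).
HUMANEVAL_PROBLEMS = 164


def implied_solved(pass_at_1_percent: float, n_problems: int = HUMANEVAL_PROBLEMS) -> float:
    """Return the (possibly fractional) number of problems a Pass@1 percentage implies."""
    return pass_at_1_percent / 100.0 * n_problems


def pass_at_1(solved: int, n_problems: int = HUMANEVAL_PROBLEMS) -> float:
    """Return the Pass@1 percentage for a given count of solved problems."""
    return 100.0 * solved / n_problems


# Scores from the updated table row, keyed by number of self corrections.
scores = {0: 95.12, 1: 98.17, 2: 98.56}

for corrections, percent in scores.items():
    print(corrections, round(implied_solved(percent), 2))
```

Under these assumptions the first two scores correspond to about 156 and 161 solved problems. The third does not map to a whole number of problems, so it may have been measured under a different setup, for example averaged over several runs.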