Update README.md
---
language:
- ar
library_name: transformers
tags:
- code
- open-source
---

# M3-V2: An Open Source Model for State-of-the-Art Code Generation
[Apache 2.0 License](https://opensource.org/licenses/Apache-2.0)
[Support via PayPal](https://www.paypal.me/moelanobyzedev)

M3-V2 is a causal language model with a novel architecture that adds built-in reasoning and self-correction passes. The model is **fully open source** under the Apache 2.0 License, making it available for academic, personal, and commercial use.

With the default single self-correction pass, the model achieves a **98.17% Pass@1 score on the HumanEval benchmark**, making it one of the strongest open-source code generation models available today.

---

## Benchmark Performance

The benchmark results demonstrate a level of performance that surpasses many publicly available models.
| Model | HumanEval Pass@1 Score | Note |
| :--- | :---: | :--- |
| **moelanoby/phi3-M3-V2 (This Model)** | **95.12% / 98.17% / 98.56%** | **Apache 2.0 License.** Scores correspond to 0, 1, and 2 self-correction passes; 1 pass is the default. |
| GPT-4.5 / "Orion" | `~96.00%` | Projected (Late 2025) |
| Gemini 2.5 Pro | `~95.00%` | Projected (Late 2025) |
| Claude 4 | `~94.00%` | Projected (Late 2025) |
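For context on the scores above, HumanEval Pass@1 is conventionally computed with the unbiased pass@k estimator, 1 − C(n−c, k)/C(n, k), averaged over tasks. The sketch below shows that computation; the sample counts are invented for illustration and are not this model's actual evaluation data:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    for n generated samples per task of which c passed the tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Invented counts for five tasks, 10 completions generated per task:
passes_per_task = [10, 9, 10, 8, 10]
pass_at_1 = sum(pass_at_k(10, c, 1) for c in passes_per_task) / len(passes_per_task)
print(f"mean pass@1: {pass_at_1:.2%}")
```

With k = 1 this reduces to the fraction of passing samples per task, averaged across tasks.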
---

## Support the Project

M3-V2 is an open-source project, free for everyone to use. I am passionate about creating powerful and accessible AI tools for the community.

If you find this model helpful in your work, research, or personal projects, please consider supporting its development. Contributions help cover server costs, let me dedicate more time to improvements, and fund new open-source models. Every contribution is greatly appreciated!

[**Support via PayPal**](https://www.paypal.me/moelanobyzedev)
---

## License

This model is licensed under the **Apache 2.0 License**. You are free to use, modify, and distribute the model and its source code for any purpose, including commercial applications, subject to the terms of the license. A copy of the license is included in the repository.

## Ethical Considerations

While this model is open source, users are encouraged to use it responsibly. Fine-tuning the model to generate harmful, illegal, or unethical content is strongly discouraged. We advocate for using this technology to build positive and safe applications.
---

## How to Use

This model is publicly available on the Hugging Face Hub.

### Installation
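The install step itself is elided in this view of the diff, but its context line names the dependencies, so the command is:

```shell
pip install torch transformers accelerate
```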
### Python Implementation

You can easily integrate the model into your application. You **must** use `trust_remote_code=True` for the custom architecture to load correctly from the Hub.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "moelanoby/phi3-M3-V2"

print("Loading tokenizer and model...")
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print("Model loaded successfully.")

# --- Controlling the model's self-correction feature ---
# Default is 1 pass. You can adjust it for different performance profiles.
try:
    target_layer_path = "model.layers.15.mlp.gate_up_proj"
    custom_layer = model
    for part in target_layer_path.split('.'):
        custom_layer = getattr(custom_layer, part)

    # Set the number of self-correction passes (e.g., 0, 1, 2, or 3)
    custom_layer.num_correction_passes = 2
    print(f"✅ Number of self-correction passes set to: {custom_layer.num_correction_passes}")
except AttributeError:
    print("⚠️ Could not access the custom layer. The model will run with its default settings.")

# (Example generation code would follow here)
```
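The layer lookup in the `try` block is plain dotted-attribute traversal, so it can be exercised without downloading the model. A standalone illustration on a dummy object (the nested names here are stand-ins for the demo, not the real module tree):

```python
from types import SimpleNamespace

def resolve(obj, dotted_path: str):
    """Walk a dotted attribute path such as 'mlp.gate_up_proj'.
    Raises AttributeError if any hop along the path is missing."""
    for part in dotted_path.split('.'):
        obj = getattr(obj, part)
    return obj

# Dummy stand-in mimicking a model's nested module hierarchy.
layer = SimpleNamespace(num_correction_passes=1)
root = SimpleNamespace(mlp=SimpleNamespace(gate_up_proj=layer))

target = resolve(root, "mlp.gate_up_proj")
target.num_correction_passes = 2
print(target.num_correction_passes)  # → 2
```

A missing hop raises `AttributeError`, which is exactly why the README's snippet wraps the lookup in `try`/`except` and falls back to the model's defaults.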
## Important Notes

- **Downside:** The model may become less coherent or less accurate as you add more self-correction passes. Experiment to find the best balance for your use case.
- **Recommendation:** Use 1, 2, or 3 self-correction passes as needed; **2 passes** is the recommended setting for the best balance of performance and coherence.

---
## Acknowledgements

- The base of this model utilizes the **Phi-3** architecture developed by Microsoft.
- The benchmark results were obtained using the **HumanEval** dataset from OpenAI.
- We thank the open-source community for their continuous contributions to AI research.