Update README.md
README.md CHANGED
@@ -12,9 +12,8 @@ An experimentation regarding 'lasering' each expert to denoise and enhance model
This model is half the size of Mixtral 8x7b Instruct, and it offers basically the same level of performance (we are working to get a better MMLU score).

-Used models (all lasered using laserRMT, except for the base model):

-# Laserxtral - 4x7b
+# Laserxtral - 4x7b (all lasered using laserRMT)

This model is a Mixture of Experts (MoE) made with [mergekit](https://github.com/cg123/mergekit) (mixtral branch). It uses the following base models:
* [cognitivecomputations/dolphin-2.6-mistral-7b-dpo](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo)
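For readers landing on this commit who just want to try the merged MoE checkpoint described above, here is a minimal loading sketch using Hugging Face `transformers`. It is not part of the commit itself, and the repository id and prompt below are assumptions made for illustration only.

```python
# Minimal usage sketch. Assumption: the merged model is published as
# "cognitivecomputations/laserxtral"; substitute the actual repo id if different.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/laserxtral"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # the 4x7b MoE is roughly half the size of Mixtral 8x7b Instruct
    device_map="auto",           # spread layers across available GPUs/CPU
)

prompt = "Explain what a Mixture of Experts model is in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```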