QuantFactory/Replete-LLM-V2.5-Qwen-14b-GGUF

This is a quantized version of Replete-AI/Replete-LLM-V2.5-Qwen-14b, created using llama.cpp.

Original Model Card

Replete-LLM-V2.5-Qwen-14b

Replete-LLM-V2.5-Qwen-14b is a continuously finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not adopt my continuous finetuning method, despite its great benefits and lack of downsides. So I took it upon myself to merge the instruct model with the base model using the TIES merge method.

This version of the model shows higher performance than the original instruct and base models.
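For readers unfamiliar with TIES merging, here is a minimal sketch of the general technique on toy weight vectors. It is not the configuration used for this model: the function names `trim` and `ties_merge`, the `density` and `scale` parameters, and all numbers are hypothetical, and real merges operate on full model tensors (typically through a merging toolkit) rather than Python lists.

```python
from typing import List


def trim(task_vector: List[float], density: float) -> List[float]:
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    keep = max(1, round(len(task_vector) * density))
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base: List[float], finetuned: List[List[float]],
               density: float = 0.5, scale: float = 1.0) -> List[float]:
    """Toy TIES merge: trim task vectors, elect a sign, average agreeing values."""
    # 1. Task vectors: what each finetuned model changed relative to the base.
    task_vectors = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # 2. Trim: drop small-magnitude changes from each task vector.
    trimmed = [trim(tv, density) for tv in task_vectors]

    merged = []
    for i, b in enumerate(base):
        values = [tv[i] for tv in trimmed]
        # 3. Elect sign: the direction with the larger total magnitude wins.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # 4. Disjoint mean: average only the nonzero values that agree with it.
        agreeing = [v for v in values if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        # 5. Add the merged change back onto the base weight.
        merged.append(b + scale * delta)
    return merged


# Hypothetical toy weights, chosen only to show sign conflicts being resolved.
base = [0.0, 1.0, -1.0, 2.0]
model_a = [0.5, 1.0, -1.5, 2.1]
model_b = [-0.25, 1.5, -1.25, 2.0]
print(ties_merge(base, [model_a, model_b], density=0.5))  # [0.5, 1.5, -1.5, 2.0]
```

In the first position the two toy models disagree on direction; the larger change wins and the conflicting one is discarded instead of being averaged in, which is the main difference from a plain weight average.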

Quants:

GGUF: https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF

Benchmarks:

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 34.52 |
| IFEval (0-Shot) | 58.40 |
| BBH (3-Shot) | 49.39 |
| MATH Lvl 5 (4-Shot) | 15.63 |
| GPQA (0-shot) | 16.22 |
| MuSR (0-shot) | 18.83 |
| MMLU-PRO (5-shot) | 48.62 |
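As a quick sanity check, the reported average can be compared against the mean of the six benchmark scores listed above. This sketch assumes the average is a plain unweighted mean; the leaderboard presumably averages unrounded scores, so agreement is only expected to within rounding.

```python
from decimal import Decimal

# Scores copied from the results table, kept as exact decimals.
scores = {
    "IFEval (0-Shot)": Decimal("58.40"),
    "BBH (3-Shot)": Decimal("49.39"),
    "MATH Lvl 5 (4-Shot)": Decimal("15.63"),
    "GPQA (0-shot)": Decimal("16.22"),
    "MuSR (0-shot)": Decimal("18.83"),
    "MMLU-PRO (5-shot)": Decimal("48.62"),
}
reported_avg = Decimal("34.52")

# Unweighted mean of the six rounded scores.
mean = sum(scores.values()) / len(scores)
print(mean)  # 34.515

# The reported value matches to within half a unit in the last decimal place.
print(abs(mean - reported_avg) <= Decimal("0.005"))  # True
```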

Format: GGUF
Model size: 14.8B params
Architecture: qwen2

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit


Model tree for QuantFactory/Replete-LLM-V2.5-Qwen-14b-GGUF

Base model: Qwen/Qwen2.5-14B