name,zero_self_con,zero_cot_self_con,few_self_con,few_cot_self_con
Baichuan-13B-Chat,14.31,18.46,15.68,16.82
Chatglm2-6B,16.06,19.91,26.22,28.37
Chatglm3-6B,30.4,30.7,26.9,37.2
Chinese-Alpaca-2-13B,20.86,23.08,29.75,32.83
Chinese-Llama-2-13B,10.02,19.51,34.51,33.34
Devops-Model-14B-Chat,30.51,47.37,49.38,47.23
Ernie-Bot-4.0,43.66,51.99,44.0,50.0
Gpt-3.5-Turbo,34.82,43.5,39.19,42.58
Gpt-4,,65.49,,63.54
Internlm-7B,20.48,23.85,23.69,26.06
Internlm2-Chat-20B,39.1,37.7,47.7,33.5
Internlm2-Chat-7B,36.8,31.7,46.3,36.9
Llama-2-13B,18.32,34.45,29.14,44.3
Llama-2-70B-Chat,23.64,39.31,39.12,47.9
Llama-2-7B,21.62,27.11,24.85,34.83
Mistral-7B,26.91,30.65,40.52,46.84
Qwen-14B-Chat,36.25,42.51,50.39,59.18
Qwen-72B-Chat,53.19,55.52,58.13,58.99
Qwen-7B-Chat,33.74,34.1,32.7,36.65
Yi-34B-Chat,37.04,52.1,61.19,53.39
Claude-3-Opus,49.6,,,
gemma_2b,20.1,24.2,31.2,35.5
gemma_7b,23.1,34.4,21.4,33.1
Meta-Llama-3-70B-Instruct,38.9,63.4,37.6,59.0
Meta-Llama-3-8B-Instruct,24.7,35.4,19.7,32.9
Qwen1.5-14B-Base,34.0,42.8,57.9,40.2
Qwen1.5-14B-Chat,35.6,41.1,34.7,47.4