ChineseSafe-Benchmark / changelog.md

CHANGELOG

2024-7-16

version: v1.0.0

changed:
- [1]feat: upload the first version

2024-10-26

version: v1.0.1

changed:
- [1]feat: add citation

2024-11-18

version: v1.0.2

changed:
- [1]feat: add three models: Qwen2.5-72B, Qwen2.5-32B, Qwen2-72B
- [2]feat: add subclass: Discrimination

2024-11-24

version: v1.0.3

changed:
- [1]feat: add three Qwen instruct models
- [2]feat: remove Qwen base models
- [3]feat: update some model names

2024-12-28

version: v1.0.4

changed:
- [1]feat: update 9 models per the December to-do list:
    - QwQ-32B-Preview
    - Llama-3.1-70B-Instruct
    - Llama-3.3-70B-Instruct
    - Mistral-Nemo-Instruct-2407
    - Ministral-8B-Instruct-2410
    - Phi-3-small-8k-instruct
    - Phi-3-small-128k-instruct
    - Phi-3-medium-4k-instruct
    - Phi-3-medium-128k-instruct

2025-4-13

version: v1.0.5

changed:
- [1]feat: update 4 models per the February to-do list:
    - phi-4
    - DeepSeek-R1-Distill-Llama-70B
    - Mistral-Small-24B-Instruct-2501
    - Moonlight-16B-A3B-Instruct
- [2]feat: release a test set of 20000 samples

2025-7-1

version: v1.0.6

changed:
- [1]feat: update several models per the April to-do list:
    - Llama-4-maverick
    - Gemini-2.5-flash-preview-05-20
    - Deepseek-chat-v3-0324
    - Qwen3
    - Gemma-3
    - OpenThinker2

2025-7-29

version: v1.0.7

changed:
- [1]feat: update the two models required by Deepexi:
    - Deepexi-Guard-3B
    - Qwen2.5-3B-Instruct

- [2]feat: add a new table, ChineseGuardBench, required by Deepexi