Need4Speed

company
Activity Feed

AI & ML interests

None defined yet.

Recent Activity

wenhuach
posted an update 22 days ago
🚀 AutoRound (https://github.com/intel/auto-round) Now Supports GGUF Export & Custom Bit Settings!

We're excited to announce that AutoRound now supports:
✅ GGUF format export – for seamless compatibility with popular inference engines.
✅ Custom bit settings – tailor quantization to your needs for optimal performance.

Check out these newly released models:
🔹 Intel/Qwen3-235B-A22B-Instruct-2507-gguf-q4km-AutoRound
🔹 Intel/Qwen3-235B-A22B-Instruct-2507-gguf-q2ks-mixed-AutoRound
🔹 Intel/Kimi-K2-Instruct-gguf-q2ks-mixed-AutoRound

Stay tuned! An even more advanced algorithm for some configurations is coming soon.
wenhuach
posted an update 3 months ago
AutoRound (https://github.com/intel/auto-round) has been integrated into vLLM, allowing you to run AutoRound-formatted models directly in the upcoming release.

Besides, we strongly recommend using AutoRound to generate AWQ INT4 models, as AutoAWQ is no longer maintained and manually configuring new models is not trivial due to the need for custom layer mappings.
loubnabnl
posted an update 3 months ago
wenhuach
posted an update 4 months ago
wenhuach
posted an update 5 months ago
Check out the DeepSeek-R1 INT2 model (OPEA/DeepSeek-R1-int2-mixed-sym-inc). This 200GB DeepSeek-R1 model shows only about a 2-percentage-point drop on MMLU (0.8514 to 0.8302), though it is quite slow due to a kernel issue.

| Task | BF16 | INT2-mixed |
| ------------- | ------ | ---------- |
| mmlu | 0.8514 | 0.8302 |
| hellaswag | 0.6935 | 0.6657 |
| winogrande | 0.7932 | 0.7940 |
| arc_challenge | 0.6212 | 0.6084 |
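
To make the size of these gaps easier to read, here is a small sketch that computes the absolute drop (in points) and the relative drop (in percent) for each row. The scores are copied from the table; the `score_drop` helper and the `drops` dictionary are illustrative names, not part of AutoRound or any evaluation tool.

```python
# Scores copied from the table above: task -> (BF16, INT2-mixed).
scores = {
    "mmlu": (0.8514, 0.8302),
    "hellaswag": (0.6935, 0.6657),
    "winogrande": (0.7932, 0.7940),
    "arc_challenge": (0.6212, 0.6084),
}


def score_drop(bf16, int2):
    """Return (absolute drop in points, relative drop in percent).

    Positive values mean the INT2-mixed model scored lower than BF16.
    """
    absolute = (bf16 - int2) * 100
    relative = (bf16 - int2) / bf16 * 100
    return absolute, relative


# Hypothetical helper structure: task -> (absolute, relative) drop.
drops = {task: score_drop(*pair) for task, pair in scores.items()}

for task, (absolute, relative) in drops.items():
    print(f"{task:14s} {absolute:+.2f} points  {relative:+.2f}% relative")
```

A negative value, as on winogrande, means the quantized model scored slightly higher than the BF16 baseline on that task.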
wenhuach
posted an update 6 months ago
wenhuach
posted an update 8 months ago
wenhuach
posted an update 8 months ago
AutoRound has demonstrated strong results even at 2-bit precision for VLMs like Qwen2-VL-72B. Check it out here: OPEA/Qwen2-VL-72B-Instruct-int2-sym-inc.
wenhuach
posted an update 8 months ago
This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

OPEA