Need4Speed

company

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

lvwerra authored a paper about 2 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

loubnabnl authored a paper 2 months ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

zhentaoyu authored a paper 3 months ago

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

View all activity

wenhuach

posted an update 22 days ago

Post

1914

🚀 AutoRound(https://github.com/intel/auto-round) Now Supports GGUF Export & Custom Bit Settings!

We're excited to announce that AutoRound now supports:
✅ GGUF format export – for seamless compatibility with popular inference engines.
✅ Custom bit settings – tailor quantization to your needs for optimal performance.

Check out these newly released models:
🔹Intel/Qwen3-235B-A22B-Instruct-2507-gguf-q4km-AutoRound
🔹Intel/Qwen3-235B-A22B-Instruct-2507-gguf-q2ks-mixed-AutoRound
🔹Intel/Kimi-K2-Instruct-gguf-q2ks-mixed-AutoRound

Stay tuned! An even more advanced algorithm for some configurations is coming soon.

lvwerra

authored a paper about 2 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 65

loubnabnl

authored a paper 2 months ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5 • 46

zhentaoyu

authored a paper 3 months ago

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

Paper • 2503.18860 • Published Mar 24 • 6

wenhuach

posted an update 3 months ago

Post

1902

AutoRound(https://github.com/intel/auto-round) has been integrated into vLLM , allowing you to run AutoRound-formatted models directly in the upcoming release.

Beside, we strongly recommend using AutoRound to generate AWQ INT4 models, as AutoAWQ is no longer maintained and manually configuring new models is not trivial due to the need for custom layer mappings.

loubnabnl

posted an update 3 months ago

Post

3726

SmolVLM is now available on PocketPal — you can run it offline on your smartphone to interpret the world around you. 🌍📱

And check out this real-time camera demo by @ngxson , powered by llama.cpp:
https://github.com/ngxson/smolvlm-realtime-webcam
https://x.com/pocketpal_ai

3 replies

zhentaoyu

authored a paper 3 months ago

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Paper • 2505.04512 • Published May 7 • 36

wenhuach

posted an update 4 months ago

Post

1938

AutoRound(https://github.com/intel/auto-round) has been integrated into Transformers, allowing you to run AutoRound-formatted models directly in the upcoming release. Additionally, we are actively working on supporting the GGUF double-quant format, e.g. q4_k_s, stay tuned!

https://huggingface.co/blog/autoround

lvwerra

authored a paper 4 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 197

loubnabnl

authored a paper 4 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 197

Haihao

authored a paper 5 months ago

Faster Inference of LLMs using FP8 on the Intel Gaudi

Paper • 2503.09975 • Published Mar 13 • 1

wenhuach

posted an update 5 months ago

Post

2537

Check out [DeepSeek-R1 INT2 model( OPEA/DeepSeek-R1-int2-mixed-sym-inc). This 200GB DeepSeek-R1 model shows only about a 2% drop in MMLU, though it's quite slow due to kernel issue.

| | BF16 | INT2-mixed |
| ------------- | ------ | ---------- |
| mmlu | 0.8514 | 0.8302 |
| hellaswag | 0.6935 | 0.6657 |
| winogrande | 0.7932 | 0.7940 |
| arc_challenge | 0.6212 | 0.6084 |

wenhuach

posted an update 6 months ago

Post

748

OPEA Space has released several quantized DeepSeek models, including INT2. Explore them here
OPEA/deepseek-6784a012d91191015587584a

moshew

authored a paper 6 months ago

SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models

Paper • 2502.09390 • Published Feb 13 • 16

loubnabnl

authored a paper 6 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 241

lvwerra

authored a paper 6 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 241

lvwerra

authored a paper 7 months ago

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 64

wenhuach

posted an update 8 months ago

Post

2347

Are we the only providers of INT4 quantized models for Llama 3.2 VL?
OPEA/Llama-3.2-90B-Vision-Instruct-int4-sym-inc
OPEA/Llama-3.2-11B-Vision-Instruct-int4-sym-inc

3 replies

wenhuach

posted an update 8 months ago

Post

1830

AutoRound has demonstrated strong results even at 2-bit precision for VLM models like QWEN2-VL-72B. Check it out here: OPEA/Qwen2-VL-72B-Instruct-int2-sym-inc.

4 replies

wenhuach

posted an update 8 months ago

Post

348

This week, OPEA Space released several new INT4 models, including:
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
allenai/OLMo-2-1124-13B-Instruct
THUDM/glm-4v-9b
AIDC-AI/Marco-o1
and several others.
Let us know which models you'd like prioritized for quantization, and we'll do our best to make it happen!

OPEA

3 replies

AI & ML interests

Recent Activity

Team members 19

need-for-speed's activity