Bertrand Chevrier

kramp

AI & ML interests

Text-to-speech, AI for music writing

Recent Activity

liked a Space 1 day ago
nvidia/parakeet-tdt-0.6b-v2
liked a model 4 days ago
nari-labs/Dia-1.6B
reacted to wolfram's post with 👍 4 days ago

Organizations

Hugging Face, Team 7, huggingPartyParis, Social Post Explorers, private beta for deeplinks, Fine Video, Hugging Face FineVideo

kramp's activity

reacted to wolfram's post with 👍 4 days ago
Post
6979
Finally finished my extensive **Qwen 3 evaluations** across a range of formats and quantisations, focusing on **MMLU-Pro** (Computer Science).

A few take-aways stood out - especially for those interested in local deployment and performance trade-offs:

1️⃣ **Qwen3-235B-A22B** (via Fireworks API) tops the table at **83.66%** with ~55 tok/s.
2️⃣ But the **30B-A3B Unsloth** quant delivered **82.20%** while running locally at ~45 tok/s and with zero API spend.
3️⃣ The same Unsloth build is ~5x faster than Qwen's **Qwen3-32B**, which scores **82.20%** as well yet crawls at <10 tok/s.
4️⃣ On Apple silicon, the **30B MLX** port hits **79.51%** while sustaining ~64 tok/s - arguably today's best speed/quality trade-off for Mac setups.
5️⃣ The **0.6B** micro-model races above 180 tok/s but tops out at **37.56%** - that's why it's not even on the graph (50% performance cut-off).

All local runs were done with LM Studio on an M4 MacBook Pro, using Qwen's official recommended settings.

**Conclusion:** Quantised 30B models now get you ~98% of frontier-class accuracy - at a fraction of the latency, cost, and energy. For most local RAG or agent workloads, they're not just good enough - they're the new default.

Well done, Qwen - you really whipped the llama's ass! And to OpenAI: for your upcoming open model, please make it MoE, with toggleable reasoning, and release it in many sizes. *This* is the future!
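
The ~98% figure in the conclusion can be checked directly from the scores quoted in the post. A minimal sketch in Python, using only the numbers listed above; the dictionary keys and the `relative_to_best` helper are illustrative names, not part of the original evaluation:

```python
# MMLU-Pro (Computer Science) scores in percent, copied from the post above.
scores = {
    "Qwen3-235B-A22B (Fireworks API)": 83.66,
    "30B-A3B Unsloth": 82.20,
    "Qwen3-32B": 82.20,
    "30B MLX": 79.51,
    "0.6B": 37.56,
}


def relative_to_best(scores):
    """Return each score as a percentage of the highest score in the table."""
    best = max(scores.values())
    return {name: 100.0 * value / best for name, value in scores.items()}


relative = relative_to_best(scores)
for name, pct in relative.items():
    print(f"{name}: {pct:.1f}% of the top score")
```

With these inputs the 30B-A3B Unsloth quant lands at roughly 98% of the 235B score, which matches the conclusion, while the 30B MLX port comes out near 95%.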
upvoted an article 18 days ago
Article

Tiny Agents: a MCP-powered agent in 50 lines of code

• 231
upvoted an article 27 days ago
Article

Cohere on Hugging Face Inference Providers 🔥

• 124
upvoted an article 29 days ago
Article

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖

• 45
New activity in huggingface/HuggingDiscussions about 1 month ago
reacted to enzostvs's post with 🔥 about 1 month ago
Post
22761
(not available anymore)
Looking for a logo idea 👀?
I made a new cool space https://huggingface.co/spaces/enzostvs/Logo.Ai to help you design a great logo in seconds!

Here are some examples of what you can do, feel free to share yours too! 🚀