All HF Hub posts

jsulz 
posted an update 1 day ago
It's been a bit since I took a step back and looked at the xet-team's progress migrating Hugging Face from Git LFS to Xet, but every time I do, it boggles the mind.

A month ago there were 5,500 users/orgs on Xet with 150K repos and 4PB. Today?
🤗 700,000 users/orgs
📈 350,000 repos
🚀 15PB

Meanwhile, our migrations have pushed throughput to numbers that are bonkers. In June, we hit upload speeds of 577 Gb/s (crossing 500 Gb/s for the first time).

These are hard numbers to put into context, but let's try:

The latest Common Crawl run from commoncrawl was 471 TB.

We now have ~32 crawls stored in Xet. At peak upload speed we could move the latest crawl into Xet in about two hours.
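
A quick back-of-envelope check of that two-hour figure; a sketch assuming the peak 577 Gb/s is sustained end to end:

# Back-of-envelope: time to move one Common Crawl release at peak Xet upload speed.
crawl_tb = 471                        # latest Common Crawl release, in TB
peak_gbps = 577                       # peak upload throughput, in Gb/s

crawl_bits = crawl_tb * 1e12 * 8      # terabytes -> bits
seconds = crawl_bits / (peak_gbps * 1e9)
print(f"{seconds / 3600:.1f} hours")  # ~1.8 hours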

We're moving to a new phase in the process, so stay tuned.

This shift in gears means it's also time to roll up our sleeves and look at all the bytes we have and the value we're adding to the community.

I already have some homework from @RichardErkhov to look at the dedupe across their uploads, and I'll be doing the same for other early adopters, big models/datasets, and frequent uploaders (looking at you @bartowski 👀)

Let me know if there's anything you're interested in; happy to dig in!
AdinaY 
posted an update about 22 hours ago
Hunyuan-A13B 🔥 New MoE LLM by TencentHunyuan

tencent/Hunyuan-A13B-Instruct

✨80B total / 13B active params
✨256K context window
✨Dual-mode reasoning: fast & slow thinking
✨Efficient inference (GQA + quantization)
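
If you want to poke at it locally, here's a minimal sketch using the standard transformers loading path (untested against this repo; trust_remote_code and the chat-template call are assumptions):

# Minimal sketch: query Hunyuan-A13B-Instruct with transformers.
# trust_remote_code=True is assumed in case the repo ships custom modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Hunyuan-A13B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # only ~13B of the 80B params are active per token
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Explain MoE routing in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))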
Abhaykoul 
posted an update 3 days ago
Introducing Dhanishtha 2.0: World's first Intermediate Thinking Model

Dhanishtha 2.0 is the world's first LLM designed to think in between its responses, unlike other reasoning LLMs, which think just once.

Dhanishtha can think, rethink, self-evaluate, and refine in between responses using multiple <think> blocks.
This technique makes it highly token-efficient: it uses up to 79% fewer tokens than DeepSeek R1.
---

You can try our model at https://helpingai.co/chat.
Also, we're going to open-source Dhanishtha on July 1st.

---
For Devs:
🔑 Get your API key at https://helpingai.co/dashboard
from HelpingAI import HAI  # pip install HelpingAI==1.1.1
from rich import print

hai = HAI(api_key="hl-***********************")

response = hai.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What is the value of ∫0∞𝑥3/𝑥−1𝑑𝑥 ?"}],
    stream=True,
    hide_think=False  # Hide or show the model's intermediate <think> blocks
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="", flush=True)
fdaudens 
posted an update about 14 hours ago
Three big AI copyright updates this week alone. Tracking it all is getting almost impossible!

That’s why @BrigitteTousi and I built this interactive tracker to keep you up to date: fdaudens/ai-copyright-lawsuits

(Prototyped in minutes with DeepSite!)
FlameF0X 
posted an update 1 day ago
SnowflakeCore-G1 development update: We're building a 24-layer transformer with 32K context and 1024 embedding dimensions - pretty ambitious! Even running at batch_size=1 with heavy gradient accumulation, we're hitting memory walls at 300GB RAM. Scaling up to ~1TB will take some time, but the architecture is looking promising. Thanks for following along with the journey! 😅
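
For context, here is a rough parameter-count estimate for the configuration described; a sketch where the vocab size and FFN width are assumed values, not SnowflakeCore's actual config:

# Rough parameter count for a 24-layer decoder with d_model=1024.
# vocab_size and ffn_mult are assumptions for illustration only.
d_model, n_layers = 1024, 24
vocab_size = 50_000                        # assumed
ffn_mult = 4                               # assumed standard 4x FFN width

attn = 4 * d_model * d_model               # Q, K, V, and output projections
ffn = 2 * d_model * (ffn_mult * d_model)   # up and down projections
embed = vocab_size * d_model

params = n_layers * (attn + ffn) + embed
print(f"~{params / 1e6:.0f}M parameters")  # ~350M at these assumed settings

At settings like these the weights themselves are only a few hundred million parameters, which suggests the 300GB pressure at 32K context comes mostly from activations, gradients, and optimizer state rather than from the weights.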
samihalawa 
posted an update 3 days ago
🔥BEST DEBUG PROMPT FOR CLAUDE_CODE
😲FIXES ANY REPO

How can I prompt you so there won't be any more bugs or issues? List the 20 kinds of errors and bugs most common in a codebase like this one, create a comprehensive table of each of them against all the main files and functionality, and fix until you can cross them all out. Don't add complexity. Ignore security. Only fatal problems.
Proceed in the most exhaustive and comprehensive way possible. Don't miss a line of code. SYSTEMATICALLY FIX EVERYTHING

yeonseok-zeticai 
posted an update 3 days ago
🚀 Real-Time On-Device AI Agent with Polaris-4B — Run It Yourself, No Cloud, No Cost

We just deployed a real-time on-device AI agent using the Polaris-4B-Preview model — one of the top-performing <6B open LLMs on Hugging Face.

📱 What’s remarkable?
This model runs entirely on a mobile device, without cloud, and without any manual optimization. It was built using ZETIC.MLange, and the best part?

➡️ It’s totally automated, free to use, and anyone can do it.
You don’t need to write deployment code, tweak backends, or touch device-specific SDKs. Just upload your model — and ZETIC.MLange handles the rest.

🧠 About the Model
- Model: Polaris-4B-Preview
- Size: ~4B parameters
- Ranking: Top 3 on Hugging Face LLM Leaderboard (<6B)
- Tokenizer: Token-incremental inference supported
- Modifications: None — stock weights, just optimized for mobile

⚙️ What ZETIC.MLange Does
ZETIC.MLange is a fully automated deployment framework for On-Device AI, built for AI engineers who want to focus on models — not infrastructure.

Here’s what it does in minutes:
- 📊 Analyzes model structure
- ⚙️ Converts to mobile-optimized format (e.g., GGUF, ONNX)
- 📦 Generates a runnable runtime environment with pre/post-processing
- 📱 Targets real mobile hardware (CPU, GPU, NPU — including Qualcomm, MediaTek, Apple)
- 🎯 Gives you a downloadable SDK or mobile app component — ready to run
And yes — this is available now, for free, at https://mlange.zetic.ai

🧪 For AI Engineers Like You
If you want to:
- Test LLMs directly on-device
- Run models offline with no latency
- Avoid cloud GPU costs
- Deploy to mobile without writing app-side inference code

Then this is your moment. You can do exactly what we did, using your own models — all in a few clicks.

🎯 Start here → https://mlange.zetic.ai

📬 Want to try Polaris-4B in your own app? Reach out at [email protected], or just visit https://mlange.zetic.ai, which is free to use!

Great work @Chancy , @Zhihui , @tobiaslee !
a-r-r-o-w 
posted an update about 21 hours ago
As you might have already heard, FLUX.1-Kontext-dev has been released and has taken the generative community by storm!

In case you haven't come across it, you can get started with Kontext using 🤗 diffusers. See the official [model](black-forest-labs/FLUX.1-Kontext-dev) and [docs](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux#flux).
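
Here's a minimal sketch of what that looks like (assuming a diffusers version that ships FluxKontextPipeline; the input image and edit prompt are placeholders):

# Minimal sketch: image editing with FLUX.1-Kontext-dev via diffusers.
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("input.png")            # placeholder input image
edited = pipe(
    image=image,
    prompt="Make the sky a vivid sunset",  # placeholder edit instruction
    guidance_scale=2.5,
).images[0]
edited.save("edited.png")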

Want to know how inference companies like Fal & Replicate are able to run the model so fast and in under 2 seconds per image? Check out this [gist](https://gist.github.com/a-r-r-o-w/d08c37e8bd3e9c26b4ce80360be148c6) for some details!
fdaudens 
posted an update 1 day ago
This is what efficient AI looks like: Gemma 3n just dropped - a natively multimodal model that runs entirely on your device. No cloud. No API calls.

🧠 Text, image, audio, and video - handled locally.
⚡️ Needs as little as ~2GB of GPU memory to run
🤯 First sub-10B model to hit 1300+ Elo
✅ Plug-and-play with Hugging Face, MLX, llama.cpp, and more.

Plus: Multilingual out of the box (140+ languages), fine-tune in a free Colab notebook.

google/gemma-3n-685065323f5984ef315c93f4
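
If you want the transformers route, here's a minimal sketch (the E2B instruction-tuned checkpoint name is an assumption; pick whichever size in the collection fits your device):

# Minimal sketch: run Gemma 3n locally with the transformers pipeline.
# The checkpoint name "google/gemma-3n-E2B-it" is assumed; see the collection.
import torch
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",            # Gemma 3n is natively multimodal
    model="google/gemma-3n-E2B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "photo.jpg"},  # placeholder image
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]
out = pipe(text=messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])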