All HF Hub posts

mlabonne posted an update 1 day ago

Liquid just released two VLMs, at 450M and 1.6B params!

They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion, making them ideal for on-device deployment in constrained environments like phones.
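To make the native-resolution point concrete, here is a toy sketch of the NaFlex-style idea: instead of squashing every image to a fixed square, pick an aspect-preserving patch grid under a fixed token budget. The patch size and token budget below are illustrative, not LFM2-VL's actual configuration.

```python
# Toy illustration of native-resolution patching (NaFlex-style idea):
# keep the image's aspect ratio and scale the patch grid to fit a token
# budget, rather than resizing to a fixed square and distorting the image.
# Numbers are illustrative; not LFM2-VL's actual configuration.
import math

def patch_grid(height: int, width: int, patch: int = 16, max_tokens: int = 256):
    """Return (rows, cols) of patches, aspect-preserving, with rows*cols <= max_tokens."""
    rows = max(1, round(height / patch))
    cols = max(1, round(width / patch))
    tokens = rows * cols
    if tokens > max_tokens:
        # Shrink both dimensions by the same factor to stay within budget.
        scale = math.sqrt(max_tokens / tokens)
        rows = max(1, math.floor(rows * scale))
        cols = max(1, math.floor(cols * scale))
    return rows, cols

# A 512x1024 image keeps its 1:2 aspect ratio instead of being squashed:
print(patch_grid(512, 1024))
```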

Both are available today on Hugging Face, with inference and fine-tuning Colab notebooks.

LiquidAI/LFM2-VL-450M
LiquidAI/LFM2-VL-1.6B
mrs83 posted an update 2 days ago

Introducing the Computer Says No Dataset: ethicalabs/computer-says-no

An LLM can do almost anything, but should it?

This dataset provides clear examples of when LLMs should decline requests, such as:

- Counting characters (e.g., "number of 'r's in 'raspberry'" – seriously, you’ve got this)
- Solving basic equations (like *5.9 = x + 5.11* – please, show that calculator some love)
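Both examples above really are one-liners in plain Python, which is the point:

```python
# The kinds of requests the dataset says an LLM should decline,
# solved deterministically with plain Python instead.

# Counting characters: number of 'r's in 'raspberry'
r_count = "raspberry".count("r")
print(r_count)  # 3

# Solving a basic equation: 5.9 = x + 5.11  =>  x = 5.9 - 5.11
x = 5.9 - 5.11
print(round(x, 2))  # 0.79
```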

Inspired by Little Britain's iconic "Computer Says No" sketch, we address a critical issue in AI systems today: the waste of using a rocket launcher to swat flies (aka powerful models for trivial tasks).

Goals:
- Reduce waste by saving compute for tasks that actually need it
- Guide users to better tools
- Spark discussion about ethical AI

This isn’t a training set. It’s a provocation: if we don’t define AI's limits, who will?
badaoui posted an update 2 days ago

Is there a "one-size-fits-all" recipe for quantizing Large Language Models? 🤔

As part of my ongoing work in mixed-precision quantization, I've been exploring this question by measuring layer-by-layer sensitivity. The goal is to see if we can find universal rules for which layers can be quantized aggressively without impacting performance. The results are fascinating and reveal two key insights:

1️⃣ Sensitivity profiles are like architectural "fingerprints." Models from the same family share strikingly similar sensitivity patterns. As you can see in the charts below for the Gemma and SmolLM families, the ranking and relative sensitivity of the layers remain remarkably consistent. This suggests that the underlying architecture is a primary driver of a model's quantization behavior.

2️⃣ A "universal" mixed-precision quantization strategy is challenging. While models within a family are similar, these "fingerprints" change dramatically when comparing different architectures like LLaMA, Qwen, and StableLM. This highlights the difficulty in creating a generalized mixed-precision configuration that works optimally across all model families.

However, there is one near-universal truth we uncovered: the mlp.down_proj layer consistently emerges as one of the most sensitive components across all models studied.
This finding strongly resonates with the work in "The Super Weight in Large Language Models" (by Mengxia Yu et al.). The paper identifies that functionally critical parameters, or "super weights," are concentrated in these down_proj layers. Our empirical results provide clear validation for this theory, showing these layers are highly intolerant to precision loss.

In short, while every architecture has a unique sensitivity profile, a fingerprint shaped not only by its core design but also by its specific training dataset and optimization approach, some components remain universally critical!
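As a minimal sketch of this kind of per-layer sensitivity probe (on hypothetical stand-in weights, not a real model; an actual measurement would quantize each layer of the model and track a task metric such as perplexity): quantize a layer's weights to a given bit width, dequantize, and compare against the original. A heavy-tailed weight distribution, with outliers like the down_proj "super weights", degrades faster at the same bit width.

```python
import numpy as np

def quantize_dequantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric per-tensor quantization: round onto a 2^bits-level grid and back."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def sensitivity(w: np.ndarray, bits: int) -> float:
    """Relative reconstruction error of a layer at a given bit width."""
    err = w - quantize_dequantize(w, bits)
    return float(np.linalg.norm(err) / np.linalg.norm(w))

rng = np.random.default_rng(0)
# Stand-ins for two layers: a heavy-tailed weight distribution (outliers)
# forces a large quantization scale, crushing the bulk of the weights.
gaussian_layer = rng.normal(size=(256, 256))
heavy_tailed_layer = rng.standard_t(df=2, size=(256, 256))

for bits in (8, 4, 2):
    print(bits,
          round(sensitivity(gaussian_layer, bits), 4),
          round(sensitivity(heavy_tailed_layer, bits), 4))
```

In this toy setup, the heavy-tailed layer loses far more precision at 4 bits than the Gaussian one, which is the same intolerance to precision loss the down_proj results show.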
What are your thoughts?
ovi054 posted an update about 22 hours ago

Update on ovi054/Qwen-Image-LORA

You can now load a Qwen LoRA in this Space using any of the following input formats:

1. Model ID:
flymy-ai/qwen-image-realism-lora

2. Model link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora

3. Specific file link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/blob/main/flymy_realism.safetensors

4. Direct download link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors

You can also use an external .safetensors download link (if Hugging Face doesn’t block it).

This is useful when a model repository contains multiple weight files and you want to load a specific one.
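All four input forms above reduce to a repo id plus an optional file name. A hypothetical normalizer (my sketch, not the Space's actual code) could look like this:

```python
# Hypothetical helper that normalizes the four accepted input forms
# into (repo_id, filename_or_None). Not the Space's actual implementation.
from urllib.parse import urlparse

def parse_lora_ref(ref: str):
    """Return (repo_id, filename_or_None) from a model id or Hugging Face URL."""
    if "://" not in ref:
        return ref, None  # bare model id, e.g. "flymy-ai/qwen-image-realism-lora"
    parts = urlparse(ref).path.strip("/").split("/")
    repo_id = "/".join(parts[:2])
    # ".../blob/main/<file>" and ".../resolve/main/<file>" name a specific weight
    if len(parts) >= 5 and parts[2] in ("blob", "resolve"):
        return repo_id, "/".join(parts[4:])
    return repo_id, None

print(parse_lora_ref("flymy-ai/qwen-image-realism-lora"))
print(parse_lora_ref(
    "https://huggingface.co/flymy-ai/qwen-image-realism-lora"
    "/resolve/main/flymy_realism.safetensors"))
```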

👉 Try it now: ovi054/Qwen-Image-LORA
prithivMLmods posted an update 1 day ago

Try Liquid AI's all-new multimodal models, LFM2-VL-1.6B & LFM2-VL-450M! The demos come with a Gradio UI and ReportLab support, and both models run on a T4 GPU!

↗ LFM2-VL-1.6B-LiquidAI : https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/LFM2-VL-1.6B-LiquidAI/LFM2-VL-1.6B_ReportLab.ipynb

↗ LFM2-VL-450M-LiquidAI : https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/LFM2-VL-450M-LiquidAI/LFM2-VL-450M_ReportLab.ipynb

To know more about it, visit the Multimodal Outpost Notebooks!!
sergiopaniego posted an update 2 days ago

nroggendorff posted an update about 17 hours ago

No, I did not create those bots that just got banned today.
kanaria007 posted an update 1 day ago

✅ New Article: *Media as Cognitive Infrastructure*

Title:
📰 Protocolic Media: Structured Intelligence and the Future of Cognitive Environments
🔗 https://huggingface.co/blog/kanaria007/protocolic-media

---

Summary:
Media doesn’t just *deliver content* — it *shapes how collective thought moves*.
Every feed, stream, and algorithm is *a scaffold for attention and reasoning*,
determining *what we notice, connect, and forget*.

Structured Intelligence reframes media as *cognitive infrastructure*:
not passive transmission, but *active architecture for collective reasoning*.

> Media isn’t flow —
> *it’s the frame of shared cognition.*

---

Why It Matters:
• Modern media amplifies *bias, noise, and cognitive drift*
• Traditional moderation reacts *after harm occurs*
• Structured approaches support:

* *Traceable content flows with coherence checks*
* *Ethical filtering without black‑box censorship*
* *Reflective scaffolds that encourage deliberate reasoning*

---

What’s Inside:
• Media reframed as *structural mindspace*
• How feeds can *become reflective, rather than addictive*
• Educational and civic implications of *protocol‑aware media*
• Transition from *information delivery to cognition design*

---

📖 Article 15 of the Structured Intelligence Series

Where Article 14 explored *law as structured justification*,
Article 15 shows *media as collective cognition architecture* —
turning information streams into *auditable reasoning flows*.

---

Next: Acting and Structured Performance
The next article explores *performance and roleplay as cognitive architecture*,
revealing how *identity, judgment, and simulation*
can coexist *without losing self‑coherence*.

> From headlines to the stage,
> *structure carries the weight of collective imagination.*
ZennyKenny posted an update 1 day ago

MonsterMMORPG posted an update 1 day ago

Decoding the Shift and Diffusion Models Training Like Qwen Image, FLUX, SDXL, and More: https://huggingface.co/blog/MonsterMMORPG/decoding-the-shift-and-diffusion-models-training

I am planning to focus on a Qwen Image training tutorial and 1-click installers with GUI and presets starting this week. So here is some important info. You don't need to know or understand this, but it is for people who want to learn and understand more.
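For readers who want a head start: my reading of the "shift" that SD3-style flow-matching models (FLUX, Qwen Image) apply to the training timestep schedule is the sigmoid-style remapping below; see the article for the exact formulation. A shift above 1 pushes sampled timesteps toward the high-noise end, which matters more at high resolutions.

```python
# Sketch of the flow-matching timestep "shift" used by SD3-style models:
#   sigma' = shift * sigma / (1 + (shift - 1) * sigma)
# shift = 1 leaves the schedule unchanged; shift > 1 biases toward high noise.
# This is my reading of the technique, not the article's code.

def shift_sigma(sigma: float, shift: float) -> float:
    return shift * sigma / (1 + (shift - 1) * sigma)

for sigma in (0.1, 0.5, 0.9):
    print(sigma, round(shift_sigma(sigma, 3.0), 3))
```

Note that the endpoints are fixed (0 maps to 0, 1 maps to 1), so only the interior of the schedule is redistributed.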