AI & ML interests

None defined yet.

Recent Activity

fdaudens
posted an update about 8 hours ago
Want to learn to build an AI Agent? I put together a cookbook for creating your own news research agent with OpenAI GPT-OSS:

- Searches headlines & specific sites
- Pulls full articles when you need depth
- Summarizes with clickable sources
- Runs in a simple Gradio chat UI
- No GPU, no local setup: just open-weight GPT-OSS models via Hugging Face

If you've been wanting to try agents but weren't sure where to start, this is an end-to-end example you can fork, run, and adapt.

Full guide + code: https://huggingface.co/blog/fdaudens/openai-gpt-oss-agent-inference-providers
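For a feel for the moving parts before opening the cookbook, here is a minimal sketch of the core pattern: an open-weight GPT-OSS model served through Hugging Face inference providers, wrapped in a Gradio chat UI. This is my illustration, not the cookbook's code; the model ID comes from the posts below, and the chat-history format depends on your Gradio version.

```python
# Minimal sketch: chat with GPT-OSS via Hugging Face inference providers
# inside a Gradio chat UI. Illustrative only; see the full guide for the
# actual agent (search tools, article fetching, source citations).
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient(model="openai/gpt-oss-120b")

def respond(message, history):
    msgs = [{"role": "system", "content": "You are a news research assistant."}]
    for user_msg, bot_msg in history:  # assumes tuple-style history
        msgs += [{"role": "user", "content": user_msg},
                 {"role": "assistant", "content": bot_msg}]
    msgs.append({"role": "user", "content": message})
    out = client.chat_completion(msgs, max_tokens=512)
    return out.choices[0].message.content

gr.ChatInterface(respond).launch()
```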
fdaudens
posted an update 2 days ago
What can OpenAI's new open models do with the news? I built a News Agent to find out.

It can answer questions about the news in real time, and every answer comes with original source links so you can dive deeper.

Ask it things like:
- "What are the top news stories today?"
- "What's the latest on artificial intelligence?"
- Follow-up questions on specific stories

Runs with Hugging Face inference providers, letting you compare results from the OpenAI 20B and 120B models.

So far, I'm quite impressed by the capabilities of even the smaller 20B model. Definitely not a perfect project, but curious to hear your thoughts!

fdaudens/gpt-oss-news-agent
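A minimal sketch of the comparison idea, assuming the chat-completion API of huggingface_hub (my illustration, not the Space's code):

```python
# Ask both GPT-OSS sizes the same question and compare side by side.
from huggingface_hub import InferenceClient

QUESTION = "What are the top news stories today?"

for model_id in ("openai/gpt-oss-20b", "openai/gpt-oss-120b"):
    client = InferenceClient(model=model_id)
    out = client.chat_completion(
        [{"role": "user", "content": QUESTION}], max_tokens=400
    )
    print(f"--- {model_id} ---\n{out.choices[0].message.content}\n")
```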
meg
posted an update 3 days ago
BrigitteTousi
posted an update 3 days ago
fdaudens
posted an update 3 days ago
OpenAI's GPT-OSS has sparked ~400 new models on Hugging Face and racked up 5M downloads in less than a week, already outpacing DeepSeek R1's first-week numbers.

For comparison: when R1 launched, I tracked 550 derivatives (across 8 base models) in a week, with ~3M downloads. GPT-OSS is ahead on adoption and engagement.

It's also the most-liked release of any major LLM this summer. The 20B and 120B versions quickly shot past Kimi K2, GLM 4.5, and others in likes.

Most-downloaded GPT-OSS models include LM Studio and Unsloth AI versions:
1๏ธโƒฃ openai/gpt-oss-20b - 2.0M
2๏ธโƒฃ lmstudio-community/gpt-oss-20b-MLX-8bit - 750K
3๏ธโƒฃ openai/gpt-oss-120b - 430K
4๏ธโƒฃ unsloth/gpt-oss-20b-GGUF - 380K
5๏ธโƒฃ lmstudio-community/gpt-oss-20b-GGUF - 330K

The 20B version is clearly finding its audience, showing the power of smaller, faster, more memory- and energy-efficient models. (These numbers don't include calls to the models via inference providers, so the real usage is likely even bigger, especially for the 120B version.)

Open-weight models let anyone build on top. Empower the builders, and innovation takes off. 🚀
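If you want to track numbers like these yourself, a quick sketch with the Hub API (my own snippet, not from the post; counts change constantly):

```python
# List the most-downloaded models matching "gpt-oss" on the Hub.
from huggingface_hub import HfApi

api = HfApi()
for m in api.list_models(search="gpt-oss", sort="downloads", direction=-1, limit=5):
    print(m.id, m.downloads)
```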
BrigitteTousi
posted an update 7 days ago
New interactive viz from AI World showing OpenAI's new open model gpt-oss-120b breaking into the top 50 most-liked models of all time on the Hub in under a day! ☄️☄️☄️
meg
posted an update 8 days ago
🤖 ICYMI: Yesterday, Hugging Face and OpenAI partnered to bring open source GPT to the public. This is a Big Deal in "AI world".

0. Common ground setting: OpenAI is the ChatGPT people. An "open source" model is one whose weights are available; that means the model can be "yours".
1. You donโ€™t have to interact with the company directly, nor give them your interactions, to use the system. The company can't "surveil" you.
2. You can evaluate the unique contributions of their SOTA model much more rigorously than you can when there are collections of models+code behind a closed API. You can find out specifically what the model can and can't do.
3. And you can directly customize it for whatever you'd like. Fine-tuning, wherein you give the model data that's tailored to your use cases and train it some more on that data, is trivial* when you have the model weights. (A minimal sketch follows at the end of this post.)
*Provided you have the compute.
4. You can directly benchmark whatever you'd like. Biases? Energy usage? Strengths/weaknesses? Go for it. You wants it you gots it--this transparency helps people understand SOTA *in general*, not just for this model, but points to, e.g., what's going on with closed Google models as well.
5. One of the most powerful things about "openness" that I've learned is that it cultivates ecosystems of collaborators building on top of one another's brilliance to make systems that are significantly better than they would be if created in isolation.
But, caveat wrt my own philosophy...
6. I do not take it as a given that advancing LLMs is good, and have a lot more to say wrt where I think innovation should focus more. For example, a focus on *data* -- curation, measurement, consent, credit, compensation, safety -- would deeply improve technology for everyone.
7. The transparency this release provides is massive for people who want to *learn* about LLMs. For the next generation of technologists to advance over the current, they MUST be able to learn about what's happening now. (cont...)
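On point 3, a minimal fine-tuning sketch with LoRA via the peft library (my illustration, not part of the original post; the target module names are assumptions that depend on the model architecture, and the training data is a placeholder):

```python
# LoRA fine-tuning sketch: adapt an open-weight model on your own data.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "openai/gpt-oss-20b"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Only the small low-rank adapter matrices are trained; the base stays frozen.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"])  # assumed module names
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# From here, train as usual (e.g., transformers Trainer or TRL's SFTTrainer)
# on data tailored to your use case. *Provided you have the compute.
```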
fdaudens
posted an update 9 days ago
Well, it took just 2 hours for openai/gpt-oss-120b to hit #1 on Hugging Face. Don't remember seeing anything rise that fast!
meg
posted an update 14 days ago
🤖 👾 Thanks so much to BBC News and the stellar Suranjana Tewari for having me on to talk about the US-China relationship in AI, and what it means for AI ethics.
BrigitteTousi
posted an update 20 days ago
This is what Hugging Face is all about. We want everyone, hobbyists, researchers and industry alike, to be able to contribute to AI because everyone is affected by it. Kudos to HF's @irenesolaiman for spreading the word! 🔥🤗
andito
posted an update 22 days ago
Many VLMs claim to process hours of video. But can they follow the story? 🤔
Today, we introduce TimeScope: the benchmark that separates true temporal understanding from marketing hype. Let's see how much VLMs really understand! ⏳

We test three skills that matter for real-world use:
🔎 Localized Retrieval: Find a specific action.
🧩 Information Synthesis: Piece together scattered clues.
🏃 Fine-Grained Perception: Analyze detailed motion (e.g., count how many times a person swings an axe).

The results are in, and they're revealing. Only Gemini 2.5 Pro handles 1-hour-long videos.
Performance drops sharply with duration, proving that long video understanding is still challenging. We've found the breaking points; now the community can start fixing them. 📈

Want to learn more? TimeScope is 100% open-source. Benchmark your model and help us build the next generation of video AI.

📖 Blog: https://huggingface.co/blog/timescope-video-lmm-benchmark
👩‍💻 Leaderboard & Demo: Apollo-LMMs/TimeScope
📊 Dataset: Apollo-LMMs/TimeScope
⚙️ Eval Code: https://github.com/EvolvingLMMs-Lab/lmms-eval
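To try your own model against it, a first step might be pulling the data with the datasets library. The repo ID is from the links above; the split name is an assumption, so check the dataset card:

```python
# Load the TimeScope benchmark data for local evaluation.
from datasets import load_dataset

ds = load_dataset("Apollo-LMMs/TimeScope", split="test")  # split name assumed
print(ds)        # features and row count
print(ds[0])     # one sample: video reference, question, answer, etc.
```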
eliebak
posted an update 24 days ago
Kimi K2 tech report is full of gems as always. Here are my notes on it:

> MuonClip: Pretty crazy how after 70k steps the training stabilizes and the QK-clip is basically inactive. There is also no loss in perf with QK-clip, which is not trivial at all (at small scale, but with an aggressive threshold). Appendix E also has a cool explanation of why Muon makes the logits explode (tl;dr: Muon pushes up the singular values of the update matrix). A sketch of the QK-clip idea follows these notes.
> Sparsity scaling laws to justify their ratio. Their very solid training infra allows the model to be trained at this sparsity level; they could have increased it even more, but as sparsity increases, training becomes less efficient.
> They reduce the number of attention heads to make the model more efficient at long context, since attention heads are a big bottleneck there. They also remove 2 of the 3 "first dense" layers in the DeepSeek-V3 arch.

With the sparsity and the attention heads divided by 2, they achieve an 83% FLOPs improvement compared to the DeepSeek-V3 arch at 128k context.

> Data: Rephrasing is KEY. They do a lot more synthetic data generation and rephrase their corpus into different styles; for longer documents they do it chunk by chunk. I'm (half) surprised that ONLY 1 epoch of data rephrased 10 times (assuming the same number of training tokens, I think?) gives better accuracy than 10 epochs of the same data rephrased once.
> They do rewriting for Math and Knowledge; for Math they apply the SwallowMath recipe and instruct the model to rephrase in a "learning note" style.
> They talk about diversity and probably have some internal stuff/evals to test for it; as always, it's still a bit unclear to me how to properly measure that.

The infra is also very nice, quick summary:
> PP=16 (1F1B schedule, a bit custom), EP=16, ZeRO-1
> No FP8 computation, but FP8 storage for specific layers; selective recomputation for inexpensive blocks; activation offloading to CPU
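As promised, a rough sketch of the QK-clip idea as I read it (my reconstruction, not code from the report; the threshold value and the exact per-head bookkeeping are assumptions):

```python
import torch

def qk_clip_(w_q: torch.Tensor, w_k: torch.Tensor,
             max_logit: float, tau: float = 100.0) -> None:
    # After an optimizer step, if the largest observed attention logit for
    # this head exceeds tau, rescale W_q and W_k in place so their product
    # (and thus the logit) shrinks back to roughly tau. Splitting the factor
    # as a square root keeps the two projections balanced.
    if max_logit > tau:
        gamma = tau / max_logit
        w_q.mul_(gamma ** 0.5)
        w_k.mul_(gamma ** 0.5)
```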
ariG23498
posted an update 25 days ago
fdaudens
posted an update 27 days ago
AudioRAG is becoming real! Just built a demo with ColQwen-Omni that does semantic search on raw audio, no transcription needed.

Drop in a podcast, ask your question, and it finds the exact chunks where it happens. You can also get a written answer.

What's exciting: it skips transcription, making it faster and better at capturing emotion, ambient sound, and tone, surfacing results text search would miss.

- Demo: fdaudens/colqwen-omni-demo
- Blog post from ColQwen team: https://huggingface.co/blog/manu/colqwen-omni-omnimodal-retrieval
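Under the hood, ColQwen-style retrieval scores a query against audio chunks with ColBERT-style late interaction (MaxSim). A sketch of just that scoring step, with embeddings assumed to be precomputed by ColQwen-Omni (my illustration, not the demo's code):

```python
import torch

def maxsim_score(query_emb: torch.Tensor, chunk_emb: torch.Tensor) -> float:
    # query_emb: (query_tokens, dim); chunk_emb: (chunk_tokens, dim).
    # Each query token takes its best-matching chunk token; scores are summed.
    sim = query_emb @ chunk_emb.T
    return sim.max(dim=1).values.sum().item()

# Rank audio chunks for a query (shapes are illustrative).
query = torch.randn(12, 128)
chunks = [torch.randn(40, 128) for _ in range(5)]
scores = [maxsim_score(query, c) for c in chunks]
print("best chunk:", max(range(len(chunks)), key=scores.__getitem__))
```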
fdaudens
posted an update about 1 month ago
You might not have heard of Moonshot AI, but within 24 hours, their new model Kimi K2 shot to the top of Hugging Face's trending leaderboard.

So… who are they, and why does it matter?

Had a lot of fun co-writing this blog post with @xianbao, with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.

🧵 A few standout facts:

1. From zero to $3.3B in 18 months:
Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.

2. A CEO who thinks from the end:
Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI, still a rare ambition among Chinese AI labs.

3. A trillion-parameter model thatโ€™s surprisingly efficient:
Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.

4. The secret weapon: Muon optimizer:
A new training method that doubles efficiency, cuts memory in half, and ran 15.5T tokens with zero failures. Big implications.
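For the curious: the heart of Muon is orthogonalizing each momentum-smoothed update with a Newton-Schulz iteration before applying it. A minimal sketch based on the public reference implementation (coefficients from Keller Jordan's open-source version; illustrative, not Moonshot's production code):

```python
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Push g's singular values toward 1 (approximate semi-orthogonalization)
    # with a quintic Newton-Schulz iteration.
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + 1e-7)          # bound the spectral norm first
    transposed = x.size(0) > x.size(1)
    if transposed:
        x = x.T
    for _ in range(steps):
        A = x @ x.T
        x = a * x + (b * A + c * A @ A) @ x
    return x.T if transposed else x

# Muon step (sketch): buf = mu * buf + grad; weight -= lr * orthogonalize(buf)
```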

Most importantly, their move from closed to open source signals a broader shift in China's AI scene, following Baidu's pivot. But as Yang puts it: "Users are the only real leaderboard."

👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs:
https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained
fdaudens
posted an update about 1 month ago
AI is reshaping everything: how we work, how we feel, even how nations compete.

Today's reads cut across power, emotion, and disruption.

Here's what stood out and why it matters 👇

AI might "solve" loneliness, but this could be a problem, as the discomfort of loneliness shapes us in important ways. 💔 https://t.co/k2Q9le6G0P

A new study warns of significant risks in using AI therapy chatbots, highlighting issues like stigmatization and inappropriate responses. 🤖 https://t.co/EFyW0RbYVl

AI is already showing signs of slashing job openings in the UK, particularly in roles exposed to the technology, suggesting a labor market slowdown. 📉 https://t.co/hhs0BbqIMa

AI firms like OpenAI are poaching Wall Street quants with massive paydays, shifting the talent landscape for building artificial general intelligence. 💰 https://www.businessinsider.com/ai-talent-openai-wall-street-quant-trading-firms-2025-7

Speaking of which: Nvidia CEO Jensen Huang disagrees with Anthropic CEO Dario Amodei on whether AI will create more jobs or trigger a "white-collar apocalypse." Huang believes AI will create vastly more, and better, jobs. ⚔️ https://t.co/YHWhY7qvSq

Can Nvidia convince governments to pay for "sovereign AI"? Politicians are warming to the idea of national AI systems, but it might not reduce dependence on US tech. 🌍 https://t.co/htQDzJAIDu
eliebak
updated a Space about 1 month ago
andito
posted an update about 1 month ago
🧠👁️ Can AI visualize solutions?

Humans often solve visual problems by sketching ideas in our minds. What if Vision-Language Models (VLMs) could do something similar, not by generating full images, but by using internal "mental sketches"?

That's the idea behind Mirage, a new framework that empowers VLMs to reason using latent visual tokens. Instead of just thinking in words, Mirage mixes in abstract visual representations that help the model solve complex tasks.

These aren't photorealistic images. They're compact, internal representations optimized purely to support reasoning.

🔧 Mirage is trained in two phases:

1) Grounding: It learns to produce latent tokens anchored in real images.
2) Refinement: The model drops the images and learns to generate visual tokens on its own.

📈 And yes, it works!
On challenging benchmarks like Visual Spatial Planning, Jigsaw puzzles, and Spatial Attention Tasks, Mirage clearly outperforms GPT-4o and other strong baselines.
Smart sketches > empty words.

By mimicking the way humans visualize solutions, Mirage gives AI a new kind of imagination, one thatโ€™s faster, more efficient, and more human-like.
Kudos to the teams at UMass Amherst and MIT behind this exciting work.
Check the paper: Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (2506.17218)
fdaudens
posted an update about 2 months ago
Three big AI copyright updates this week alone. Tracking it all is getting almost impossible!

That's why @BrigitteTousi and I built this interactive tracker to keep you up to date: fdaudens/ai-copyright-lawsuits

(Prototyped in minutes with DeepSite!)