Florent Daudens

fdaudens

AI & ML interests

AI & Journalism

Recent Activity

updated a Space about 1 hour ago
fdaudens/podcast-jobs
liked a model about 2 hours ago
Lightricks/LTX-Video

Organizations

Hugging Face, Hugging Face OSS Metrics, Hugging Face Smol Models Research, ZeroGPU Explorers, LeRobot, Journalists on Hugging Face, Major TOM, MLX Community, Social Post Explorers, Projet Spinoza, Dev Mode Explorers, Hugging Face for Legal, Hugging Face Discord Community, Big Science Social Impact Evaluation for Bias and Stereotypes, Dataset Tools, Hugging Face Science, Coordination Nationale pour l'IA, Data Is Better Together Contributor, Sandbox, Open R1

fdaudens's activity

posted an update 1 day ago
Hey! I built an AI Agent to query the FOIA API for a workshop at the Hacks/Hackers Summit in Baltimore and you can do it too!

It's a quick proof of concept to demo what agents can do, how to design workflows, and how to approach the coding side.

- Slides https://docs.google.com/presentation/d/1lbf5K0yi213N7uxGnVKJdGWq2i0GayWj4vIcLkVlwD8/edit?usp=sharing
- Colab notebook https://colab.research.google.com/drive/1iw0qZyTni_6BcK0jj1x6gTfjm85NlaGv
- Gradio app: https://huggingface.co/spaces/JournalistsonHF/foia-agent
- MCP version to plug into Claude, Cursor, etc: https://huggingface.co/spaces/JournalistsonHF/foia-mcp-tools

Feel free to use the Gradio app for real FOIA requests, but also to improve it (I'm far from being a good coder) or adapt it for other countries.

And shout-out to everyone who powered through the workshop! 😅
posted an update 11 days ago
Forget everything you know about transcription models - NVIDIA's parakeet-tdt-0.6b-v2 changed the game for me!

Just tested it with Steve Jobs' Stanford speech and was speechless (pun intended). The video isn't sped up.

3 things that floored me:
- Transcription took just 10 seconds for a 15-min file
- Got a CSV with perfect timestamps, punctuation & capitalization
- Stunning accuracy (correctly captured "Reed College" and other specifics)
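As a quick sanity check on the first bullet, the speed can be expressed as a real-time factor: seconds of audio processed per second of compute. This is only a sketch. The function name is made up for illustration, and the inputs are the rounded figures from the post, so the result is approximate.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Rounded figures from the post: a 15-minute file transcribed in about 10 seconds.
audio_seconds = 15 * 60
processing_seconds = 10

speedup = realtime_factor(audio_seconds, processing_seconds)
print(f"{speedup:.0f}x faster than real time")  # prints: 90x faster than real time
```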

NVIDIA also released a demo where you can click any transcribed segment to play it instantly.

The improvement is significant: number 1 on the ASR Leaderboard, 6% error rate (best in class) with complete commercial freedom (cc-by-4.0 license).
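For context, the "error rate" on ASR leaderboards is usually the word error rate (WER): substitutions, deletions and insertions divided by the number of reference words. Assuming that is the metric behind the 6% figure, here is a minimal sketch of how it is computed. The helper name and the two sentences are illustrative, not taken from the model or the leaderboard code, and real evaluations normalize text more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of four reference words.
print(word_error_rate("he attended reed college", "he attended read college"))  # 0.25
```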

Time to update those Whisper pipelines! H/t @Steveeeeeeen for the finding!

Model: nvidia/parakeet-tdt-0.6b-v2
Demo: nvidia/parakeet-tdt-0.6b-v2
ASR Leaderboard: hf-audio/open_asr_leaderboard
reacted to abidlabs's post with ❤️ 12 days ago
HOW TO ADD MCP SUPPORT TO ANY 🤗 SPACE

Gradio now supports MCP! If you want to convert an existing Space, like this one hexgrad/Kokoro-TTS, so that you can use it with Claude Desktop / Cursor / Cline / TinyAgents / or any LLM that supports MCP, here's all you need to do:

1. Duplicate the Space (in the Settings Tab)
2. Upgrade the Gradio sdk_version to 5.28 (in the README.md)
3. Set mcp_server=True in launch()
4. (Optionally) add docstrings to the function so that the LLM knows how to use it, like this:

def generate(text, speed=1):
    """
    Convert text to speech audio.

    Parameters:
        text (str): The input text to be converted to speech.
        speed (float, optional): Playback speed of the generated speech.
    """

That's it! Now your LLM will be able to talk to you 🤯
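The steps above can be sketched end to end with a tiny stand-in app. This is an illustration under assumptions, not the Kokoro Space's code: `count_words`, `build_demo` and `serve` are hypothetical names, and a word counter replaces text-to-speech so the example runs without a model. It assumes Gradio 5.28 or later; as far as I know, the MCP extra is installed with `pip install "gradio[mcp]"`.

```python
def count_words(text: str) -> int:
    """
    Count the whitespace-separated words in a piece of text.

    Parameters:
        text (str): The input text to analyze.
    """
    # The docstring above is what an MCP client would read to learn
    # how to call this tool (step 4).
    return len(text.split())


def build_demo():
    """Wrap the function in a basic Gradio interface."""
    # Imported lazily so count_words can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=count_words,
        inputs=gr.Textbox(label="Text"),
        outputs=gr.Number(label="Word count"),
    )


def serve():
    """Start the web UI and, via mcp_server=True, the MCP server (step 3)."""
    build_demo().launch(mcp_server=True)
```

Calling `serve()` starts the app. The function itself stays plain Python, so it can be tried directly, for example `count_words("hello MCP world")`.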
posted an update 12 days ago
I just gave my chatbots a massive upgrade: they can now generate audio from text, modify images - you name it. Here's how:

The Gradio team shipped MCP support. That means you can plug any AI app built with it into Claude or Cursor using the Model Context Protocol (MCP) - think of it like a USB port for LLMs.

I put it to the test:
- Whipped up a quick text-to-speech app with Kokoro on HF (with an LLM riding shotgun, naturally)
- Added "mcp_server=True" in the code
- Connected it to Claude

Now I can generate audio from any text. The possibilities are next-level: you can potentially plug any of the 500K+ AI apps on Hugging Face into your favorite LLM.

Is this the new UI for AI?

- My tts app (feel free to use/duplicate it): fdaudens/kokoro-mcp
- Blog post: https://huggingface.co/blog/gradio-mcp
posted an update 13 days ago
Want to know which AI models are least likely to hallucinate - and how to keep yours from spiking hallucinations by 20%?

A new benchmark called Phare, by Giskard, tested leading models across multiple languages, revealing three key findings:

1️⃣ Popular models aren't necessarily factual. Some models ranking highest in user satisfaction benchmarks like LMArena are actually more prone to hallucination.

2️⃣ The way you ask matters - a lot. When users present claims confidently ("My teacher said..."), models are 15% less likely to correct misinformation vs. neutral framing ("I heard...").

3️⃣ Telling models to "be concise" can increase hallucination by up to 20%.

What's also cool is that the full dataset is public - use it to test your own models or dive deeper into the results! H/t @davidberenstein1957 for the link.

- Study: https://www.giskard.ai/knowledge/good-answers-are-not-necessarily-factual-answers-an-analysis-of-hallucination-in-leading-llms
- Leaderboard: https://phare.giskard.ai/
- Dataset: giskardai/phare
posted an update 20 days ago
reacted to yjernite's post with 🔥 27 days ago
Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

That requires actually reading the code though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports and it works pretty OK 🙌

The app works in four stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs

It comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: yjernite/space-privacy
- reach out to collab: yjernite/space-privacy
replied to their post 27 days ago
posted an update 27 days ago
Just tested something this morning that feels kind of game-changing for how we publish, discover, and consume news with AI: connecting Claude directly to the New York Times through MCP.

Picture this: You ask Claude about a topic, and it instantly pulls verified and trusted NYT content - no more guessing if the info is accurate.

The cool part? Publishers stay in control of what they share via API, and users get fast, reliable access through the AI tools they already use. Instead of scraping random stuff off the web, we get a future where publishers actively shape how their journalism shows up in AI.

It's still a bit technical to set up right now, but this could get super simple soon - like installing apps on your phone, but for your chatbot. And you keep the brand connection, too.

Not saying it solves everything, but it's definitely a new way to distribute content - and maybe even find some fresh value in the middle of this whole news + AI shakeup. Early movers will have a head start.

Curious what folks think - could MCPs be a real opportunity for journalism?
reacted to merve's post with 🔥 29 days ago
sooo many open AI releases past week, let's summarize! 🤗
merve/april-11-releases-67fcd78be33d241c0977b9d2

multimodal
> Moonshot AI released Kimi VL Thinking, first working open-source multimodal reasoning model and Kimi VL Instruct, both 16B MoEs with 3B active params (OS)
> InternVL3 released based on Qwen2.5VL, 7 ckpts with various sizes (1B to 78B)

LLMs
> NVIDIA released Llama-3_1-Nemotron-Ultra-253B-v1 an LLM built on Llama 405B for reasoning, chat and tool use
> Agentica released DeepCoder-14B-Preview, fine-tuned version of DeepSeek-R1-Distilled-Qwen-14B on problem-test pairs, along with the compiled dataset
> Zyphra/ZR1-1.5B is a new small reasoning LLM built on R1-Distill-1.5B (OS)
> Skywork-OR1-32B-Preview is a new reasoning model by Skywork

Image Generation
> HiDream releases three new models, HiDream I1 Dev, I1 Full, and I1 fast for image generation (OS)

*OS ones have Apache 2.0 or MIT licenses
posted an update about 1 month ago
Want AI that truly understands your country's culture? Public institutions are sitting on the next AI revolution - and here's the practical guide to unlock it.

I've had fascinating conversations recently about sovereign AI, with people trying to solve this recurring question: "How do we build AI that truly understands our culture?"

This guide by @evijit and @yjernite brings lots of insights about this question. It's not just about throwing data at models. It's about partnering cultural expertise with tech infrastructure in ways we're just starting to figure out.

An example? The National Library of Norway already has 150+ AI models on Hugging Face. They're not just digitizing books - they're building AI that thinks in Norwegian, understands Norwegian values, and serves Norwegian citizens.

This is sovereign AI in practice: technology that understands your culture, values, and languages.

Especially loved the practical examples on how to do this:
- Real examples from museums, libraries, and government agencies
- How to convert complex documents (PDFs, PowerPoints) into ML-ready formats
- Code templates for processing public data
- Technical recipes for sharing datasets on open platforms

The stakes? Citizens' ability to leverage their collective digital intelligence.

The technology is ready. The infrastructure exists. The guide shows exactly how to use it. What's needed is your cultural expertise to shape these tools.

Check it out: https://huggingface.co/blog/evijit/public-org-data-ai

P.s.: Building cool projects in a public institution? Share them in the comments for others to learn from!
posted an update about 1 month ago
Do chatbots lie about Céline Dion? We now have answers, not speculation.

Ai2 just released OLMoTrace and it's a game-changer for transparency. You can literally see where an AI's responses come from in its training data - in real time.

The demo shows results about Céline. So I tried it out myself! Watch what happens in the video.

For journalists, researchers studying hallucinations and anyone who needs to trust their AI, this is like getting X-ray vision into AI systems. When the model made claims, I could instantly verify them against original sources. When it hallucinated, I could see why.

You can finally 1) understand how LLMs actually work and 2) verify if what they're saying is true. No more blind trust.

This pushes the open data movement to the next level.

👉 Blog post: https://allenai.org/blog/olmotrace
👉 Paper: https://www.datocms-assets.com/64837/1743890415-olmotrace.pdf

P.S.: A word of caution: never use a chatbot as a knowledge base. It's not Google. Better use it with a connection to the internet.
posted an update about 1 month ago
🎨 Designers, meet OmniSVG! This new model helps you create professional vector graphics from text/images, generate editable SVGs from icons to detailed characters, convert rasters to vectors, maintain style consistency with references, and integrate into your workflow.

@OmniSVG
posted an update about 1 month ago
I read the 456-page AI Index report so you don't have to (kidding). The wild part? While AI gets ridiculously more accessible, the power gap is actually widening:

1️⃣ The democratization of AI capabilities is accelerating rapidly:
- The gap between open and closed models is basically closed: the difference on benchmarks like MMLU and HumanEval shrank to just 1.7% in 2024
- The cost to run GPT-3.5-level performance dropped 280x in 2 years
- Model size is shrinking while maintaining performance - Phi-3-mini hitting 60%+ MMLU at a fraction of the parameters of early models like PaLM

2️⃣ But we're seeing concerning divides deepening:
- Geographic: US private investment ($109B) dwarfs everyone else - 12x China's $9.3B
- Research concentration: US and China dominate highly-cited papers (50 and 34 respectively in 2023), while next closest is only 7
- Gender: Major gaps in AI skill penetration rates - US shows 2.39 vs 1.71 male/female ratio

The tech is getting more accessible but the benefits aren't being distributed evenly. Worth thinking about as these tools become more central to the economy.

Give it a read - fascinating portrait of where AI is heading! https://hai-production.s3.amazonaws.com/files/hai_ai_index_report_2025.pdf
posted an update about 1 month ago
See that purple banner on the Llama 4 models? It's Xet storage, and this is actually huge for anyone building with AI models. Let's geek out a little bit 🤓

Current problem: AI models are massive files using Git LFS. But with models getting bigger and downloads exploding, we needed something better.
Xet lets you version large files like code, with compression and deduplication, all Git-compatible. That means less bandwidth, faster sharing, and smoother collaboration.

Real numbers: ~25% deduplication on Llama 4 models, hitting ~40% for finetunes.

Scale matters here - the Hub served 2B model downloads in 30 days, Llama models alone at 60M. The upcoming Llama 4 Behemoth has 2T parameters! Xet's chunk-based system was built exactly for this.
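To make the chunk-level deduplication idea concrete, here is a toy sketch. It is not Xet's implementation: it cuts files into fixed-size chunks for simplicity, whereas production systems generally pick chunk boundaries from the content itself, and the function names and byte strings are invented for illustration.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and return a SHA-256 hash per chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def dedup_ratio(files: list[bytes], chunk_size: int = 4) -> float:
    """Fraction of chunks that are duplicates of a chunk already stored."""
    seen: set[str] = set()
    total = 0
    for data in files:
        for digest in chunk_hashes(data, chunk_size):
            total += 1
            seen.add(digest)  # identical chunks collapse to one stored copy
    if total == 0:
        return 0.0
    return 1 - len(seen) / total


# Toy "base model" and "finetune" that differ only in their last chunk.
base = b"AAAABBBBCCCCDDDD"
finetune = b"AAAABBBBCCCCEEEE"
print(dedup_ratio([base, finetune]))  # 0.375: 3 of the 8 chunks are duplicates
```

The same reasoning suggests why finetunes would deduplicate better than unrelated models: most of their chunks already exist in the base model's storage.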

This is the kind of engineering that makes the next wave of large models actually usable. Kudos to the team! 🧨

Check out the models collection: meta-llama/llama-4-67f0c30d9fe03840bc9d0164
reacted to jeffboudier's post with 🚀 about 1 month ago
Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co
reacted to jsulz's post with 🔥 about 1 month ago
Huge week for xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.

Using Xet on Hugging Face is the fastest way to download and iterate on open source models and we've proved it with Llama 4 giving a boost of ~25% across all models.

We expect builders on the Hub to see even more improvements, helping power innovation across the community.

With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.

Thanks to the meta-llama team for launching on Xet!
posted an update about 1 month ago
"Am I going to be replaced by AI?" - Crucial question, but maybe we're asking the wrong one.

📈 There's a statistic from my reads this week that stays with me: Tomer Cohen, LinkedIn's CPO, tells Jeremy Kahn that 70% of skills used in most jobs will change by 2030. Not jobs disappearing, but transforming. And he calls out bad leadership: "If in one year's time, you are disappointed that your workforce is not 'AI native,' it is your fault."

🔄 Apparently, the Great Recalibration has begun. We're now heading into an era where AI is fundamentally redefining the nature of work itself, by forcing a complete reassessment of human value in the workplace, according to a piece in Fast Company. But it might be driven more by "the need for humans to change the way they work" than AI.

⚡ The Washington Post draws a crucial parallel: We're facing an "AI shock" similar to manufacturing's "China shock" - but hitting knowledge workers. Especially entry-level, white-collar work could get automated. The key difference? "Winning the AI tech competition with other countries won't be enough. It's equally vital to win the battle to re-skill workers."

Digging into these big questions in this week's AI in the News: https://fdaudens.substack.com/publish/posts/detail/160596301

Also, I'm curious: how are you keeping up with this pace of change? What strategies are working for you?
posted an update about 1 month ago
Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generate both single-hop and multi-hop questions
🔍 Evaluate top models and deploy leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo
posted an update about 1 month ago
🔥 DeepSeek vibe coding with DeepSite is going viral with awesome projects!

From games to stunning visualizations, 7 wild examples:

📺 AI TV with custom channels and animations https://x.com/_akhaliq/status/1905747381951545647

🚀 Earth to Moon spacecraft journey visualization
Watch this incredible Three.js space simulation with zero external assets:
https://x.com/_akhaliq/status/1905836902533451999

💣 Minesweeper in 2.5 minutes! Built & deployed instantly on DeepSite. Zero setup needed:
https://x.com/cholf5/status/1906031928937218334

🎮 Asked for Game of Life, got a masterpiece. Simple prompt, complex features. See it in action: https://x.com/pbeyssac/status/1906304454824992844

💫 One-shot anime website with perfect UI. DeepSite turned a simple request into a fully-functional anime site: https://x.com/risphereeditor/status/1905961725028913264

📊 10-minute World Indicators Dashboard. Just described what I wanted and got a full interactive dashboard! https://x.com/i/status/1906345214089785634

✨ Ready to build without coding? Imagine it. Build it. Share it! enzostvs/deepsite