fffiloni 
posted an update 1 day ago
I made a Hugging Face Space for SCAIL-2 🤗

Reference character + driving motion → animated result.

A simple demo to explore the paper’s core workflow with curated examples.

👉 fffiloni/SCAIL-2
fffiloni 
posted an update 3 days ago
⏱ Built a small Space for Visual Chronometer / Pulse of Motion.

Upload a video and estimate its Physical FPS: the frame rate implied by visual motion, independent of metadata.
Useful to inspect “chronometric hallucination” in generated videos: clips that look smooth, but move with the wrong physical time scale.
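
As a rough illustration of what such a mismatch means, here is a minimal sketch that compares a container's declared frame rate with an estimated Physical FPS. The function names, the tolerance, and the reading of the ratio as an apparent playback speed are assumptions made for this example; they are not taken from the Space or the paper.

```python
def playback_speed(metadata_fps: float, physical_fps: float) -> float:
    """Apparent speed of motion when the clip is played at its declared rate.

    Assumption: physical_fps is how many frames cover one second of
    real-world motion, so playing them at metadata_fps scales time by
    metadata_fps / physical_fps.
    """
    if metadata_fps <= 0 or physical_fps <= 0:
        raise ValueError("frame rates must be positive")
    return metadata_fps / physical_fps


def describe(metadata_fps: float, physical_fps: float, tolerance: float = 0.1) -> str:
    """Label a clip as consistent, slow motion, or fast motion (hypothetical labels)."""
    speed = playback_speed(metadata_fps, physical_fps)
    if abs(speed - 1.0) <= tolerance:
        return "consistent"
    return "fast motion" if speed > 1.0 else "slow motion"


# A clip tagged 24 fps whose motion looks like it was sampled at 48 fps
# would play at half of real-world speed under this reading.
label = describe(24.0, 48.0)
```

Under these assumptions, a label other than "consistent" flags a clip worth a closer look.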

Try it here: fffiloni/Pulse-of-Motion
fffiloni 
posted an update 8 days ago
A few weeks ago, @victor opened the door: coding agents can now ship Hugging Face Spaces autonomously.

I pulled on that thread.

As someone who builds and ships Gradio demos regularly, I didn’t just want to reproduce the loop. I wanted to see what happens when that loop is plugged into the whole Hugging Face stack.

The interesting part is not only that an agent can ship a Space.

It’s what happens when Space generation becomes a first-class Hugging Face workflow.

That became Agentic Space Factory.

More soon. 🤗
fffiloni 
posted an update about 2 months ago
I built HF Radio on Hugging Face Spaces 📻
fffiloni/HF-Radio

A live community radio for AI-generated songs, powered by tracks created with ACE-Step.

You can tune in, discover community-made songs in many languages, vote on what sounds good, and mark your real favorites as Bangers.

The more people listen, vote, and create, the better the station gets.

Under the hood, it connects a few Hugging Face pieces together:

Spaces for the live app, HF buckets for community tracks, OAuth for signed-in listeners, server-side streaming with ffmpeg, hourly playlist refreshes, moderation, jingles, and community feedback loops.

It’s not just a playlist.

It’s a shared taste experiment:
new songs get a shot every hour, and the community helps decide what deserves another spin.

Come listen.
Find weird gems.
Support the Bangers.
Shape the radio.

—> fffiloni/HF-Radio
fffiloni 
posted an update about 2 months ago
Great technical guide by Nico Martin on the Hugging Face blog, showing how to use Transformers.js inside a Chrome extension and run ONNX models from the Hub locally with WebGPU inside a Manifest V3 extension.

The interesting part: this is not just a chatbot in a side panel.

The article walks through the architecture behind a browser agent that can read open tabs, query webpages, search history, and highlight elements directly on the page — with models downloaded from the Hugging Face Hub, cached under the extension origin, and executed locally instead of being called through a remote API for every prompt.

A strong blueprint for building local-first web copilots, reading assistants, and AI-powered browsing workflows.

Article: https://huggingface.co/blog/transformersjs-chrome-extension
fffiloni 
posted an update about 2 months ago
I’ve been reading “What if AI systems weren’t chatbots?”
What if AI systems weren't chatbots? (2605.07896) 👀

The paper asks a simple but important question: what if the chatbot interface is not just a neutral wrapper around AI models, but part of the problem?

A chatbot can make a system feel more capable, more certain, and more “human” than it really is. That matters, because interfaces shape how we trust, use, and delegate to AI systems.

When everything becomes: ask → answer
we can lose sight of the actual workflow:
- parameters
- alternatives
- uncertainty
- intermediate steps
- failure modes
- human control

For creative AI especially — image, video, editing, animation — I’m not sure “chat” should always be the default interface.

Sometimes we need a conversation.
But often we need a canvas, a timeline, sliders, masks, previews, comparisons, and visible pipelines.

This is also why I find many open ML demos interesting: Spaces, Gradio apps, visual tools, small focused interfaces.

They often explore another direction: not just better assistants, but better tools. 🤗
fffiloni 
posted an update 2 months ago
🚀 RB-Modulation is back on Hugging Face Spaces!

This is an older project that recently broke due to dependency changes, but it’s now fixed and running again ✅

👉 What’s fixed:
- GroundingDINO & LangSAM installation
- compatibility with recent environments
- GPU inference running smoothly again

👉 Try it here:
fffiloni/RB-Modulation

Feel free to give it a try again — feedback welcome!
fffiloni 
posted an update 3 months ago
✹ PASD Magnify is back on Hugging Face Spaces

fffiloni/PASD

PASD isn’t recent, but still delivers strong results — worth restoring rather than replacing.

Getting it to run again wasn’t a simple dependency issue.
It relied on parts of diffusers that no longer exist, while moving to Gradio 6 forced a much newer HF stack — and I couldn’t modify the original source directly.

Recreating the old environment wasn’t practical.
So I patched the downloaded code at runtime before import and made it compatible with today’s stack.

That ended up being the only approach that held without forking or freezing everything to outdated versions.
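
A minimal sketch of that kind of runtime patching, assuming plain string replacement on a source file before its first import. The module name `legacy_model`, the helper `patch_source`, and the replaced lines are hypothetical stand-ins for illustration; they are not the actual PASD or diffusers code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def patch_source(path: Path, replacements: dict[str, str]) -> int:
    """Rewrite a source file in place; return how many replacements matched."""
    text = path.read_text(encoding="utf-8")
    applied = 0
    for old, new in replacements.items():
        if old in text:
            text = text.replace(old, new)
            applied += 1
    path.write_text(text, encoding="utf-8")
    return applied


# Stand-in for code downloaded at startup: it imports a name that no longer exists.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "legacy_model.py"
module_path.write_text(
    "from math import removed_helper\n"
    "def area(r):\n"
    "    return removed_helper(r)\n",
    encoding="utf-8",
)

# Patch before the first import so Python only ever loads the fixed source.
applied = patch_source(
    module_path,
    {
        "from math import removed_helper": "from math import pi",
        "removed_helper(r)": "pi * r * r",
    },
)

# Make the patched file importable and load it.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
legacy_model = importlib.import_module("legacy_model")
result = legacy_model.area(2.0)
```

Checking the count returned by `patch_source` is a cheap way to notice when the upstream code changes and a replacement silently stops matching.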

If you’ve used it before (or are curious), feel free to give it another try.
fffiloni 
posted an update 3 months ago
✅ Back up and running!

My TIGER app is now fully working again, with fixes and full compatibility with Gradio 6 🚀

It lets you:
- đŸŽ™ïž Separate multiple speakers from an audio file
- 🎬 Extract each speaker directly from a video
- 🎧 Split audio into dialog, music, and sound effects (DnR)
- 🎥 Apply DnR separation directly on videos

All powered by lightweight TIGER models for fast and efficient speech separation.

Try it here 👉 fffiloni/TIGER-audio-extraction
fffiloni 
posted an update 3 months ago
AniDoc is back 🎉

I’ve fixed the Space and brought it back to life:
- ✅ Working again after being broken for a while
- ✅ Updated to Gradio 6
- ✅ Compatible with ZeroGPU
- ✅ Output videos now preserve original resolution and FPS

I also added advanced controls so you can experiment more (tracking, seed, motion, sketch).

Try it here: fffiloni/AniDoc
fffiloni 
posted an update 4 months ago
I brought DALL·E mini back to life 🤖🎨

You can try it here:
fffiloni/dalle-mini-reboot

And I also built a batch version using Hugging Face Jobs (up to 50 images per prompt):
fffiloni/dalle-mini-via-jobs

The goal was to stay close to the original JAX/Flax pipeline, while integrating it with modern tooling (Gradio + Jobs).

It ended up being a fun way to revisit this model — still weird, still fun 😄
fffiloni 
posted an update 4 months ago
A clearer demo for TADA (now multilingual) 🔊🌍

I improved the public demo for TADA — a generative framework for speech modeling via text–acoustic dual alignment.

TADA models speech as a joint sequence of text tokens and acoustic tokens, using a transformer backbone to keep text and audio synchronized during generation.

The original demo already exposed these mechanisms, but the workflow made the pipeline hard to understand.

This updated demo makes the process clearer:

• load the model
• prepare a reference voice (optionally with transcript or Whisper auto-transcription)
• generate speech conditioned on that reference

It also adds multilingual support.

Presets are included for a few languages, but the model supports more:

English, French, Spanish, German, Arabic, Mandarin Chinese, Italian, Japanese, Polish, Portuguese

Feel free to try different voices, accents, or languages and see how the alignment behaves.

👉 fffiloni/tada-dual-alignment-tts-demo

Paper
TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment (2602.23068)
fffiloni 
posted an update over 1 year ago
An explain-like-I'm-5 version of the latest take from @thomwolf on X about Dario's essay on DeepSeek:

→ Open-source AI is like a big cookbook that everyone can read and improve. Instead of a few chefs keeping their recipes secret, anyone can cook, test, and invent new things.

If only one company controls AI, everything stops if they have a problem—like when the internet goes down. With open-source, many people can help, making sure it keeps running smoothly.

AI isn’t just a race between two countries; it’s a team effort around the world. By sharing, we move faster and create safer technology for everyone.
—
🤗
fffiloni 
posted an update over 1 year ago
fffiloni 
posted an update almost 2 years ago
Visionary Walter Murch (editor for Francis Ford Coppola), in 1999:

“So let's suppose a technical apotheosis some time in the middle of the 21st century, when it somehow becomes possible for one person to make an entire feature film, with virtual actors. Would this be a good thing?

If the history of oil painting is any guide, the broadest answer would be yes, with the obvious caution to keep a wary eye on the destabilizing effect of following too intently a hermetically personal vision. One need only look at the unraveling of painting or classical music in the 20th century to see the risks.

Let's go even further, and force the issue to its ultimate conclusion by supposing the diabolical invention of a black box that could directly convert a single person's thoughts into a viewable cinematic reality. You would attach a series of electrodes to various points on your skull and simply think the film into existence.

And since we are time-traveling, let us present this hypothetical invention as a Faustian bargain to the future filmmakers of the 21st century. If this box were offered by some mysterious cloaked figure in exchange for your eternal soul, would you take it?

The kind of filmmakers who would accept, even leap, at the offer are driven by the desire to see their own vision on screen in as pure a form as possible. They accept present levels of collaboration as the evil necessary to achieve this vision. Alfred Hitchcock, I imagine, would be one of them, judging from his description of the creative process: "The film is already made in my head before we start shooting."”
—
Read "A Digital Cinema of the Mind? Could Be" by Walter Murch: https://archive.nytimes.com/www.nytimes.com/library/film/050299future-film.html

fffiloni 
posted an update about 2 years ago
🇫🇷
What impact will AI have on the film, audiovisual, and video game industries?
A forward-looking study for professionals
CNC & BearingPoint | 09/04/2024

While Artificial Intelligence (AI) has long been used in the film, audiovisual, and video game sectors, new generative AI applications are upending our view of what a machine can do and hold unprecedented potential for transformation. They impress with the quality of their output and consequently spark much debate, between expectations and apprehensions.

The CNC has therefore decided to launch a new AI Observatory to better understand the uses of AI and its real impact on the image industry. As part of this Observatory, the CNC set out to draw up an initial assessment by mapping current and potential uses of AI at each stage of the creation and distribution of a work, identifying the associated opportunities and risks, particularly for professions and employment. The main findings of this CNC / BearingPoint study were presented on March 6, during the CNC day “Créer, produire, diffuser à l’heure de l’intelligence artificielle” (“Creating, producing, and distributing in the age of artificial intelligence”).

The CNC is publishing the expanded version of its map of AI uses in the film, audiovisual, and video game industries.

Link to the full map: https://www.cnc.fr/documents/36995/2097582/Cartographie+des+usages+IA_rapport+complet.pdf/96532829-747e-b85e-c74b-af313072cab7?t=1712309387891
fffiloni 
updated a Space about 2 years ago
fffiloni 
posted an update over 2 years ago
"The principle of explainability of AI and its application in organizations"
Louis Vuarin, Véronique Steyer
→ 📔 https://doi.org/10.3917/res.240.0179

ABSTRACT: The explainability of Artificial Intelligence (AI) is cited in the literature as a pillar of AI ethics, yet few studies explore its organizational reality. This study proposes to remedy this shortcoming, based on interviews with actors in charge of designing and implementing AI in 17 organizations. Our results highlight: the massive substitution of explainability by the emphasis on performance indicators; the substitution of the requirement of understanding by a requirement of accountability; and the ambiguous place of industry experts within design processes, where they are employed to validate the apparent coherence of ‘black-box’ algorithms rather than to open and understand them. In organizational practice, explainability thus appears sufficiently undefined to reconcile contradictory injunctions. Comparing prescriptions in the literature and practices in the field, we discuss the risk of crystallizing these organizational issues via the standardization of management tools used as part of (or instead of) AI explainability.

Vuarin, Louis, and Véronique Steyer. « Le principe d’explicabilité de l’IA et son application dans les organisations », Réseaux, vol. 240, no. 4, 2023, pp. 179-210.

#ArtificialIntelligence #AIEthics #Explainability #Accountability