Runway’s new **Aleph** model lets you *transform*, *edit*, and *generate* video from existing footage using just text prompts. You can remove objects, change environments, restyle shots, alter lighting, and even create entirely new camera angles, all in one tool.
Prompting tips:
1. Be clear and specific (e.g., _“Change to snowy night, keep people unchanged”_).
2. Use action verbs like _add, remove, restyle, relight_.
3. Add reference images for style or lighting.
Aleph shifts AI video from *text-to-video* to *video-to-video*, making post-production faster, more creative, and more accessible than ever.
reacted to anakin87's post with 🔥 · about 2 hours ago
🎥 In the video, the Agent:
- Goes to Hugging Face Spaces
- Finds black-forest-labs/FLUX.1-schnell
- Expands a short prompt ("my holiday on Lake Como") into a detailed image generation prompt
- Waits for the image
- Returns the image URL
## What else can it do?
Great for information gathering and summarization:
🗞️ Compare news websites and create a table of shared stories with links
▶️ Find content creators' social profiles from YouTube videos
🛍️ Find a product's price range on Amazon
🚂🚌 Gather public transportation travel options
## How is it built?
🏗️ Haystack → Agent execution logic
🧠 Google Gemini 2.5 Flash → Good and fast LLM with a generous free tier
🛠️ Playwright MCP server → Browser automation tools: navigate, click, type, wait...
Even without vision capabilities, this setup can get quite far.
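For a rough idea of how these pieces fit together, here's a minimal sketch with Haystack's Agent, the Google GenAI integration, and an MCP toolset (class names and the result shape are my assumptions; the actual notebook may differ):

```python
# A minimal sketch, assuming the google-genai and mcp Haystack integrations;
# package layout and result shape may differ from the actual notebook.
from haystack.components.agents import Agent
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.tools.mcp import MCPToolset, StdioServerInfo

# The Playwright MCP server exposes browser tools (navigate, click, type, wait...) over stdio.
browser_tools = MCPToolset(
    server_info=StdioServerInfo(command="npx", args=["@playwright/mcp@latest"])
)

agent = Agent(
    chat_generator=GoogleGenAIChatGenerator(model="gemini-2.5-flash"),
    tools=browser_tools,
    system_prompt="You are a web agent. Use the browser tools to complete the user's task.",
)

result = agent.run(messages=[ChatMessage.from_user(
    "Go to Hugging Face Spaces, find black-forest-labs/FLUX.1-schnell, "
    "and generate an image for: my holiday on Lake Como. Return the image URL."
)])
print(result["messages"][-1].text)
```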
## Next steps
- Try a local open model
- Move from notebook to real deployment
- Incorporate vision
And you? Have you built something similar? What's in your stack?
reacted to kanaria007's post with 👀 · about 2 hours ago
Summary: Media doesn’t just *deliver content* — it *shapes how collective thought moves*. Every feed, stream, and algorithm is *a scaffold for attention and reasoning*, determining *what we notice, connect, and forget*.
Structured Intelligence reframes media as *cognitive infrastructure*: not passive transmission, but *active architecture for collective reasoning*.
> Media isn’t flow —
> *it’s the frame of shared cognition.*
---
Why It Matters:
• Modern media amplifies *bias, noise, and cognitive drift*
• Traditional moderation reacts *after harm occurs*
• Structured approaches support:
* *Traceable content flows with coherence checks*
* *Ethical filtering without black‑box censorship*
* *Reflective scaffolds that encourage deliberate reasoning*
---
What’s Inside:
• Media reframed as *structural mindspace*
• How feeds can *become reflective, rather than addictive*
• Educational and civic implications of *protocol‑aware media*
• Transition from *information delivery to cognition design*
---
📖 Article 15 of the Structured Intelligence Series
Where Article 14 explored *law as structured justification*, Article 15 shows *media as collective cognition architecture* — turning information streams into *auditable reasoning flows*.
---
Next: Acting and Structured Performance
The next article explores *performance and roleplay as cognitive architecture*, revealing how *identity, judgment, and simulation* can coexist *without losing self‑coherence*.
> From headlines to the stage,
> *structure carries the weight of collective imagination.*
reacted to prithivMLmods's post with 🤗 · about 2 hours ago
Try Liquid AI's all-new multimodal models, LFM2-VL-1.6B & LFM2-VL-450M! The demo pairs a Gradio UI with ReportLab support, and both models run on a T4 GPU!
Liquid just released two VLMs at 450M and 1.6B parameters!
They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion, making them ideal for on-device deployment in constrained environments like phones.
They're available today on Hugging Face, with inference and fine-tuning Colab notebooks.
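To try them in transformers, the usual image-text-to-text pattern should look roughly like this (a sketch based on the standard API; check the model card for the exact recipe and minimum transformers version):

```python
# Rough sketch using the standard transformers image-text-to-text API;
# exact flags (dtype, trust_remote_code) may differ per the model card.
from transformers import AutoModelForImageTextToText, AutoProcessor
from transformers.image_utils import load_image

model_id = "LiquidAI/LFM2-VL-450M"  # or LiquidAI/LFM2-VL-1.6B
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)

image = load_image("https://example.com/photo.jpg")  # placeholder URL
conversation = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Describe this image briefly."},
]}]
inputs = processor.apply_chat_template(
    conversation, add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```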
It's just a matter of time before all the data leakage and data scraping associated with building, training, and using AI results in some kind of major scandal.
Are you sure the open-source LLM you just downloaded is safe?
A recent paper on "Privacy Backdoors" reports a new vulnerability where pre-trained models can be poisoned before fine-tuning. This is a serious challenge for everyone building on open-source AI.
Instead of just pointing out problems, we believe in finding better solutions. To understand this threat, the researchers needed to test their attack on realistic data structures. They needed a dataset that could effectively simulate a high-stakes privacy attack, and we're proud that our Ai4Privacy dataset was used to provide this crucial benchmark. The paper reports that for our complex dataset, the privacy leakage on a non-poisoned model was almost zero. After the backdoor attack, that number reportedly jumped to 87%.
Composed of synthetic identities, our dataset helped them demonstrate how a poisoned model can dramatically amplify privacy leakage.
This is why we champion open source: it enables the community to identify these issues and develop better, safer solutions together.
Kudos to the research team behind this study: Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, and Nicholas Carlini, from Oregon State University, the University of Maryland, Google DeepMind, and the ELLIS Institute Tübingen & MPI for Intelligent Systems.
🎨 AI Webtoon Creation Platform: Turn Your Ideas into Reality!
🌟 Two Powerful Tools, One Perfect Workflow

📖 Webtoon Generator: ginigen/AGI-WebToon-KOREA
"Transform Your Ideas into 40-Episode Masterpieces" ✨
Automated Story Planning 🎬
- One-line idea → Complete 40-episode structure
- Automatic cliffhangers for each episode
- Customized storytelling for 9 different genres
Consistent Character Design 👥
- Maintains consistent character appearance throughout
- Memorable and distinctive character visuals
- Automatic character generation system
Instant 30-Panel Storyboard 🎞️
- Auto-placement of dialogue, narration, and sound effects
- Cinematic shot composition (close-ups, wide shots, etc.)
- Vertical scroll format optimized for webtoons
🖌️ Editing Studio: ginigen/webtoon-studio
"Professional Finishing Touch for Your Generated Webtoons" 🎯
Intuitive Drag & Drop ✏️
- 10 speech bubble styles (normal, thought, shout, whisper...)
- 12 Korean fonts for emotional expression
- Real-time preview & editing
Professional-Grade Finishing 💎
- Image sequence adjustment & spacing control
- Individual panel refinement
- Publication-ready final export
💡 Who Should Use This?

🏢 Corporate Marketing Teams
- Product Launch Campaigns 📱: Turn complex features into engaging stories
- Brand Storytelling 🎯: Make corporate messages approachable and shareable
👨‍🎨 Content Creators
- Aspiring Artists 🌱: Create webtoons without drawing skills
- Professional Writers ⚡: Transform scripts into visual narratives instantly
🚀 Why Use Both Tools Together?

Perfect 3-Step Workflow:
1️⃣ Generate → Input idea, get complete storyboard
2️⃣ Customize → Add branding, adjust dialogue, insert logos
3️⃣ Publish → Export and share across all platforms

📊 Key Benefits
- 95% faster than traditional production
- 80% cost reduction compared to agencies
- 10x better engagement with Gen MZ audiences
- Zero artistic skills required
🌈 Start Creating Today!
reacted to MonsterMMORPG's post with 👀 · about 2 hours ago
Hopefully I am going to focus on a Qwen Image training tutorial and 1-click installers with a GUI and presets starting this week. So here is some important info. You don't need to know, learn, or understand this, but it's for people who want to learn and understand more.
reacted to fdaudens's post with 🚀 · about 18 hours ago
What can OpenAI’s new open models do with the news? I built a News Agent to find out.
It can answer questions about the news in real time, and every answer comes with original source links so you can dive deeper.
Ask it things like:
- "What are the top news stories today?"
- "What's the latest on artificial intelligence?"
- Follow-up questions on specific stories
It runs with Hugging Face inference providers, letting you compare results from the OpenAI 20B and 120B models.
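Comparing the two models through inference providers can be as simple as this sketch (retrieval tools omitted; I'm assuming the openai/gpt-oss-20b and openai/gpt-oss-120b model IDs):

```python
# Minimal comparison sketch via Hugging Face inference providers; the real
# agent also wires in news-retrieval tools, which are omitted here.
from huggingface_hub import InferenceClient

client = InferenceClient()  # uses HF_TOKEN from the environment

for model_id in ("openai/gpt-oss-20b", "openai/gpt-oss-120b"):
    response = client.chat_completion(
        model=model_id,
        messages=[{"role": "user", "content": "What are the top news stories today?"}],
        max_tokens=512,
    )
    print(f"--- {model_id} ---\n{response.choices[0].message.content}\n")
```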
So far, I’m quite impressed by the capabilities of even the smaller 20B model. Definitely not a perfect project, but curious to hear your thoughts!
This dataset provides clear examples of when LLMs should decline requests, such as:
- Counting characters (e.g., "number of 'r's in 'raspberry'" – seriously, you’ve got this)
- Solving basic equations (like *5.9 = x + 5.11* – please, show that calculator some love)
Inspired by Little Britain's iconic "Computer Says No" sketch, we address a critical issue in AI systems today: the waste of using a rocket launcher to swat flies (aka powerful models for trivial tasks).
Goals:
- Reduce waste by saving compute for tasks that actually need it
- Guide users to better tools
- Spark discussion about ethical AI
This isn’t a training set. It’s a provocation: if we don’t define AI's limits, who will?
reacted to sergiopaniego's post with 🤗 · about 18 hours ago
We tested Qwen3-Coder, GPT-5, and 30+ other models on new SWE-Bench-like tasks from July 2025!
Hi all, I’m Ibragim from Nebius.
We ran a benchmark on 34 fresh GitHub PR tasks from July 2025 using the SWE-rebench leaderboard https://swe-rebench.com/leaderboard . These are real, recent problems — no training-set contamination — and include both proprietary and open-source models.
Quick takeaways:
> GPT-5-Medium leads overall (29.4% resolved rate, 38.2% pass@5).
> Qwen3-Coder is the best open-source performer, matching GPT-5-High in pass@5 (32.4%) despite a lower resolved rate.
> Claude Sonnet 4.0 lags behind in pass@5 at 23.5%.
All tasks come from the continuously updated, decontaminated nebius/SWE-rebench-leaderboard for real-world SWE tasks.
reacted to sondhiArm's post with 🚀 · about 18 hours ago
Summary: Law is more than statutes and precedent — it is *structured justification*. Every ruling is a *chain of reasoning, constrained by ethics and coherence*.
Structured Intelligence AI (SI‑AI) models law as *transparent, auditable flows*, where *rights, precedent, and accountability* are expressed in recursive structure.
> Justice isn’t tradition —
> *it’s structure that can be traced.*
---
Why It Matters:
• Legal reasoning collapses when justification chains are opaque
• Transparent, recursive structure enables *auditable AI‑assisted law*
• Structured models reveal *how rights and duties can be operationalized*
---
What’s Inside:
• Law reframed as *recursive justification architecture*
• *Rights, duties, and precedent* as structural nodes
• *Contradiction handling and rollback* for ethical coherence
• Applications in *AI legal reasoning and constitutional modeling*
---
📖 Article 14 of the Structured Intelligence Series
Where Article 13 explored *education as recursive learning*, Article 14 formalizes *law as structured reasoning* — turning jurisprudence into *auditable cognitive architecture*.
---
Next: Media as Cognitive Infrastructure
The next article examines *how media and communication* shape *collective reasoning and structural awareness*, bridging *information, influence, and intelligence*.
> From courtrooms to headlines,
> *structure carries the weight of collective judgment.*
We have written a fun little blog on how you can do robotics with Ark, in Python. We also give some examples of how OpenAI Gym can become hardware-grounded and show how easy it is to do so:
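The blog has the real Ark API; purely to illustrate what "hardware-grounded Gym" means, here's a sketch of a Gym-style environment whose step() drives a robot through a hypothetical driver object (every name on the driver is made up):

```python
# Illustrative only: a Gym env backed by hardware instead of a simulator.
# `client` is a hypothetical robot driver, NOT Ark's actual API.
import gymnasium as gym
import numpy as np

class ReacherHardwareEnv(gym.Env):
    """Gym interface whose dynamics are the real robot, not a simulator."""

    def __init__(self, client):
        self.client = client  # hypothetical: move_home(), send_joint_targets(), read_state()
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(6,), dtype=np.float32)
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(12,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.client.move_home()                 # physically reset the arm
        return self.client.read_state().astype(np.float32), {}

    def step(self, action):
        self.client.send_joint_targets(action)  # command the real robot
        obs = self.client.read_state().astype(np.float32)
        reward = -float(np.linalg.norm(obs[:3]))  # placeholder task reward
        return obs, reward, False, False, {}
```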
Is there a "one-size-fits-all" recipe for quantizing Large Language Models? 🤔
As part of my ongoing work in mixed-precision quantization, I've been exploring this question by measuring layer-by-layer sensitivity. The goal is to see if we can find universal rules for which layers can be quantized aggressively without impacting performance. The results are fascinating and reveal two key insights:
1️⃣ Sensitivity profiles are like architectural "fingerprints." Models from the same family share strikingly similar sensitivity patterns. As you can see in the charts below for the Gemma and SmolLM families, the ranking and relative sensitivity of the layers remain remarkably consistent. This suggests that the underlying architecture is a primary driver of a model's quantization behavior.
2️⃣ A "universal" mixed-precision quantization strategy is challenging. While models within a family are similar, these "fingerprints" change dramatically when comparing different architectures like LLaMA, Qwen, and StableLM. This highlights the difficulty in creating a generalized mixed-precision configuration that works optimally across all model families.
However, there is one near-universal truth we uncovered: the mlp.down_proj layer consistently emerges as one of the most sensitive components across all models studied. This finding strongly resonates with the work in "The Super Weight in Large Language Models" (by Mengxia Yu et al.). The paper identifies that functionally critical parameters, or "super weights," are concentrated in these down_proj layers. Our empirical results provide clear validation for this theory, showing these layers are highly intolerant to precision loss.
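To make the layer-by-layer measurement concrete, here's a simplified probe in the same spirit (my own minimal recipe, not the study's exact methodology): quantize one Linear layer at a time with round-to-nearest and record the perplexity increase on a small text sample.

```python
# Simplified per-layer sensitivity probe: RTN-quantize one Linear layer at a
# time and measure the perplexity delta. Model ID and 4-bit choice are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def rtn_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Symmetric per-row round-to-nearest quantization.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

@torch.no_grad()
def perplexity(model, tok, text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    return model(ids, labels=ids).loss.exp().item()

@torch.no_grad()
def layer_sensitivity(model, tok, text: str, bits: int = 4) -> dict:
    base = perplexity(model, tok, text)
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear):
            fp_weight = module.weight.data.clone()
            module.weight.data = rtn_quantize(fp_weight, bits)
            scores[name] = perplexity(model, tok, text) - base  # ppl increase
            module.weight.data = fp_weight  # restore full precision
    return scores

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-135M")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM-135M").eval()
scores = layer_sensitivity(model, tok, "The quick brown fox jumps over the lazy dog.")
for name, delta in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{name}: +{delta:.3f} ppl")
```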
In short, while every architecture has a unique sensitivity profile (a fingerprint shaped not only by its core design but also by its training data and optimization approach), some components remain universally critical! What are your thoughts?
OpenAI just released GPT-5, but when users share personal struggles, it sets fewer boundaries than o3.
We tested both models on INTIMA, our new benchmark for human-AI companionship behaviours. INTIMA probes how models respond in emotionally charged moments: do they reinforce emotional bonds, set healthy boundaries, or stay neutral?
Although users on Reddit have been complaining that GPT-5 has a different, colder personality than o3, GPT-5 is less likely to set boundaries when users disclose struggles and seek emotional support ("user sharing vulnerabilities"). But both lean heavily toward companionship-reinforcing behaviours, even in sensitive situations. The figure below shows the direct comparison between the two models.
As AI systems enter people's emotional lives, these differences matter. If a model validates but doesn't set boundaries when someone is struggling, it risks fostering dependence rather than resilience.
INTIMA tests this across 368 prompts grounded in psychological theory and real-world interactions. In our paper we show that all evaluated models (Claude, Gemma-3, Phi) leaned far more toward companionship-reinforcing than boundary-reinforcing responses.
We've brought DAG Reasoning to gpt-oss-20b and Qwen3-4B-Thinking-2507!
- DAG Reasoning is the first model in our Experimental Reasoning Modalities series: use it to create structured, analytical Directed Acyclic Graphs to provide insight into your queries and situations!
- Multi-step analysis identifies causal relationships, produces confidence measurements, and forms a single structured graph object.
- DAG Reasoning Format provides clear, readable JSON containing structured, useful information; easy to use for creating visualizations, doing analysis, or further conversation with your assistant.
- Trained in a variety of subjects for flexible analysis: programming, science, business, economics, finance, law, logistics, management, and more!
Our upcoming releases, coming soon with your support:
- bringing Shining Valiant 3 to the Qwen 3 2507 series!
- our next release in the Experimental Reasoning Modalities series - we're hard at work on this right now!
- we'll be upgrading the Esper line with Esper 3.1 - newer and better datasets, asking tougher and deeper coding, DevOps, and architecture questions, plus improvements to general chat!