
Victor Mustar PRO

victor

AI & ML interests

Building the UX of this website

Recent Activity

liked a model about 1 hour ago
lodestones/Chroma
liked a model about 3 hours ago
stepfun-ai/Step1X-3D
liked a Space about 3 hours ago
stepfun-ai/Step1X-3D

Organizations

Hugging Face, Google, Competitions, Safetensors, 21 RNN, Spaces-explorers, Text Generation Inference, CVPR Demo Track, Spaces Examples, Hugging Chat, Webhooks Explorers (BETA), lora concepts library, Scanned Tokens, Huggingface Projects, hf admins, Hugging Face OSS Metrics, Stable Diffusion Dreambooth Concepts Library, Core ML Projects, temp-org, Blog-explorers, Mustarz, Open LLM Leaderboard, Enterprise Explorers, The Collectionists, ZeroGPU Explorers, Hugging Face Tools, TstOrg141, Stable Video benchmark, Social Post Explorers, Dev Mode Explorers, LLHF, SLLHF, Self-serve FTW, Inference Explorers, hf-inference

victor's activity

reacted to RiverZ's post with 🤗 9 days ago
🚀 Excited to Share Our Latest Work: In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer~

🎨 Daily Paper: In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer (2504.20690)

🔓 Code is now open source!
🔥 Huggingface DEMO: RiverZ/ICEdit
🌐 Project Website: https://river-zhang.github.io/ICEdit-gh-pages/
🏠 GitHub Repository: https://github.com/River-Zhang/ICEdit/blob/main/scripts/gradio_demo.py
🤗 Huggingface: sanaka87/ICEdit-MoE-LoRA
📄 arXiv Paper: In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer (2504.20690)


🔥 Why it's cool:
- Achieves high-quality, multi-task image editing.
- Uses only 1% of the training parameters and 0.1% of the training data of existing methods, making it extremely efficient.
- Beats several commercial models on background preservation, ID control, and consistency.
- Open-source, low-cost, faster, and stronger: think of it as the "DeepSeek of image editing" 👀

We also implemented a Gradio demo app, available directly in our GitHub repo! And we made a flashy demo video; happy to send it your way!
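If you want to try the LoRA outside the demo app, something along these lines should work: a minimal sketch, assuming the weights target FLUX.1-Fill-dev via 🤗 Diffusers (the base model choice, the full-image mask, and the step count here are our assumptions, not an official snippet; the repo's gradio_demo.py linked above has the exact setup):

import torch
from PIL import Image
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

# Assumption: the ICEdit LoRA plugs into the FLUX.1-Fill-dev pipeline.
pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("sanaka87/ICEdit-MoE-LoRA")

image = load_image("input.png")  # placeholder path: the image you want to edit
mask = Image.new("L", image.size, 255)  # assumption: edit the whole image

result = pipe(
    prompt="make the sky look like a sunset",
    image=image,
    mask_image=mask,
    num_inference_steps=28,  # illustrative value
).images[0]
result.save("edited.png")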
reacted to prithivMLmods's post with 🔥 12 days ago
Dropping downstream-task models, post-trained with newly initialized parameters and weights for domain-specific image classification, based on the SigLIP2 models Patch-16/224, Patch-16/256, and Patch-32/256. For more details, please refer to the respective model cards: 🤗

+ watermark detection : prithivMLmods/Watermark-Detection-SigLIP2
+ resisc45 : prithivMLmods/RESISC45-SigLIP2
+ pacs dg : prithivMLmods/PACS-DG-SigLIP2
+ 3d printed or not : prithivMLmods/3D-Printed-Or-Not-SigLIP2
+ formula or text : prithivMLmods/Formula-Text-Detection

Categorizing Unsafe Content:
- explicit content patch16 256 : prithivMLmods/siglip2-x256-explicit-content
- explicit content patch32 256 : prithivMLmods/siglip2-x256p32-explicit-content

Collection:
> SigLIP2 Content Filters 042025 Final : https://huggingface.co/collections/prithivMLmods/siglip2-content-filters-04202-final-680fe4aa1a9d589bf2c915ff
> SigLIP2 : google/siglip2-67b5dcef38c175486e240107
> SigLIP2 Multilingual Vision-Language Encoders : https://arxiv.org/pdf/2502.14786
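Since these are standard image-classification checkpoints, a 🤗 Transformers pipeline one-liner should be enough to try one out (a minimal sketch; the label set comes from each model card, and "photo.jpg" is a placeholder):

from transformers import pipeline

# Sketch: any of the checkpoints above should drop into the
# image-classification pipeline; labels depend on the model card.
clf = pipeline("image-classification", model="prithivMLmods/Watermark-Detection-SigLIP2")
print(clf("photo.jpg"))  # list of {label, score} dicts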
reacted to MikeDoes's post with 🚀 12 days ago
PII-Masking-1M Final Day (7/7)! 🚀 Today, we unveil 5 NEW Enterprise PII (E-PII) Dataset PREVIEWS!

Standard PII tools often miss sensitive *business* data. That's why we built E-PII previews for the data that powers your operations and compliance needs.

Get a first look (representing 100,000 samples each!) into datasets designed for real-world enterprise security across these categories:

๐Ÿฅ **PHI Preview**: For Healthcare Data
๐Ÿ’ณ **PFI Preview:** For Financial Data
๐Ÿข **PWI Preview:** For Workplace Data
๐Ÿ’ป **PDI Preview:** For Digital Activity Data
๐Ÿ“ **PLI Preview:** For Location Data


That wraps up our 7-day #PIIMasking1M announcement series! HUGE thanks for following along and for your engagement.
Explore ALL our releases, including these E-PII previews, in the Ai4Privacy Hugging Face Collection & show some love ❤️ if you find them useful!
🔗 Visit the Collection: https://huggingface.co/ai4privacy

Let's keep building safer AI, together!
replied to onekq's post 12 days ago

It's trained to think by default, probably with the idea that you use /no_think selectively for messages in a conversation where you don't want it to think :) (/no_think is probably more a product feature than something meant to be used as the default.)
reacted to merterbak's post with 🔥 13 days ago
Qwen3 models released 🔥
It offers 2 MoE and 6 dense models with the following parameter sizes: 0.6B, 1.7B, 4B, 8B, 14B, 30B (MoE), 32B, and 235B (MoE).
Models: Qwen/qwen3-67dd247413f0e2e4f653967f
Blog: https://qwenlm.github.io/blog/qwen3/
Demo: Qwen/Qwen3-Demo
GitHub: https://github.com/QwenLM/Qwen3

✅ Pre-trained on 119 languages and dialects (36 trillion tokens) with strong translation and instruction-following abilities. (Qwen2.5 was pre-trained on 18 trillion tokens.)
✅ Qwen3 dense models match the performance of larger Qwen2.5 models: for example, Qwen3-1.7B/4B/8B/14B/32B perform like Qwen2.5-3B/7B/14B/32B/72B.
✅ Three-stage pretraining:
• Stage 1: general language learning and knowledge building
• Stage 2: reasoning boost with STEM, coding, and logic skills
• Stage 3: long-context training
✅ Supports MCP
✅ Strong agent skills
✅ Supports seamless switching between thinking mode (for hard tasks like math and coding) and non-thinking mode (for fast chatting) inside the chat template; see the sketch below.
✅ Better human alignment for creative writing, roleplay, multi-turn conversations, and following detailed instructions.
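A minimal sketch of that thinking/non-thinking switch, assuming the enable_thinking flag documented on the Qwen3 model cards (the model size and generation settings here are illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-4B"  # illustrative choice among the released sizes
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a haiku about MoE models."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # True for hard tasks (math/coding), False for fast chat
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))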
reacted to AdinaY's post with 🚀 13 days ago
DeepSeek, Alibaba, Skywork, Xiaomi, ByteDance...
And those are just some of the companies from the Chinese community that released open models in April 🤯

zh-ai-community/april-2025-open-releases-from-the-chinese-community-67ea699965f6e4c135cab10f

🎬 Video
> MAGI-1 by SandAI
> SkyReels-A2 & SkyReels-V2 by Skywork
> Wan2.1-FLF2V by Alibaba-Wan

🎨 Image
> HiDream-I1 by Vivago AI
> Kimi-VL by Moonshot AI
> InstantCharacter by InstantX & Tencent-Hunyuan
> Step1X-Edit by StepFun
> EasyControl by Shanghai Jiaotong University

🧠 Reasoning
> MiMo by Xiaomi
> Skywork-R1V 2.0 by Skywork
> ChatTS by ByteDance
> Kimina by Moonshot AI & Numina
> GLM-Z1 by Zhipu AI
> Skywork OR1 by Skywork
> Kimi-VL-Thinking by Moonshot AI

🔊 Audio
> Kimi-Audio by Moonshot AI
> IndexTTS by BiliBili
> MegaTTS3 by ByteDance
> Dolphin by DataOceanAI

🔢 Math
> DeepSeek Prover V2 by DeepSeek

๐ŸŒ LLM
> Qwen by Alibaba-Qwen
> InternVL3 by Shanghai AI lab
> Ernie4.5 (demo) by Baidu

📊 Dataset
> PHYBench by Eureka-Lab
> ChildMandarin & Seniortalk by BAAI

Please feel free to add if I missed anything!
reacted to Xenova's post with 🔥 13 days ago
Introducing the ONNX model explorer: Browse, search, and visualize neural networks directly in your browser. 🤯 A great tool for anyone studying Machine Learning! We're also releasing the entire dataset of graphs so you can use them in your own projects! 🤗

Check it out! 👇
Demo: onnx-community/model-explorer
Dataset: onnx-community/model-explorer
Source code: https://github.com/xenova/model-explorer
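Since the graphs ship as a regular dataset repo, pulling them into your own project should be a one-liner with 🤗 Datasets (a sketch; we haven't checked the repo's actual configs or column names, so inspect the result for the schema):

from datasets import load_dataset

# Sketch: load the released graph dataset and inspect its structure.
ds = load_dataset("onnx-community/model-explorer")
print(ds)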
reacted to abidlabs's post with 🤗🚀🔥 13 days ago
Hi folks! Excited to share a new feature from the Gradio team along with a tutorial.

If you don't already know, Gradio is an open-source Python library used to build interfaces for machine learning models. Beyond just creating UIs, Gradio also exposes API capabilities, and now Gradio apps can be launched as Model Context Protocol (MCP) servers for LLMs.

If you already know how to use Gradio, there are only two additional things you need to do:
* Add standard docstrings to your function (these will be used to generate the descriptions for your tools for the LLM)
* Set mcp_server=True in launch()


Here's a complete example (make sure you already have the latest version of Gradio installed):

import gradio as gr

def letter_counter(word, letter):
    """Count the occurrences of a specific letter in a word.

    Args:
        word: The word or phrase to analyze
        letter: The letter to count occurrences of

    Returns:
        The number of times the letter appears in the word
    """
    return word.lower().count(letter.lower())

# The docstring above is what gets exposed to the LLM as the tool description.
demo = gr.Interface(
    fn=letter_counter,
    inputs=["text", "text"],
    outputs="number",
    title="Letter Counter",
    description="Count how many times a letter appears in a word",
)

# mcp_server=True starts an MCP server alongside the regular web app.
demo.launch(mcp_server=True)



This is a very simple example, but you can add the ability to generate Ghibli images or speak emotions to any LLM that supports MCP. Once you have an MCP server running locally, you can copy-paste the same app to host it on [Hugging Face Spaces](https://huggingface.co/spaces/) as well.
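As a quick sanity check that the app is actually serving, you can call the same function over Gradio's regular API with gradio_client (a sketch: api_name="/predict" is the usual default for a single-function Interface, but confirm it on the app's "Use via API" page):

from gradio_client import Client

# Sketch: call the locally running Letter Counter app's API endpoint.
client = Client("http://127.0.0.1:7860/")
print(client.predict("strawberry", "r", api_name="/predict"))  # 3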

All free and open-source of course! Full tutorial: https://www.gradio.app/guides/building-mcp-server-with-gradio
reacted to abhi1nandy2's post with 👍 13 days ago
Little late to the party 🥳, but I'm thrilled to finally share our AAAI 2025-accepted work! Check out the project homepage here: https://midas-pro-mds.github.io/

🚀 What's MiDAS-PRo all about?
We tackle the challenge of coherent, non-redundant multi-document summarization with source attribution via a three-stage LLM-based pipeline:

1. Plan a hierarchical document organization
2. Reason by generating entities/topics
3. Summarize the collection into a cohesive narrative

All "planning" and "reasoning" steps are framed as code-completion tasks, guided by graph attention network-based in-context example selection, boosting both automated and human evaluation scores!



🔗 Resources

- Paper: https://ojs.aaai.org/index.php/AAAI/article/view/34676

- Slides: https://drive.google.com/file/d/1lWqQtHRnpn-g2IQ3guloj_2j8sDm4z0V/view?usp=drivesdk

- Poster: https://drive.google.com/file/d/1EQqgwbcS7xkVx38y0qPvdRh0gQyeOH5u/view?usp=drivesdk

- Video: https://youtube.com/shorts/6ecxLLUpWJE?si=BAluAeP4-_eCmfu7


Big thanks to my mentor and co-author Sambaran Bandyopadhyay at Adobe Research for the guidance (Summer 2024 internship days FTW!) 🙏.

#AAAI2025 #MultiDocumentSummarization #LLM #Research #NLP
reacted to sanaka87's post with 🔥 13 days ago
replied to gtvracer's post 15 days ago
reacted to AdinaY's post with 🔥🔥🔥 15 days ago
Kimi-Audio 🚀🎧 an OPEN audio foundation model released by Moonshot AI
moonshotai/Kimi-Audio-7B-Instruct
✨ 7B
✨ 13M+ hours of pretraining data
✨ Novel hybrid input architecture
✨ Universal audio capabilities (ASR, AQA, AAC, SER, SEC/ASC, end-to-end conversation)
reacted to jasoncorkill's post with 🚀 15 days ago
🚀 Building Better Evaluations: 32K Image Annotations Now Available

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals, collected in under two weeks using the Rapidata Python API.

Rapidata/text-2-image-Rich-Human-Feedback-32k

A few months ago, we published one of our most-liked datasets, with 13K images based on the @data-is-better-together dataset, following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.

Rapidata/text-2-image-Rich-Human-Feedback

In the dataset, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].

We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].
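If you want to build on these annotations yourself, the datasets load like any other Hub dataset (a sketch; see the dataset cards for the actual configs and schema):

from datasets import load_dataset

# Sketch: load the 32K-image rich-feedback dataset and inspect its schema.
ds = load_dataset("Rapidata/text-2-image-Rich-Human-Feedback-32k")
print(ds)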
reacted to Aurelien-Morgan's post with 👀 15 days ago
The Almighty function-caller

How would you like to build smart GenAI infrastructure? Give your edge agentic system extensive tool memory, and optimize the resources it takes to run a high-performance set of agents.

We came up with a novel approach to function-calling at scale for smart companies and corporate-grade use cases.

Read our full-fledged blog article on this on Hugging Face:
https://huggingface.co/blog/Aurelien-Morgan/the-almighty-function-caller
replied to prithivMLmods's post 18 days ago
reacted to prithivMLmods's post with 🔥 18 days ago
Dropping domain-specific downstream image-classification and content-moderation models, including anime image-type classification, GeoSceneNet, indoor-outdoor scene classification, and black-and-white vs. colored image classification, along with the datasets. 🔥

╰┈➤ Models:
+ GeoSceneNet : prithivMLmods/Multilabel-GeoSceneNet
+ IndoorOutdoorNet : prithivMLmods/IndoorOutdoorNet
+ B&W vs Colored : prithivMLmods/BnW-vs-Colored-Detection
+ Anime Image Type : prithivMLmods/Anime-Classification-v1.0
+ Multilabel Portrait : prithivMLmods/Multilabel-Portrait-SigLIP2

╰┈➤ Datasets:
- GeoSceneNet : prithivMLmods/Multilabel-GeoSceneNet-16K
- IndoorOutdoorNet : prithivMLmods/IndoorOutdoorNet-20K
- BnW vs Colored : prithivMLmods/BnW-vs-Colored-10K
- Multilabel Portrait : prithivMLmods/Multilabel-Portrait-18K

╰┈➤ Collections:
> Multilabel Image Classification Datasets : prithivMLmods/multilabel-image-classification-datasets-6809aa64637f45d4c47fa6ca
> Model Collection : prithivMLmods/siglip2-content-filters-models-v2-68053a958c42ef17a3a3f4d1

Note: The anime scene type dataset is not mentioned in the list because it is private and only accessible to members of the DeepGHS organization.

For raw ZIP files or more information about the datasets, visit: https://www.kaggle.com/prithivsakthiur/datasets
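For raw inference without the pipeline wrapper, the usual SigLIP classification pattern should apply (a sketch assuming these are standard SiglipForImageClassification checkpoints; "room.jpg" is a placeholder path):

import torch
from PIL import Image
from transformers import AutoImageProcessor, SiglipForImageClassification

model_id = "prithivMLmods/IndoorOutdoorNet"
processor = AutoImageProcessor.from_pretrained(model_id)
model = SiglipForImageClassification.from_pretrained(model_id)

image = Image.open("room.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# id2label maps the predicted class index back to its label name.
print(model.config.id2label[logits.argmax(-1).item()])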