All HF Hub posts

anakin87 
posted an update 2 days ago
📣 I just published a free course on Reinforcement Learning Environments for Language Models!

📌 COURSE: https://github.com/anakin87/llm-rl-environments-lil-course

Over the past year, we've seen a shift in LLM Post-Training.
Previously, Supervised Fine-Tuning was the most important part: making models imitate curated Question-Answer pairs.

Now we also have Reinforcement Learning with Verifiable Rewards. With techniques like GRPO, models can learn through trial and error in dynamic environments. They can climb to new heights without relying on expensively prepared data.


But what actually are these environments in practice❓ And how do you build them effectively❓

Fascinated by these concepts, I spent time exploring this space through experiments post-training Small Language Models.
I've packaged everything I learned into this short course.


What you'll learn

🔹 Agents, Environments, and LLMs: how to map Reinforcement Learning concepts to the LLM domain
🔹 How to use Verifiers (open-source library by Prime Intellect) to build RL environments as software artifacts
🔹 Common patterns: how to build single-turn, multi-turn, and tool-use environments

🔹 Hands-on: turn a small language model (LFM2-2.6B by LiquidAI) into a Tic Tac Toe master
🔸 Build the game Environment
🔸 Use it to generate synthetic data for SFT warm-up
🔸 Group-based Reinforcement Learning
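The reward side of this pipeline is what makes it "verifiable": the environment scores an outcome from the game state, not from labeled data. Below is an illustrative sketch of a Tic Tac Toe reward check and a GRPO-style group advantage; it is my own minimal rendering of the idea, not code from the course or the Verifiers library:

```python
# Verifiable reward for Tic Tac Toe: did `mark` complete a winning line?
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def tictactoe_reward(board, mark="X"):
    """board is a flat list of 9 cells ('X', 'O', or '.')."""
    won = any(all(board[i] == mark for i in line) for line in WIN_LINES)
    return 1.0 if won else 0.0

def group_advantages(rewards):
    """GRPO-style advantage: normalize each rollout's reward against its group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    if std == 0:
        return [0.0] * len(rewards)  # all rollouts tied: no learning signal
    return [(r - mean) / std for r in rewards]
```

Since rewards come from checking the board, no curated answers are needed; only rollouts whose outcome differs from their group's mean contribute learning signal.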

If you're interested in building "little worlds" where LLMs can learn, this course is for you.

---

🤗🕹️ Play against the trained model: anakin87/LFM2-2.6B-mr-tictactoe

📚 HF collection (datasets + models): https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe
branikita 
posted an update 1 day ago
We added multi-simulator support for our parallel gripper on the SO-ARM 100/101 arm. One URDF, one launch file, five physics engines: Gazebo, MuJoCo, Webots, CoppeliaSim, Isaac Sim.

The ROS2 package includes xacro descriptions, joint trajectory controllers, robot state publisher, and parameterized hardware interfaces for each simulator.
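The "one launch file, five physics engines" design boils down to a single parameter selecting a per-simulator hardware-interface backend. A minimal sketch of that dispatch pattern (the plugin names below are illustrative placeholders, not the package's actual plugin identifiers):

```python
# Placeholder plugin names, one per supported physics engine
SIM_PLUGINS = {
    "gazebo": "so_arm_gripper/GazeboGripperSystem",
    "mujoco": "so_arm_gripper/MujocoGripperSystem",
    "webots": "so_arm_gripper/WebotsGripperSystem",
    "coppeliasim": "so_arm_gripper/CoppeliaGripperSystem",
    "isaac": "so_arm_gripper/IsaacGripperSystem",
}

def hardware_plugin(simulator: str) -> str:
    """Resolve the hardware-interface plugin for the chosen simulator."""
    try:
        return SIM_PLUGINS[simulator]
    except KeyError:
        supported = ", ".join(sorted(SIM_PLUGINS))
        raise ValueError(
            f"unsupported simulator {simulator!r}; expected one of: {supported}"
        ) from None
```

In the real package this selection would happen in the launch file and xacro arguments rather than in application code, but the shape of the mapping is the same.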

https://github.com/roboninecom/SO-ARM100-101-Parallel-Gripper
ArtelTaleb 
posted an update 1 day ago
8VIEW AI Studio - Turn ideas into clean topology reference sheets for 3D/animation

What if you could generate production-ready topology references in seconds?
8VIEW AI Studio is built for character artists who care about clean edge flow, structure, and consistency - not just pretty images.

Upload one image or a few, and instantly get 8-view orthographic sheets with topology guides you can actually use in your workflow.

Why it matters

Most AI tools stop at visuals.
But real 3D work starts with topology.

- clean edge loops
- correct pole placement
- usable structure for sculpting & retopo

8VIEW bridges the gap between AI generation and real production workflows.

Features

8-view sheets
Front / Back / Side / 3⁄4 / Top / Bottom

Multilingual (9 languages)
English, Français, Español, Português, Italiano, हिन्दी, 日本語, العربية, 中文

Mobile-friendly - Create anywhere, anytime

Your key, your data - No storage. No tracking. Fully client-side

Built for

• Character artists (Blender, ZBrush, Maya)
• Students learning topology & retopology
• Concept artists needing fast structure
• Game artists building production assets

Workflow

1. Get a free Gemini API key (2 minutes)
2. Paste it into the app
3. Upload or describe your subject
4. Generate clean topology reference sheets

The goal

Less time guessing topology.
More time building clean models.

ArtelTaleb/8view-ai-studio


SeaWolf-AI 
posted an update 3 days ago
🧬 Darwin V6: Diagnostic-Guided Evolutionary Model Merging

We are releasing Darwin-31B-Opus — a reasoning-enhanced model merging Google's Gemma-4-31B-it and TeichAI's Claude Opus Distill using the Darwin V6 engine.

Model: FINAL-Bench/Darwin-31B-Opus
Demo: FINAL-Bench/Darwin-31B-Opus

🔬 What Darwin V6 Does

Conventional merging tools (mergekit, etc.) apply a single ratio to all tensors. Set ratio=0.5 and all 1,188 tensors blend identically, with no distinction between which tensors matter for reasoning versus coding.

Darwin V6 diagnoses both parents at the tensor level before merging. It measures Shannon entropy, standard deviation, and L2 norm for every tensor, then passes 5 diagnostic probes (REASONING, CODE, MATH, KNOWLEDGE, LANGUAGE) through the model to determine layer-wise functional importance. Each of the 1,188 tensors receives an independent optimal ratio.

combined = static(entropy/std/norm) x 0.4 + probe(cosine_distance) x 0.6
final_ratio = mri_ratio x mri_trust + genome_ratio x (1 - mri_trust)

When one parent is overwhelmingly superior for a tensor (ratio < 0.15 or > 0.85), Darwin transplants it directly without interpolation. The mri_trust parameter itself is optimized by CMA-ES evolutionary search, so optimal transplant intensity is determined automatically. After merging, a Health Check compares the child against both parents layer-by-layer to detect interference or function loss.
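Read literally, the two formulas plus the transplant rule amount to the following sketch (tensors shown as plain lists; the real engine operates on model weights, and which parent a low vs. high ratio favors is my assumption, stated in the comments):

```python
def final_ratio(static_score, probe_score, genome_ratio, mri_trust):
    """Blend static diagnostics with probe scores into a per-tensor
    merge ratio, then mix with the evolved genome ratio (per the post)."""
    mri_ratio = 0.4 * static_score + 0.6 * probe_score
    return mri_ratio * mri_trust + genome_ratio * (1 - mri_trust)

def merge_tensor(father, mother, ratio, lo=0.15, hi=0.85):
    """Transplant when one parent dominates; interpolate otherwise.
    Assumption: `ratio` is the Mother's weight in the blend."""
    if ratio < lo:
        return list(father)   # father overwhelmingly superior: copy directly
    if ratio > hi:
        return list(mother)   # mother overwhelmingly superior: copy directly
    return [(1 - ratio) * f + ratio * m for f, m in zip(father, mother)]
```

The per-tensor part is the key difference from a fixed-ratio merge: each of the 1,188 tensors gets its own call to `final_ratio` rather than sharing one global value.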

🧬 Parent Models
Father: google/gemma-4-31B-it
Mother: TeichAI/gemma-4-31B-it-Claude-Opus-Distill

🧬 Results
Compared under identical conditions (same 50 questions, same seed, greedy, thinking mode):
Father: 60.0% (30/50)
Darwin-31B-Opus: 66.0% (33/50) — +10% relative improvement
ARC-Challenge: 82.89% (loglikelihood, zero-shot, 200 questions)
Optimal genome found by evolution:
ffn_ratio=0.93 — FFN layers strongly favor Mother (Claude Opus Distill)
block_5 (L50-L59)=0.86 and more...
Parveshiiii 
posted an update 3 days ago
Excited to announce my latest open-source release on Hugging Face: Parveshiiii/breast-cancer-detector.

This model has been trained and validated on external datasets to support medical research workflows. It is designed to provide reproducible benchmarks and serve as a foundation for further exploration in healthcare AI.

Key highlights:
- Built for medical research and diagnostic study contexts
- Validated against external datasets for reliability
- Openly available to empower the community in building stronger, more effective solutions

This release is part of my ongoing effort to make impactful AI research accessible through **Modotte**. A detailed blog post explaining the methodology, dataset handling, and validation process will be published soon.

You can explore the model here: Parveshiiii/breast-cancer-detector

#AI #MedicalResearch #DeepLearning #Healthcare #OpenSource #HuggingFace

prithivMLmods 
posted an update 2 days ago
Now the demo for image detection based on SAM3 and Gemma-4 (*Filter) is available on Spaces, using full-fledged Transformers inference with multimodal reasoning for processed images. It also supports video segmentation (mask), video segmentation (annotation), and image click segmentation.

🤗 Demo Space: prithivMLmods/SAM3-Gemma4-CUDA
🥽 SAM3: facebook/sam3
🔗 gemma-4-E2B-it: google/gemma-4-E2B-it

To learn more, visit the app page or the respective model pages.
DavidAU 
posted an update 4 days ago
THREE Gemma 4 31B Uncensored Fine-Tunes (via Unsloth, in-house datasets):

Uncensored first, then tuned.
Some benchmarks posted, others pending.
Examples and detailed instructions posted.
Some GGUFs are up; others pending as of this writing.

Enjoy:

DavidAU/gemma-4-31B-it-Mystery-Fine-Tune-HERETIC-UNCENSORED-Thinking
DavidAU/gemma-4-31B-it-Grand-Horror-X-INTENSE-HERETIC-UNCENSORED-Thinking
DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking

UPDATE:
DavidAU/gemma-4-E4B-it-The-DECKARD-Expresso-Universe-HERETIC-UNCENSORED-Thinking

Exceeds Gemma4 26B-A4B in critical benchmarks.
SeaWolf-AI 
posted an update about 4 hours ago
Why This Matters — David Defeats Goliath

MODEL: FINAL-Bench/Darwin-4B-David
SPACE: FINAL-Bench/Darwin-4B-david

We're releasing Darwin-4B-David, the first second-generation model in the Darwin Opus family. By evolving an already-evolved model, it achieves 85.0% on GPQA Diamond — surpassing its original ancestor's 58.6% and even gemma-4-31B (84.3%) — with just 4.5B parameters.

Second-Generation Evolution
Most merges start from a base model and produce a single offspring. Darwin-4B-David breaks this pattern. The Father (Darwin-4B-Opus) was already evolved from gemma-4-E4B-it with Claude Opus reasoning distillation — a Gen-1 model. The Mother (DavidAU's DECKARD-Expresso-Universe) brings Unsloth deep tuning across 5 in-house datasets with thinking mode by default. Crossbreeding these two produced the first Gen-2 Darwin model.

Darwin V6's Model MRI scanned both parents across all 42 layers, assigning independent optimal ratios per layer. The Mother's creativity and Korean-language hotspot (Layers 22-25, weight 0.95) were maximally absorbed, while the Father's reasoning core (Layers 30-40, weight 0.48) was preserved. This is "Merge = Evolve" applied recursively — evolution of evolution.

Benchmarks
Darwin-4B-David scores 85.0% on GPQA Diamond (+26.4%p over original 58.6%), evaluated generatively with maj@8 (8 generations per question, majority vote), Epoch AI prompt format, thinking mode enabled, 50 sampled questions. On ARC-Challenge (25-shot, loglikelihood), both score 64.93% — expected, as loglikelihood doesn't capture thinking-mode reasoning differences.
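The maj@8 protocol described above (sample k generations per question, keep the most common answer) can be sketched as follows; breaking ties toward the earliest sample is one of several reasonable conventions and an assumption on my part:

```python
from collections import Counter

def maj_at_k(answers):
    """Majority vote over k sampled answers to one question."""
    counts = Counter(answers)
    # Highest count wins; ties break toward the earliest sample
    return max(counts, key=lambda a: (counts[a], -answers.index(a)))

def maj_at_k_accuracy(samples_per_question, gold):
    """Fraction of questions whose majority answer matches the gold label."""
    votes = [maj_at_k(samples) for samples in samples_per_question]
    return sum(v == g for v, g in zip(votes, gold)) / len(gold)
```

Majority voting rewards models whose correct answers recur across samples, which is exactly the behavior a thinking-mode model exhibits and a single loglikelihood score misses — consistent with the flat ARC-Challenge comparison above.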

Why This Matters
gemma-4-31B (30.7B) scores 84.3%. Darwin-4B-David surpasses it at 1/7th the size — no training, no RL, just 45 minutes of MRI-guided DARE-TIES on one H100. The name "David" honors Mother creator DavidAU and evokes David vs. Goliath.