Stable Diffusion Dreambooth Concepts Library

community

AI & ML interests

None defined yet.

Recent Activity

sd-dreambooth-library's activity

ameerazam08 
posted an update 3 months ago
DamarJati 
posted an update 4 months ago
view post
Post
3555
Happy New Year 2025 🤗
For the Huggingface community.
DamarJati 
posted an update 9 months ago
view post
Post
4925
Improved ControlNet!
Now supports dynamic resolution for perfect landscape and portrait outputs. Generate stunning images without distortion—optimized for any aspect ratio!
...
DamarJati/FLUX.1-DEV-Canny
multimodalart 
posted an update 10 months ago
multimodalart 
posted an update 12 months ago
view post
Post
28314
The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multi-lingual CLIP + multi-lingual T5 text-encoders for english 🤝 chinese understanding

Try it out by yourself here ▶️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper they claim to be SOTA open source based on human preference evaluation!
ameerazam08 
posted an update about 1 year ago
view post
Post
5749
Explore the Latest Top Papers with Papers Leaderboard!
We are excited to introduce a new way to explore the most impactful research papers: Papers Leaderboard! This feature allows you to easily find the most talked-about papers across a variety of fields.
Hf-demo : ameerazam08/Paper-LeaderBoard
Happy weekends!
multimodalart 
posted an update about 1 year ago
view post
Post
The Stable Diffusion 3 research paper broken down, including some overlooked details! 📝

Model
📏 2 base model variants mentioned: 2B and 8B sizes

📐 New architecture in all abstraction levels:
- 🔽 UNet; ⬆️ Multimodal Diffusion Transformer, bye cross attention 👋
- 🆕 Rectified flows for the diffusion process
- 🧩 Still a Latent Diffusion Model

📄 3 text-encoders: 2 CLIPs, one T5-XXL; plug-and-play: removing the larger one maintains competitiveness

🗃️ Dataset was deduplicated with SSCD which helped with memorization (no more details about the dataset tho)

Variants
🔁 A DPO fine-tuned model showed great improvement in prompt understanding and aesthetics
✏️ An Instruct Edit 2B model was trained, and learned how to do text-replacement

Results
✅ State of the art in automated evals for composition and prompt understanding
✅ Best win rate in human preference evaluation for prompt understanding, aesthetics and typography (missing some details on how many participants and the design of the experiment)

Paper: https://stabilityai-public-packages.s3.us-west-2.amazonaws.com/Stable+Diffusion+3+Paper.pdf
·
multimodalart 
posted an update about 1 year ago