qianyu1217/diffdis

Image Segmentation

DiffDISPipeline

qianyu1217 committed on Jul 18 · Commit 9cba551 (verified) · 1 Parent(s): 6e18d1a

Update README.md

Files changed (1)

README.md +77 -3

@@ -1,3 +1,77 @@
----
-license: mit
----

+---
+license: mit
+---
+<h1 align="center">High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity</h1>
+<div align="center" style="display: flex; justify-content: center; flex-wrap: wrap;">
+  <a href='https://arxiv.org/pdf/2410.10105'><img src='https://img.shields.io/badge/arXiv-DiffDIS-B31B1B'></a>&ensp;
+  <a href='https://github.com/qianyu-dlut/DiffDIS'><img src='https://img.shields.io/badge/Github-DiffDIS-blue'></a>&ensp;
+  <a href='LICENSE'><img src='https://img.shields.io/badge/License-MIT-yellow'></a>&ensp;
+</div>
+This repository contains the official implementation for the paper "[High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity](https://arxiv.org/pdf/2410.10105)" (ICLR 2025).
+<p align="center">
+    <img alt="DiffDIS teaser image" src="https://raw.githubusercontent.com/qianyu-dlut/DiffDIS/master/assets/image.png" width="900px">
+</p>
+## How to use
+> For the complete training and inference process, see our [GitHub repository](https://github.com/qianyu-dlut/DiffDIS). This section covers only how to load the weights from Hugging Face.
+### Install Packages:
+```shell
+pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
+pip install -r requirements.txt
+pip install -e diffusers-0.30.2/
+```
+### Load DiffDIS weights from Hugging Face:
+```python
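+# Run this from a local checkout of the DiffDIS GitHub repository: the weights are
+# downloaded from Hugging Face, but UNet2DConditionModel_diffdis and DiffDISPipeline
+# are not part of upstream diffusers and must be imported from the DiffDIS code first.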
+import torch
+from diffusers import (
+    DiffusionPipeline,
+    DDPMScheduler,
+    UNet2DConditionModel,
+    AutoencoderKL,
+)
+from transformers import CLIPTextModel, CLIPTokenizer
+hf_model_path = 'qianyu1217/diffdis'
+vae = AutoencoderKL.from_pretrained(hf_model_path, subfolder='vae', trust_remote_code=True)
+scheduler = DDPMScheduler.from_pretrained(hf_model_path, subfolder='scheduler')
+text_encoder = CLIPTextModel.from_pretrained(hf_model_path, subfolder='text_encoder')
+tokenizer = CLIPTokenizer.from_pretrained(hf_model_path, subfolder='tokenizer')
+unet = UNet2DConditionModel_diffdis.from_pretrained(
+    hf_model_path,
+    subfolder='unet',
+    in_channels=8,
+    sample_size=96,
+    low_cpu_mem_usage=False,
+    ignore_mismatched_sizes=False,
+    class_embed_type='projection',
+    projection_class_embeddings_input_dim=4,
+    mid_extra_cross=True,
+    mode='DBIA',
+    use_swci=True,
+)
+pipe = DiffDISPipeline(
+    unet=unet,
+    vae=vae,
+    scheduler=scheduler,
+    text_encoder=text_encoder,
+    tokenizer=tokenizer,
+)
+```
+## Citation
+```bibtex
+@article{DiffDIS,
+  title={High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity},
+  author={Yu, Qian and Jiang, Peng-Tao and Zhang, Hao and Chen, Jinwei and Li, Bo and Zhang, Lihe and Lu, Huchuan},
+  journal={arXiv preprint arXiv:2410.10105},
+  year={2024}
+}
+```