---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-to-image
---
|
|
|
<br>
|
|
|
# DiffBlender Model Card
|
|
|
This repo contains the models from our paper [**DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models**](https://arxiv.org/abs/2305.15194).
|
|
|
|
|
## Model details
|
|
|
**Model type:**
DiffBlender synthesizes images from complex combinations of input modalities.
It enables flexible manipulation of conditions, providing customized generation aligned with user preferences.
We designed its structure to extend intuitively to additional modalities, while keeping the training cost low through partial updates of hypernetworks.
|
|
|
We provide the model checkpoint, trained with six modalities (sketch, depth map, grounding box, keypoints, color palette, and style embedding), at `./checkpoint_latest.pth`.
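
A minimal sketch of inspecting this checkpoint, assuming it is a standard PyTorch `.pth` file; the actual loading and inference code lives in the GitHub repository:

```python
import torch

# Load the released DiffBlender checkpoint on CPU; a GPU is not
# needed just to inspect the stored weights.
state = torch.load("./checkpoint_latest.pth", map_location="cpu")

# Checkpoints are typically dictionaries; list the top-level keys to
# see how the weights (e.g., per-modality hypernetworks) are organized.
if isinstance(state, dict):
    print(list(state.keys())[:10])
```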
|
|
|
**License:**
Apache 2.0 License
|
|
|
**Where to send questions or comments about the model:**
https://github.com/sungnyun/diffblender/issues
|
|
|
|
|
## Training dataset
[Microsoft COCO 2017 dataset](https://cocodataset.org/#home)
|
|
|
|
|
<br>
|
|
|
More details are on our project page: https://sungnyun.github.io/diffblender/.