---
license: creativeml-openrail-m
tags:
- coreml
- stable-diffusion
- text-to-image
---
| # ControlNet v1.1 Models And Compatible Stable Diffusion v1.5 Type Models Converted To Apple CoreML Format |
|
|
| ## For use with a Swift app or the SwiftCLI |
|
|
The SD models are all "Original" (not "Split-Einsum") and are built for CPU and GPU compute units. Each is built for the output size noted. All are fp16, with the standard SD-1.5 VAE embedded.
|
|
The Stable Diffusion v1.5 model and the other SD-1.5 type models contain both the standard Unet and the ControlledUnet used for a ControlNet pipeline. The correct Unet is selected automatically, based on whether a ControlNet is enabled.
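

For orientation, here is a minimal Swift sketch of how that selection plays out when loading a pipeline, assuming the ml-stable-diffusion 0.4.0 initializer that takes an array of ControlNet model names (the path and the "LineArt" name are placeholders; check the package sources for the exact signature):

```swift
import CoreML
import Foundation
import StableDiffusion

// Placeholder path to one of the unzipped SD model folders from this repo.
let resourcesURL = URL(fileURLWithPath: "/path/to/unzipped/sd-model-folder")

// Passing one or more ControlNet model names loads the ControlledUnet;
// passing an empty array loads the standard Unet instead.
let pipeline = try StableDiffusionPipeline(
    resourcesAt: resourcesURL,
    controlNet: ["LineArt"],            // [] for a plain text-to-image pipeline
    configuration: MLModelConfiguration(),
    reduceMemory: false
)
try pipeline.loadResources()
```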
|
|
| They have VAEEncoder.mlmodelc bundles that allow Image2Image to operate correctly at the noted resolutions, when used with a current Swift CLI pipeline or a current GUI built with ml-stable-diffusion 0.4.0, such as Mochi Diffusion 3.2 or later. |
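

As a hedged sketch of image2image with the Swift package, reusing the `pipeline` from the sketch above (the `startingImage` and `strength` field names follow my reading of the 0.4.0 `PipelineConfiguration`; the input image must match the model's noted resolution):

```swift
import Foundation
import ImageIO
import StableDiffusion

// Load a starting image sized to the model's noted WIDTHxHEIGHT (placeholder path).
let srcURL = URL(fileURLWithPath: "/path/to/input-512x768.png") as CFURL
let source = CGImageSourceCreateWithURL(srcURL, nil)!
let startingImage = CGImageSourceCreateImageAtIndex(source, 0, nil)!

var i2iConfig = StableDiffusionPipeline.Configuration(prompt: "a watercolor landscape")
i2iConfig.startingImage = startingImage   // the image2image source
i2iConfig.strength = 0.7                  // lower values stay closer to the input
i2iConfig.stepCount = 25

let images = try pipeline.generateImages(configuration: i2iConfig) { _ in true }
```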
|
|
All of the ControlNet models are "Original" builds for CPU and GPU compute units (cpuAndGPU) and for SD-1.5 type models. They will not work with SD-2.1 type models. Each zip file holds a set of models at 4 resolutions. The 512x512 builds appear to also work with "Split-Einsum" models when using CPU and GPU (cpuAndGPU), but in my tests they did not work with "Split-Einsum" models when using the Neural Engine (NE).
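

For your own Swift code, the compute units are set on the `MLModelConfiguration` passed to the pipeline. A minimal sketch using the standard Core ML API:

```swift
import CoreML

// Keep these "Original" builds off the Neural Engine by restricting
// scheduling to the CPU and GPU.
let mlConfig = MLModelConfiguration()
mlConfig.computeUnits = .cpuAndGPU      // .all would also permit the NE
```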
|
|
All of the models in this repo work with Swift and the current apple/ml-stable-diffusion pipeline release (0.4.0). They were not built for, and will not work with, a Python Diffusers pipeline. They require ml-stable-diffusion (https://github.com/apple/ml-stable-diffusion) for command line use, or a Swift app that supports ControlNet, such as the Mochi Diffusion (https://github.com/godly-devotion/MochiDiffusion) test version currently in closed beta. Join the Mochi Diffusion Discord server (https://discord.gg/x2kartzxGv) to request access to the beta build.
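

For a Swift app, a rough sketch of what a ControlNet generation call looks like, reusing the `pipeline` loaded earlier with `controlNet: ["LineArt"]` (treat the `controlNetInputs` field name as my assumption from the 0.4.0 sources and verify against the package):

```swift
import Foundation
import ImageIO
import StableDiffusion

// Placeholder conditioning image, pre-processed to the model's resolution.
let condURL = URL(fileURLWithPath: "/path/to/lineart-condition.png") as CFURL
let condSource = CGImageSourceCreateWithURL(condURL, nil)!
let conditionImage = CGImageSourceCreateImageAtIndex(condSource, 0, nil)!

var cnConfig = StableDiffusionPipeline.Configuration(prompt: "a cat, clean line art")
cnConfig.controlNetInputs = [conditionImage]   // one input per loaded ControlNet
cnConfig.stepCount = 25

let results = try pipeline.generateImages(configuration: cnConfig) { _ in true }
```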
|
|
The full SD models are in the "SD" folder of this repo, in subfolders by model name, with each resolution zipped individually. They need to be unzipped for use after downloading.
|
|
The ControlNet model files are in the "CN" folder of this repo and also need to be unzipped after downloading. Each zip holds a set of 4 resolutions for that ControlNet type: 512x512, 512x768, 768x512, and 768x768.
|
|
| There is also a "MISC" folder that has text files with some notes and a screencap of my directory structure. These are provided for those who want to convert models themselves and/or run the models with a SwiftCLI. The notes are not perfect, and may be out of date if any of the Python or CoreML packages referenced have been updated recently. You can open a Discussion here if you need help with any of the "MISC" items. |
|
|
For command line use, the "MISC" notes cover setting up a miniconda3 environment; they also explain the required naming and placement of your ControlNet model folder, so please read them before running.
|
|
If you are using a GUI, that app will most likely guide you to the correct location/arrangement for your ControlNet model folder.
|
|
The sizes noted for all model inputs/outputs are WIDTH x HEIGHT: 512x768 is "portrait" orientation and 768x512 is "landscape" orientation.
|
|
**If you encounter any models that do not work correctly with image2image and/or a ControlNet, please leave a report in the Community Discussion area. This applies to the current apple/ml-stable-diffusion SwiftCLI pipeline for i2i or CN, Mochi Diffusion 3.2 for i2i, and the Mochi Diffusion beta test build for i2i or CN. If you would like to add models that you have converted, leave a message there as well, and I'll grant you access to this repo.**
| ## Base Models - A Variety Of SD-1.5-Type Models For Use With ControlNet |
Each folder contains 4 zipped model files, one per output size: 512x512, 512x768, 768x512, and 768x768
| - DreamShaper v5.0, 1.5-type model, "Original" |
| - GhostMix v1.1, 1.5-type anime model, "Original" |
- MeinaMix v9.0, 1.5-type anime model, "Original"
- MyMerge v1.0, 1.5-type NSFW model, "Original"
| - Realistic Vision v2.0, 1.5-type model, "Original" |
| - Stable Diffusion v1.5, "Original" |
|
|
| ## ControlNet Models - All Current SD-1.5-Type ControlNet Models |
| Each zip file contains a set of 4 resolutions: 512x512, 512x768, 768x512 and 768x768 |
| - Canny -- Edge Detection, Outlines As Input |
| - Depth -- Reproduces Depth Relationships From An Image |
| - InPaint -- Use Masks To Define And Modify An Area (not sure how this works) |
- InstrP2P -- InstructPix2Pix, "Change X To Y" Edits
| - LineAnime -- Find And Reuse Small Outlines, Optimized For Anime |
| - LineArt -- Find And Reuse Small Outlines |
| - MLSD -- Find And Reuse Straight Lines And Edges |
- NormalBAE -- Reproduce Depth Relationships Using Surface Normal Maps
| - OpenPose -- Copy Body Poses |
| - Scribble -- Freehand Sketch As Input |
| - Segmentation -- Find And Reuse Distinct Areas |
| - Shuffle -- Find And Reorder Major Elements |
| - SoftEdge -- Find And Reuse Soft Edges |
| - Tile -- Subtle Variations Within Batch Runs |
|
|