This is the official release of ControlNet 1.1.

ControlNet 1.1 has exactly the same architecture as ControlNet 1.0. We promise that we will not change the neural network architecture before ControlNet 1.5 (at least, and hopefully we will never change the network architecture). Perhaps this is the best news in ControlNet 1.1.

ControlNet 1.1 includes all previous models with improved robustness and result quality. Several new models are added.

Note that we are still working on updating this to A1111. This repo will be merged to ControlNet after we make sure that everything is OK.

Please do not copy the URL of this project into your A1111. If you want to use ControlNet 1.1 in A1111, you only need to install https://github.com/Mikubill/sd-webui-controlnet and follow the instructions there.

This project is for research use and academic experiments. Again, do NOT install "ControlNet-v1-1-nightly" into your A1111.

The beta test for A1111 has started. The A1111 plugin is: https://github.com/Mikubill/sd-webui-controlnet

Discussion and bug reports go to: https://github.com/Mikubill/sd-webui-controlnet/issues/736

Starting from ControlNet 1.1, we begin to use the Standard ControlNet Naming Rules (SCNNRs) to name all models. We hope that this naming rule can improve the user experience.

ControlNet 1.1 includes 14 models (11 production-ready models and 3 experimental models):

You can download all those models from our HuggingFace Model Page. All these models should be put in the folder "models".

You need to download the Stable Diffusion 1.5 model "v1-5-pruned.ckpt" and put it in the folder "models".

Our Python code will automatically download other annotator models like HED and OpenPose. Nevertheless, if you want to download these manually, you can download all other annotator models from here. All these models should be put in the folder "annotator/ckpts".

To install:

Note that if you use an 8GB GPU, you need to set "save_memory = True" in "config.py".

Control Stable Diffusion with Depth Maps.
Model file: control_v11f1p_sd15_depth.pth
Config file: control_v11f1p_sd15_depth.yaml

Training data: Midas depth (resolution 256/384/512) + Leres Depth (resolution 256/384/512) + Zoe Depth (resolution 256/384/512). Using multiple depth map generators at multiple resolutions serves as data augmentation.

Acceptable Preprocessors: Depth_Midas, Depth_Leres, Depth_Zoe. This model is highly robust and can work on real depth maps from rendering engines.

Non-cherry-picked batch test with random seed 12345 ("a handsome man"):

2023/04/14: 72 hours ago we uploaded a wrong model "control_v11p_sd15_depth" by mistake. That model is an intermediate checkpoint from training. It has not converged and may cause distortion in results. We uploaded the correct depth model as "control_v11f1p_sd15_depth". The "f1" means bug fix 1. The incorrect model is removed. Sorry for the inconvenience.

Control Stable Diffusion with Normal Maps.

Model file: control_v11p_sd15_normalbae.pth
Config file: control_v11p_sd15_normalbae.yaml

Training data: Bae's normalmap estimation method.

Acceptable Preprocessors: Normal BAE. This model can accept normal maps from rendering engines as long as the normal map follows ScanNet's protocol. That is to say, the color of your normal map should look like the second column of this image.

Note that this method is much more reasonable than the normal-from-midas method in ControlNet 1.0. The previous method will be abandoned.

Non-cherry-picked batch test with random seed 12345 ("a man made of flowers"):

Non-cherry-picked batch test with random seed 12345 ("room"):

Control Stable Diffusion with Canny Maps.

Model file: control_v11p_sd15_canny.pth
Config file: control_v11p_sd15_canny.yaml

Training data: Canny with random thresholds.

Acceptable Preprocessors: Canny.

We fixed several problems in previous training datasets.

Non-cherry-picked batch test with random seed 12345 ("dog in a room"):

Control Stable Diffusion with M-LSD straight lines.

Model file: control_v11p_sd15_mlsd.pth
Config file: control_v11p_sd15_mlsd.yaml

Training data: M-LSD Lines.

Acceptable Preprocessors: MLSD.
We fixed several problems in previous training datasets. The model is resumed from ControlNet 1.0 and trained with 200 GPU hours of A100 80G.

Non-cherry-picked batch test with random seed 12345 ("room"):

Control Stable Diffusion with Scribbles.

Model file: control_v11p_sd15_scribble.pth
Config file: control_v11p_sd15_scribble.yaml

Training data: Synthesized scribbles.

Acceptable Preprocessors: Synthesized scribbles (Scribble_HED, Scribble_PIDI, etc.) or hand-drawn scribbles.

We fixed several problems in previous training datasets. The model is resumed from ControlNet 1.0 and trained with 200 GPU hours of A100 80G.

Non-cherry-picked batch test with random seed 12345 ("man in library"):

Non-cherry-picked batch test with random seed 12345 (interactive, "the beautiful landscape"):

Control Stable Diffusion with Soft Edges.

Model file: control_v11p_sd15_softedge.pth
Config file: control_v11p_sd15_softedge.yaml

Training data: SoftEdge_PIDI, SoftEdge_PIDI_safe, SoftEdge_HED, SoftEdge_HED_safe.

Acceptable Preprocessors: SoftEdge_PIDI, SoftEdge_PIDI_safe, SoftEdge_HED, SoftEdge_HED_safe.

This model is significantly improved compared to the previous model. All users should update as soon as possible.

New in ControlNet 1.1: we added a new type of soft edge called "SoftEdge_safe". This is motivated by the fact that HED or PIDI tends to hide a corrupted greyscale version of the original image inside the soft estimation, and such hidden patterns can distract ControlNet, leading to bad results. The solution is a pre-processing step that quantizes the edge maps into several levels so that the hidden patterns are completely removed. The implementation is in the 78-th line of annotator/util.py.

The performance can be roughly noted as:

Robustness: SoftEdge_PIDI_safe > SoftEdge_HED_safe >> SoftEdge_PIDI > SoftEdge_HED

Maximum result quality: SoftEdge_HED > SoftEdge_PIDI > SoftEdge_HED_safe > SoftEdge_PIDI_safe

Considering the trade-off, we recommend using SoftEdge_PIDI by default. In most cases it works very well.
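
The quantization idea behind the "safe" soft edges can be sketched as follows. This is a minimal NumPy illustration of snapping an edge map to a few discrete levels, not the code in annotator/util.py; the function name `quantize_edge_map` and the default of three levels are assumptions made for the example.

```python
import numpy as np


def quantize_edge_map(edge, levels=3):
    """Snap a soft edge map in [0, 1] to `levels` evenly spaced values.

    Hypothetical helper for illustration only; the name and the default
    number of levels are assumptions, not taken from the repository.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    edge = np.clip(np.asarray(edge, dtype=np.float32), 0.0, 1.0)
    steps = levels - 1
    # Rounding to the nearest step discards the fine greyscale variation
    # in which a faint copy of the source image could otherwise survive.
    return np.round(edge * steps) / steps


soft = np.array([[0.02, 0.31, 0.48], [0.55, 0.77, 0.99]], dtype=np.float32)
print(quantize_edge_map(soft))
```

With fewer levels more of the hidden pattern is removed but the edge map also becomes coarser, which matches the robustness versus quality trade-off noted above.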
Non-cherry-picked batch test with random seed 12345 ("a handsome man"):

Control Stable Diffusion with Semantic Segmentation.

Model file: control_v11p_sd15_seg.pth
Config file: control_v11p_sd15_seg.yaml

Training data: COCO + ADE20K.

Acceptable Preprocessors: Seg_OFADE20K (Oneformer ADE20K), Seg_OFCOCO (Oneformer COCO), Seg_UFADE20K (Uniformer ADE20K), or manually created masks.

The model can now receive both types of annotations, ADE20K or COCO. We find that recognizing the segmentation protocol is trivial for the ControlNet encoder, and that training the model on multiple segmentation protocols leads to better performance.

Non-cherry-picked batch test with random seed 12345 (ADE20k protocol, "house"):

Non-cherry-picked batch test with random seed 12345 (COCO protocol, "house"):

Control Stable Diffusion with Openpose.

Model file: control_v11p_sd15_openpose.pth
Config file: control_v11p_sd15_openpose.yaml

The model is trained and can accept the following combinations:

However, providing all those combinations is too complicated. We recommend providing users with only two choices:

You can try with the demo:

Non-cherry-picked batch test with random seed 12345 ("man in suit"):

Non-cherry-picked batch test with random seed 12345 (multiple people in the wild, "handsome boys in the party"):

Control Stable Diffusion with Linearts.

Model file: control_v11p_sd15_lineart.pth
Config file: control_v11p_sd15_lineart.yaml

This model is trained on awacke1/Image-to-Line-Drawings. The preprocessor can generate detailed or coarse linearts from images (Lineart and Lineart_Coarse). The model is trained with sufficient data augmentation and can receive manually drawn linearts.

Non-cherry-picked batch test with random seed 12345 (detailed lineart extractor, "bag"):

Non-cherry-picked batch test with random seed 12345 (coarse lineart extractor, "Michael Jackson's concert"):

Non-cherry-picked batch test with random seed 12345 (use manually drawn linearts, "wolf"):

Control Stable Diffusion with Anime Linearts.
Model file: control_v11p_sd15s2_lineart_anime.pth
Config file: control_v11p_sd15s2_lineart_anime.yaml

Training data and implementation details: (description removed).

This model can take real anime line drawings or extracted line drawings as inputs.

Some important notice:

Demo:

Non-cherry-picked batch test with random seed 12345 ("1girl, in classroom, skirt, uniform, red hair, bag, green eyes"):

Non-cherry-picked batch test with random seed 12345 ("1girl, saber, at night, sword, green eyes, golden hair, stocking"):

Non-cherry-picked batch test with random seed 12345 (extracted line drawing, "1girl, Castle, silver hair, dress, Gemstone, cinematic lighting, mechanical hand, 4k, 8k, extremely detailed, Gothic, green eye"):

Control Stable Diffusion with Content Shuffle.

Model file: control_v11e_sd15_shuffle.pth
Config file: control_v11e_sd15_shuffle.yaml

Demo:

The model is trained to reorganize images. We use a random flow to shuffle the image and control Stable Diffusion to recompose the image.

Non-cherry-picked batch test with random seed 12345 ("hong kong"):

In the 6 images on the right, the left-top one is the "shuffled" image. All others are outputs.

In fact, since the ControlNet is trained to recompose images, we do not even need to shuffle the input - sometimes we can just use the original image as input. In this way, this ControlNet can be guided by prompts or other ControlNets to change the image style.

Note that this method has nothing to do with CLIP vision or some other models. This is a pure ControlNet.

Non-cherry-picked batch test with random seed 12345 ("iron man"):

Non-cherry-picked batch test with random seed 12345 ("spider man"):

Note that this ControlNet requires adding a global average pooling " x = torch.mean(x, dim=(2, 3), keepdim=True) " between the ControlNet Encoder outputs and SD Unet layers. The ControlNet must also be put only on the conditional side of cfg scale. We recommend using the "global_average_pooling" item in the yaml file to control such behaviors.
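
To make the two requirements above concrete, here is a small sketch. It uses NumPy arrays as stand-ins for tensors so that it runs without PyTorch; `x.mean(axis=(2, 3), keepdims=True)` plays the role of the quoted `torch.mean` call. The function names, the `unet` callable, and the toy shapes are assumptions for illustration, and the guidance formula is the usual classifier-free guidance combination rather than code from this repository. Prompt conditioning is left out to keep the example short.

```python
import numpy as np


def global_average_pool(x):
    """Collapse the spatial axes of an (N, C, H, W) array to (N, C, 1, 1)."""
    return x.mean(axis=(2, 3), keepdims=True)


def guided_noise(unet, latent, control_feats, cfg_scale):
    """Classifier-free guidance with control on the conditional branch only.

    `unet(latent, residuals)` is a hypothetical callable that returns a
    noise prediction; `residuals` is a list of arrays or None.
    """
    pooled = [global_average_pool(f) for f in control_feats]
    eps_cond = unet(latent, pooled)   # conditional side: control applied
    eps_uncond = unet(latent, None)   # unconditional side: no control
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)


def toy_unet(latent, residuals):
    """Toy stand-in for a U-Net: adds each pooled residual to the latent."""
    out = latent.copy()
    for r in residuals or []:
        out = out + r  # (N, C, 1, 1) broadcasts over H and W
    return out


latent = np.zeros((1, 4, 8, 8), dtype=np.float32)
feats = [np.ones((1, 4, 8, 8), dtype=np.float32)]
print(global_average_pool(feats[0]).shape)                # (1, 4, 1, 1)
print(guided_noise(toy_unet, latent, feats, 7.5).mean())  # 7.5
```

Because the pooled features carry no spatial layout, the control signal can only convey global content and style, which is consistent with the recomposition behavior described for this model.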
Note that this ControlNet Shuffle will be the one and only image stylization method that we will maintain for robustness under long-term support. We have tested other CLIP image encoders, Unclip, image tokenization, and image-based prompts, but it seems that those methods do not work very well with user prompts or additional/multiple U-Net injections.

See also the evidence here, here, and some other related issues.

Control Stable Diffusion with Instruct Pix2Pix.

Model file: control_v11e_sd15_ip2p.pth
Config file: control_v11e_sd15_ip2p.yaml

Demo:

This is a controlnet trained on the Instruct Pix2Pix dataset.

Different from official Instruct Pix2Pix, this model is trained with 50% instruction prompts and 50% description prompts. For example, "a cute boy" is a description prompt, while "make the boy cute" is an instruction prompt.

Because this is a ControlNet, you do not need to trouble with original IP2P's double cfg tuning. This model can also be applied to any base model.

Also, it seems that instructions like "make it into X" work better than "make Y into X".

Non-cherry-picked batch test with random seed 12345 ("make it on fire"):

Non-cherry-picked batch test with random seed 12345 ("make it winter"):

We mark this model as "experimental" because it sometimes needs cherry-picking. For example, here is a non-cherry-picked batch test with random seed 12345 ("make he iron man"):

Control Stable Diffusion with Inpaint.

Model file: control_v11p_sd15_inpaint.pth
Config file: control_v11p_sd15_inpaint.yaml

Demo:

Some notices:

Update 2023/May/03: ControlNet's inpaint without changing unmasked areas is implemented in a1111. It supports arbitrary base models/LoRAs, and can work together with an arbitrary number of other ControlNets.

Non-cherry-picked batch test with random seed 12345 ("a handsome man"):

Update 2023 April 25: The previously unfinished tile model is finished now. The new name is "control_v11f1e_sd15_tile". The "f1e" means 1st bug fix ("f1"), experimental ("e"). The previous "control_v11u_sd15_tile" is removed. Please update if your model name is "v11u".
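
As a rough illustration of how names such as "control_v11f1e_sd15_tile" break down, the sketch below parses a model name with a regular expression. The meanings of "f1" and "e" come from the text; reading "u" as unfinished and "p" as production-ready is inferred from context, and the field names and the function `parse_model_name` are assumptions for this example rather than an official specification of the naming rules.

```python
import re

# "e" (experimental) and "f<N>" (bug fix N) are stated in the text;
# "u" (unfinished) and "p" (production-ready) are inferred from context.
STATUS = {"p": "production-ready", "e": "experimental", "u": "unfinished"}

NAME_RE = re.compile(
    r"^control_v(?P<version>\d+)"  # e.g. "11" for ControlNet 1.1
    r"(?:f(?P<fix>\d+))?"          # optional bug-fix counter, e.g. "f1"
    r"(?P<status>[peu])"           # status letter
    r"_(?P<base>[a-z0-9]+)"        # base model tag, e.g. "sd15" or "sd15s2"
    r"_(?P<task>\w+)"              # control type, e.g. "lineart_anime"
    r"(?:\.(?:pth|yaml))?$"        # optional file extension
)


def parse_model_name(name):
    """Split a model name into its parts (hypothetical helper)."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return {
        "version": m.group("version"),
        "bug_fix": int(m.group("fix") or 0),
        "status": STATUS[m.group("status")],
        "base": m.group("base"),
        "task": m.group("task"),
    }


print(parse_model_name("control_v11f1e_sd15_tile"))
print(parse_model_name("control_v11p_sd15s2_lineart_anime.pth"))
```

A check like this makes it easy to spot a leftover "v11u" file before loading it.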
Control Stable Diffusion with Tiles.

Model file: control_v11f1e_sd15_tile.pth
Config file: control_v11f1e_sd15_tile.yaml

Demo:

The model can be used in many ways. Overall, the model has two behaviors:

Because the model can generate new details and ignore existing image details, we can use this model to remove bad details and add refined details. For example, it can remove blurring caused by image resizing.

Below is an example of 8x super resolution. This is a 64x64 dog image.

Non-cherry-picked batch test with random seed 12345 ("dog on grassland"):

Note that this model is not a super resolution model. It ignores the details in an image and generates new details. This means you can use it to fix bad details in an image.

For example, below is a dog image corrupted by Real-ESRGAN. This is a typical example of super resolution methods failing to upscale images when the source context is too small.

Non-cherry-picked batch test with random seed 12345 ("dog on grassland"):

If your image already has good details, you can still use this model to replace image details. Note that Stable Diffusion's I2I can achieve similar effects, but this model makes it much easier to maintain the overall structure and only change details, even with denoising strength 1.0.

Non-cherry-picked batch test with random seed 12345 ("Silver Armor"):

More and more people begin to think about different methods to diffuse at tiles so that images can be very big (at 4k or 8k).

The problem is that, in Stable Diffusion, your prompts will always influence each tile.

For example, if your prompt is "a beautiful girl" and you split an image into 4×4=16 blocks and do diffusion in each block, then you will get 16 "beautiful girls" rather than "a beautiful girl". This is a well-known problem.

Right now people's solution is to use some meaningless prompts like "clear, clear, super clear" to diffuse blocks. But you can expect that the results will be bad if the denoising strength is high. And because the prompts are bad, the contents are pretty random.

ControlNet Tile can solve this problem.
For a given tile, it recognizes what is inside the tile and increases the influence of the recognized semantics, and it also decreases the influence of global prompts if the contents do not match.

Non-cherry-picked batch test with random seed 12345 ("a handsome man"):

You can see that the prompt is "a handsome man" but the model does not paint "a handsome man" on those tree leaves. Instead, it recognizes the tree leaves and paints accordingly.

In this way, ControlNet is able to change the behavior of any Stable Diffusion model to perform diffusion in tiles.

We provide simple python scripts to process images.

ControlNet 1.1
This Project is NOT an A1111 extension
How to use ControlNet 1.1 in A1111?
Model Specification
control_v11p_sd15_canny
control_v11p_sd15_mlsd
control_v11f1p_sd15_depth
control_v11p_sd15_normalbae
control_v11p_sd15_seg
control_v11p_sd15_inpaint
control_v11p_sd15_lineart
control_v11p_sd15s2_lineart_anime
control_v11p_sd15_openpose
control_v11p_sd15_scribble
control_v11p_sd15_softedge
control_v11e_sd15_shuffle
control_v11e_sd15_ip2p
control_v11f1e_sd15_tile
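
A quick way to confirm that the "models" folder is complete is sketched below. It only uses the Python standard library. The helper name `missing_model_files` is hypothetical, and the sketch assumes that each model's .yaml config file sits next to its .pth file in "models", which the text does not state explicitly.

```python
from pathlib import Path

# The 14 ControlNet 1.1 model names listed above.
MODEL_NAMES = [
    "control_v11p_sd15_canny",
    "control_v11p_sd15_mlsd",
    "control_v11f1p_sd15_depth",
    "control_v11p_sd15_normalbae",
    "control_v11p_sd15_seg",
    "control_v11p_sd15_inpaint",
    "control_v11p_sd15_lineart",
    "control_v11p_sd15s2_lineart_anime",
    "control_v11p_sd15_openpose",
    "control_v11p_sd15_scribble",
    "control_v11p_sd15_softedge",
    "control_v11e_sd15_shuffle",
    "control_v11e_sd15_ip2p",
    "control_v11f1e_sd15_tile",
]


def missing_model_files(models_dir, names=MODEL_NAMES):
    """Return the expected file names that are absent from `models_dir`.

    Hypothetical helper; assumes the .yaml configs live beside the .pth files.
    """
    models_dir = Path(models_dir)
    expected = ["v1-5-pruned.ckpt"]  # Stable Diffusion 1.5 base checkpoint
    for name in names:
        expected += [f"{name}.pth", f"{name}.yaml"]
    return [f for f in expected if not (models_dir / f).is_file()]


if __name__ == "__main__":
    for filename in missing_model_files("models"):
        print("missing:", filename)
```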
conda env create -f environment.yaml
conda activate control-v11
ControlNet 1.1 Depth
python gradio_depth.py
ControlNet 1.1 Normal
python gradio_normalbae.py
ControlNet 1.1 Canny
python gradio_canny.py
ControlNet 1.1 MLSD
python gradio_mlsd.py
ControlNet 1.1 Scribble
# To test synthesized scribbles
python gradio_scribble.py
# To test hand-drawn scribbles in an interactive demo
python gradio_interactive.py
ControlNet 1.1 Soft Edge
python gradio_softedge.py
ControlNet 1.1 Segmentation
python gradio_seg.py
ControlNet 1.1 Openpose
python gradio_openpose.py
ControlNet 1.1 Lineart
python gradio_lineart.py
ControlNet 1.1 Anime Lineart
python gradio_lineart_anime.py
ControlNet 1.1 Shuffle
python gradio_shuffle.py
ControlNet 1.1 Instruct Pix2Pix
python gradio_ip2p.py
ControlNet 1.1 Inpaint
python gradio_inpaint.py
ControlNet 1.1 Tile
python gradio_tile.py
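
For the tiled diffusion scenario described for the Tile model, splitting an image into a 4×4 grid and stitching it back can be sketched with NumPy as below. The function names are hypothetical, the per-tile diffusion step is left out, and there is no overlap or blending between tiles, so this only illustrates the bookkeeping and is not part of gradio_tile.py.

```python
import numpy as np


def split_into_tiles(image, rows=4, cols=4):
    """Split an (H, W, C) image into rows*cols equal tiles, row by row."""
    h, w = image.shape[:2]
    if h % rows or w % cols:
        raise ValueError("image size must be divisible by the grid size")
    th, tw = h // rows, w // cols
    return [
        image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(rows)
        for c in range(cols)
    ]


def merge_tiles(tiles, rows=4, cols=4):
    """Inverse of split_into_tiles: stitch the tiles back into one image."""
    strips = [
        np.concatenate(tiles[r * cols:(r + 1) * cols], axis=1)
        for r in range(rows)
    ]
    return np.concatenate(strips, axis=0)


image = np.random.default_rng(12345).random((512, 512, 3), dtype=np.float32)
tiles = split_into_tiles(image)
print(len(tiles), tiles[0].shape)  # 16 (128, 128, 3)
# Each tile would be diffused separately here before merging.
restored = merge_tiles(tiles)
print(np.array_equal(restored, image))  # True
```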
Annotate Your Own Data