SD-Turbo is a distilled version of Stable Diffusion 2.1, trained for real-time synthesis. It is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the technical report), which allows sampling large-scale foundational image diffusion models in 1 to 4 steps at high image quality. The method uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.

SD-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation. We release SD-Turbo as a research artifact, and to study small, distilled text-to-image models.

Model Sources
For research purposes, we recommend our generative-models GitHub repository (https://github.com/Stability-AI/generative-models), which implements the most popular diffusion frameworks (both training and inference).

Evaluation
The charts above evaluate user preference for SD-Turbo over other single- and multi-step models. SD-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-Lora XL and LCM-Lora 1.5.

Uses
Direct Use
The model is intended for research purposes only. Possible research areas and tasks include
Excluded uses are described below.

Diffusers
pip install diffusers transformers accelerate --upgrade

SD-Turbo does not make use of guidance_scale or negative_prompt, so we disable it with guidance_scale=0.0. Preferably, the model generates images of size 512x512, but higher image sizes work as well.
from diffusers import AutoPipelineForText2Image
import torch

# Load the fp16 weights and move the pipeline to the GPU.
pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16")
pipe.to("cuda")

prompt = "A cinematic shot of a baby racoon wearing an intricate italian priest robe."
# A single inference step, with guidance disabled.
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
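
The text-to-image settings described in this section can be collected in a small helper. This is only a sketch: `turbo_text2image_kwargs` is a hypothetical name, not part of diffusers, and rejecting step counts outside 1 to 4 is an assumption based on the step range this card mentions. The divisible-by-8 check mirrors the size requirement of the Stable Diffusion pipelines in diffusers.

```python
def turbo_text2image_kwargs(prompt, steps=1, height=512, width=512):
    """Build keyword arguments for a text-to-image call (hypothetical helper)."""
    if not 1 <= steps <= 4:
        # Assumption: stay inside the 1 to 4 step range described for ADD.
        raise ValueError(f"steps should be between 1 and 4, got {steps}")
    if height % 8 or width % 8:
        # Stable Diffusion pipelines in diffusers expect sizes divisible by 8.
        raise ValueError(f"height and width must be divisible by 8, got {height}x{width}")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0.0,  # SD-Turbo does not use guidance, so it is disabled
        "height": height,
        "width": width,
    }


kwargs = turbo_text2image_kwargs("A cinematic shot of a baby racoon")
print(kwargs["num_inference_steps"], kwargs["guidance_scale"])  # 1 0.0
```

With a loaded pipeline, the result would be passed as `pipe(**kwargs).images[0]`.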
When using SD-Turbo for image-to-image generation, make sure that num_inference_steps * strength is greater than or equal to 1. The image-to-image pipeline will run for int(num_inference_steps * strength) steps, e.g. 0.5 * 2.0 = 1 step in our example below.

from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image
import torch

# Load the fp16 weights and move the pipeline to the GPU.
pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sd-turbo", torch_dtype=torch.float16, variant="fp16")
pipe.to("cuda")

init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
# 2 steps * 0.5 strength = 1 effective denoising step.
image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
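
The step arithmetic for image-to-image can be checked before calling the pipeline. The sketch below assumes only the rule stated above, that the pipeline runs for int(num_inference_steps * strength) steps. `effective_img2img_steps` is a hypothetical helper name, and the check that strength lies in [0, 1] reflects the range the diffusers image-to-image pipelines accept.

```python
def effective_img2img_steps(num_inference_steps, strength):
    """Return the number of denoising steps an image-to-image call would run.

    Hypothetical helper: applies int(num_inference_steps * strength) and
    raises if the result would be zero steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength should be in [0, 1], got {strength}")
    steps = int(num_inference_steps * strength)
    if steps < 1:
        raise ValueError(
            f"num_inference_steps * strength = {num_inference_steps * strength} "
            "is below 1, so no denoising step would run"
        )
    return steps


print(effective_img2img_steps(2, 0.5))  # 1, matching the example above
```

Calling `effective_img2img_steps(1, 0.5)` raises a `ValueError`, since 1 * 0.5 truncates to zero steps.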
Limitations and Bias
Limitations
Clone with HTTP
git clone https://www.modelscope.cn/tany0699/cv_diffusion_text-to-image_sd-turbo.git
@article{Sauer_Lorenz_Blattmann_Stability,
  title={Adversarial Diffusion Distillation},
  author={Sauer, Axel and Lorenz, Dominik and Blattmann, Andreas and Rombach, Robin},
  language={en-US}
}