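Overview
Given a main-subject image with a transparent background and a reference image, the model generates a suitable background in the transparent region based on the semantics of the reference image.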
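Model Structure
The model is based on the open-source SD (Stable Diffusion) model, with modified generation guidance conditions, and was trained on a subset of the open-source laion-5B dataset. The model structure is as follows: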
Environment Setup
Install the standalone repo package:
pip install git+https://github.com/lllcho/background_generation.git
If the network is slow, install from the mirror with the following command instead:
pip install git+https://gitee.com/lllcho/background_generation.git
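After either install command, it can help to confirm that the package is importable before building the pipeline. The following is a minimal sketch using only the Python standard library; the helper name `is_importable` is hypothetical, and the module name `background_generation` is taken from the import used in the run code.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no top-level module of that name exists.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Module name as used by the run code in this document.
    name = "background_generation"
    status = "found" if is_importable(name) else "missing, re-run pip install"
    print(f"{name}: {status}")
```

The check only confirms that the module can be located; it does not import it or verify its version.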
Running the Code
from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys
from PIL import Image
# Imported for its side effect: presumably registers the custom pipeline with modelscope.
from background_generation import modelscope_warpper

model = "damo/cv_background_generation_sd"
pipe = pipeline('background_generation_task', model=model, device='gpu', auto_collate=False, model_revision='v1.1.0')

# Main-subject image with a transparent background.
main_image = 'https://vision-poster.oss-cn-shanghai.aliyuncs.com/lllcho.lc/data/test_data/demo_example/%E5%8C%96%E5%A6%86%E5%93%81/1c33fc5e8b084269ffdb4e0557c2c3c4.png'
# Reference image whose semantics guide the generated background.
reference_image = 'https://vision-poster.oss-cn-shanghai.aliyuncs.com/lllcho.lc/data/test_data/5d873b5f64b82bcbb235748347602dce38c6ec1d.jpg'

out = pipe(main_image, reference_image, num_images_per_prompt=1)
imgs = out[OutputKeys.OUTPUT_IMGS]
imgs[0].save('result.jpg')
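The main image is expected to have a transparent background, so a quick check on a local PNG before calling the pipeline may catch a wrong input early. The sketch below is not part of the model's API: `png_has_alpha_channel` is a hypothetical helper that reads the color type from the PNG IHDR header using only the standard library. It reports whether the file declares an alpha channel, not whether any pixel is actually transparent.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_alpha_channel(header: bytes) -> bool:
    """Return True if the PNG header declares an alpha channel.

    `header` must hold at least the first 26 bytes of the file:
    8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    4-byte width, 4-byte height, 1-byte bit depth, 1-byte color type.
    """
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    color_type = header[25]
    # Color type 4 = grayscale + alpha, 6 = RGB + alpha.
    # Palette images (type 3) may carry transparency in a tRNS chunk,
    # which this simple check does not inspect.
    return color_type in (4, 6)
```

For a local file, the header can be read with `open(path, 'rb').read(26)` and passed to the helper.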
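Parameters
The pipeline call also supports the following tunable parameters:
num_inference_steps: int, default 20.
num_images_per_prompt: default 1. Number of images returned per call; adjust according to available GPU memory.
seed: int, default None. Valid range is [0, 2^32-1].
noise_level: int, default 0. Valid range is [0, 999]. Adds noise to the input image; the larger the value, the more noise is added and the less the generated result resembles the input image.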
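The documented ranges can be checked before a call so that an out-of-range value fails fast instead of reaching the model. This is a minimal sketch, not part of the pipeline: `build_call_kwargs` is a hypothetical helper, and the lower bound of 1 for `num_inference_steps` and `num_images_per_prompt` is an assumption, since the parameter list gives only their defaults.

```python
SEED_MAX = 2**32 - 1       # upper bound of the documented seed range
NOISE_LEVEL_MAX = 999      # upper bound of the documented noise_level range


def build_call_kwargs(num_inference_steps=20, num_images_per_prompt=1,
                      seed=None, noise_level=0):
    """Validate the tunable parameters and return them as a kwargs dict."""
    # Assumption: at least one step and one image are required.
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be >= 1")
    if num_images_per_prompt < 1:
        raise ValueError("num_images_per_prompt must be >= 1")
    # Documented: seed is None or an int in [0, 2^32-1].
    if seed is not None and not 0 <= seed <= SEED_MAX:
        raise ValueError(f"seed must be None or in [0, {SEED_MAX}]")
    # Documented: noise_level is an int in [0, 999].
    if not 0 <= noise_level <= NOISE_LEVEL_MAX:
        raise ValueError(f"noise_level must be in [0, {NOISE_LEVEL_MAX}]")
    return {
        "num_inference_steps": num_inference_steps,
        "num_images_per_prompt": num_images_per_prompt,
        "seed": seed,
        "noise_level": noise_level,
    }
```

The result can then be unpacked into the call, for example `pipe(main_image, reference_image, **build_call_kwargs(noise_level=500))`.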
Example call with all parameters:
from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys
from PIL import Image
# Imported for its side effect: presumably registers the custom pipeline with modelscope.
from background_generation import modelscope_warpper

model = "damo/cv_background_generation_sd"
pipe = pipeline('background_generation_task', model=model, device='gpu', auto_collate=False, model_revision='v1.1.0')

# Positional arguments: main image (transparent background), then reference image.
out = pipe('https://vision-poster.oss-cn-shanghai.aliyuncs.com/lllcho.lc/data/test_data/demo_example/%E5%8C%96%E5%A6%86%E5%93%81/1c33fc5e8b084269ffdb4e0557c2c3c4.png',
           'https://vision-poster.oss-cn-shanghai.aliyuncs.com/lllcho.lc/data/test_data/5d873b5f64b82bcbb235748347602dce38c6ec1d.jpg',
           num_inference_steps=20,
           num_images_per_prompt=2,  # two images are returned per call
           seed=None,
           noise_level=500,
           )
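With `num_images_per_prompt=2` the call returns more than one image, so each result needs its own file name. The helper below is a minimal sketch with a hypothetical name, `save_all`; it assumes only that each returned image has a `save(path)` method, as used by `imgs[0].save(...)` in the basic example.

```python
def save_all(images, prefix="result", ext="jpg"):
    """Save every image as <prefix>_<index>.<ext> and return the paths."""
    paths = []
    for index, image in enumerate(images):
        path = f"{prefix}_{index}.{ext}"
        # Assumption: each item exposes save(path), like the images
        # saved in the basic example above.
        image.save(path)
        paths.append(path)
    return paths
```

After the call above, `save_all(out[OutputKeys.OUTPUT_IMGS])` would write `result_0.jpg` and `result_1.jpg`.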