Model Description
This is the official implementation of LayoutDM, a layout generation method.
Operating Environment
pip install modelscope https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_sparse-0.6.17%2Bpt20cu118-cp38-cp38-linux_x86_64.whl prdc==0.2 pytorch-fid==0.2.1
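The wheel URL in the install command above is tagged for PyTorch 2.0.0 with CUDA 11.8, CPython 3.8 (`cp38`), and `linux_x86_64`. The following is a minimal sketch for checking an interpreter against those tags before installing. The helper name `wheel_matches` is hypothetical and not part of ModelScope or LayoutDM; it only compares against the tags read off the wheel filename.

```python
import platform
import sys

# Tags read off the torch_sparse wheel filename in the install command.
REQUIRED_PYTHON = (3, 8)      # cp38
REQUIRED_SYSTEM = "Linux"     # linux_x86_64
REQUIRED_MACHINE = "x86_64"   # linux_x86_64


def wheel_matches(version_info, system, machine):
    """Return True if the given interpreter tags match the pinned wheel.

    Hypothetical helper for illustration only.
    """
    return (
        tuple(version_info[:2]) == REQUIRED_PYTHON
        and system == REQUIRED_SYSTEM
        and machine == REQUIRED_MACHINE
    )


if __name__ == "__main__":
    ok = wheel_matches(sys.version_info, platform.system(), platform.machine())
    print("wheel tags match" if ok else "wheel tags do not match this interpreter")
```

If the check fails, the pinned wheel will most likely not install as written, and a `torch_sparse` build matching the local Python, PyTorch, and CUDA versions would be needed instead.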
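Scope and Application Scenarios
Background-agnostic generation of multi-element layouts from a blank canvas, for example UI screens, presentation slides, and magazine article layouts.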
How to Use
Code Example
from modelscope.pipelines import pipeline
# run pipeline
input = {
    'n_samples': 4,  # number of layouts to generate
}
inference = pipeline('layout-generation', model='chenhyer/LayoutDM_layout_generation', model_revision='v1.6.0')
pred = inference(input)
# visualization for PubLayNet model
from trainer.helpers.visualization import save_image
import seaborn as sns
SIZE = (360, 240)
labels = [
    "text",
    "title",
    "list",
    "table",
    "figure",
]
def colors(num_classes):
    # Build one RGB color (0-255 integers) per class from the seaborn "husl" palette.
    palette = sns.color_palette("husl", n_colors=num_classes)
    return [tuple(int(x * 255) for x in c) for c in palette]
save_kwargs = {
    "colors": colors(num_classes=len(labels)),
    "names": labels,
    "canvas_size": SIZE,
    "use_grid": True,
    "draw_label": True,  # whether to draw each box's category name (text, table, etc.) on the image
}
save_kwargs['out_path'] = 'pred_ucond_ms.png' # visualize unconditional generation result
save_image(pred["bbox"], pred["label"], pred["mask"], **save_kwargs)
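To inspect a generated layout without rendering an image, the boxes can be converted to pixel coordinates. The sketch below rests on several assumptions that the example above does not confirm: that each box is stored as normalized center-format `(xc, yc, w, h)` in `[0, 1]`, that `mask` marks valid (non-padding) elements, that labels are integer indices into `labels`, and that `canvas_size` is `(height, width)`. The function `to_pixel_boxes` and the example values are hypothetical, and the sketch works on plain Python lists for a single layout.

```python
def to_pixel_boxes(bboxes, label_ids, mask, names, canvas_size):
    """Convert one layout to (name, left, top, right, bottom) tuples in pixels.

    Assumes normalized center-format boxes (xc, yc, w, h) and
    canvas_size = (height, width). Hypothetical helper for illustration.
    """
    height, width = canvas_size
    result = []
    for (xc, yc, w, h), label_id, valid in zip(bboxes, label_ids, mask):
        if not valid:  # skip padding elements
            continue
        left = (xc - w / 2) * width
        top = (yc - h / 2) * height
        right = (xc + w / 2) * width
        bottom = (yc + h / 2) * height
        result.append(
            (names[label_id], round(left), round(top), round(right), round(bottom))
        )
    return result


# Hypothetical single layout with two valid elements and one padding slot.
example_bbox = [[0.5, 0.25, 0.5, 0.25], [0.5, 0.75, 0.5, 0.25], [0.0, 0.0, 0.0, 0.0]]
example_label = [1, 0, 0]
example_mask = [True, True, False]
class_names = ["text", "title", "list", "table", "figure"]

boxes = to_pixel_boxes(example_bbox, example_label, example_mask, class_names, (360, 240))
for box in boxes:
    print(box)
```

If the pipeline returns tensors or arrays rather than lists, converting one sample with `.tolist()` first should make it compatible with this helper, again assuming the layout described above.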
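Model Limitations and Possible Biases
- The model uses only coordinate information and no background image features, so it cannot use the background to generate layouts that avoid occluding the main subject.
- It cannot predict attributes such as text color.
- Both inputs and outputs are boxes. If a product is long and thin and lies along the image diagonal, the model may fail to perceive the mask region of the product itself.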
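The last limitation can be made concrete with a small geometric sketch. It models an elongated object as a rotated rectangle (an assumption chosen for illustration, not something the model computes) and measures how much of its axis-aligned bounding box the object actually covers. The function name `box_fill_ratio` is hypothetical.

```python
import math


def box_fill_ratio(length, thickness, angle_deg):
    """Fraction of the axis-aligned bounding box covered by a rotated rectangle."""
    theta = math.radians(angle_deg)
    # Width and height of the axis-aligned box enclosing the rotated rectangle.
    box_w = length * abs(math.cos(theta)) + thickness * abs(math.sin(theta))
    box_h = length * abs(math.sin(theta)) + thickness * abs(math.cos(theta))
    return (length * thickness) / (box_w * box_h)


axis_aligned = box_fill_ratio(100, 10, 0)   # object fills its box completely
diagonal = box_fill_ratio(100, 10, 45)      # roughly 0.17: most of the box is empty
print(f"axis-aligned: {axis_aligned:.2f}, diagonal: {diagonal:.2f}")
```

For a 100 by 10 object placed at 45 degrees, only about a sixth of the bounding box belongs to the object, so a box-only representation carries little information about where the product actually is.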
Model Results
Citation
If you find this work helpful for your research, please consider citing the following BibTeX entry.
@inproceedings{inoue2023layout,
title={LayoutDM: Discrete Diffusion Model for Controllable Layout Generation},
author={Naoto Inoue and Kotaro Kikuchi and Edgar Simo-Serra and Mayu Otani and Kota Yamaguchi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2023},
pages={10167-10176},
}