Model Description

This is the official implementation of the layout generation method LayoutDM.

Top: LayoutDM is trained to gradually generate a complete layout from a blank state in discrete state space. Bottom: During sampling, we can steer LayoutDM to perform various conditional generation tasks without additional training or external models.
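To make "discrete state space" and "blank state" more concrete, the sketch below encodes each layout element as five integer tokens (category, x, y, w, h) and represents a blank layout as a sequence of mask tokens. Fixing some tokens while leaving the rest masked is one way to picture how sampling can be steered without retraining. The uniform binning, the bin count of 32, the `MASK` value, and all helper names are illustrative assumptions, not the tokenizer shipped with this model.

```python
NUM_BINS = 32  # assumed number of bins per coordinate (illustrative only)
MASK = -1      # assumed placeholder for a token that is not generated yet


def quantize(value, num_bins=NUM_BINS):
    """Map a normalized coordinate in [0, 1] to an integer bin index."""
    value = min(max(value, 0.0), 1.0)
    return min(int(value * num_bins), num_bins - 1)


def encode_element(category_id, box):
    """Encode one element as [category, x, y, w, h] integer tokens."""
    return [category_id] + [quantize(v) for v in box]


def blank_layout(max_elements):
    """A fully masked layout: the starting point of generation."""
    return [[MASK] * 5 for _ in range(max_elements)]


def condition_on_categories(category_ids, max_elements):
    """Fix the category tokens and leave position and size masked."""
    layout = blank_layout(max_elements)
    for i, category_id in enumerate(category_ids):
        layout[i][0] = category_id
    return layout


print(encode_element(1, (0.5, 0.25, 1.0, 0.0)))
print(condition_on_categories([0, 4], max_elements=3))
```

In this picture, unconditional generation starts from `blank_layout(...)`, while a conditional task starts from a partially filled sequence and only the masked positions remain to be generated.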

Project page
Paper

Operating Environment

pip install modelscope https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_sparse-0.6.17%2Bpt20cu118-cp38-cp38-linux_x86_64.whl prdc==0.2 pytorch-fid==0.2.1
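The `torch_sparse` wheel in the command above is a prebuilt binary, and its filename encodes the interpreter and platform it targets. As a rough pre-install check, the sketch below splits the filename into its tags and compares them with the running interpreter. The helper names are hypothetical, and the comparison assumes CPython and a simple `<os>_<machine>` platform tag, so treat a mismatch as a hint rather than a verdict.

```python
import platform
import sys
from urllib.parse import unquote

WHEEL_URL = (
    "https://data.pyg.org/whl/torch-2.0.0%2Bcu118/"
    "torch_sparse-0.6.17%2Bpt20cu118-cp38-cp38-linux_x86_64.whl"
)


def wheel_tags(url_or_filename):
    """Split a wheel filename into name, version, and compatibility tags."""
    filename = unquote(url_or_filename.rsplit("/", 1)[-1])
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # an optional build tag adds a sixth field
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_current_interpreter(tags):
    """Heuristic check (assumes CPython) against the running interpreter."""
    expected_python = "cp{}{}".format(*sys.version_info[:2])
    expected_platform = f"{sys.platform}_{platform.machine()}"
    return tags["python"] == expected_python and tags["platform"] == expected_platform


tags = wheel_tags(WHEEL_URL)
print(tags)
print("matches this interpreter:", matches_current_interpreter(tags))
```

For this wheel the tags read as CPython 3.8 on 64-bit Linux, built against PyTorch 2.0 with CUDA 11.8 (`pt20cu118`).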

Scope and Application Scenarios

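Background-independent layout generation that places multiple elements on a blank canvas, for example UI design, presentation slides, and magazine article layouts.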

How to Use

Code Example

from modelscope.pipelines import pipeline

# run pipeline
inputs = {
    'n_samples': 4,  # number of layouts to generate
}
inference = pipeline('layout-generation', model='chenhyer/LayoutDM_layout_generation', model_revision='v1.6.0')
pred = inference(inputs)


# visualization for PubLayNet model
from trainer.helpers.visualization import save_image
import seaborn as sns

SIZE = (360, 240)
labels = [
    "text",
    "title",
    "list",
    "table",
    "figure",
]
def colors(num_classes):
    # One RGB color (integers in 0-255) per class, taken from the seaborn "husl" palette.
    palette = sns.color_palette("husl", n_colors=num_classes)
    return [tuple(int(x * 255) for x in c) for c in palette]

save_kwargs = {
    "colors": colors(num_classes=len(labels)), "names": labels,
    "canvas_size": SIZE, "use_grid": True,
    "draw_label": True,  # Whether to display the category name of each box in the resulting image, such as text, table, etc
}
save_kwargs['out_path'] = 'pred_ucond_ms.png'  # visualize unconditional generation result
save_image(pred["bbox"], pred["label"], pred["mask"], **save_kwargs)
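Besides saving an image, it can help to read the predicted boxes as pixel rectangles. The sketch below does this for a single layout under several assumptions that the example above does not confirm: each box is stored as normalized (center x, center y, width, height), `mask` marks the valid elements, and the canvas is 240 pixels wide by 360 pixels high. The function name and the sample values are hypothetical, and plain Python lists stand in for whatever array type the pipeline returns.

```python
def valid_boxes_in_pixels(bbox, label, mask, names, width, height):
    """Return (class_name, (x0, y0, x1, y1)) for each valid element of one layout.

    Assumes boxes are normalized (xc, yc, w, h); this format is an assumption.
    """
    result = []
    for (xc, yc, w, h), class_id, keep in zip(bbox, label, mask):
        if not keep:  # skip padded or unused slots
            continue
        x0 = round((xc - w / 2) * width)
        y0 = round((yc - h / 2) * height)
        x1 = round((xc + w / 2) * width)
        y1 = round((yc + h / 2) * height)
        result.append((names[class_id], (x0, y0, x1, y1)))
    return result


# Hypothetical single-layout data, not real model output.
names = ["text", "title", "list", "table", "figure"]
sample_bbox = [[0.5, 0.5, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
sample_label = [0, 4]
sample_mask = [True, False]

print(valid_boxes_in_pixels(sample_bbox, sample_label, sample_mask, names, width=240, height=360))
```

If the pipeline returns batched tensors, the same function could be applied per sample after converting each sample to nested lists, for example with `.tolist()`.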

Model Limitations and Possible Biases

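The model uses coordinate information only and no image features of the background, so it cannot use the background to generate layouts that avoid covering the main subject.
It cannot predict attributes such as text color.
Both inputs and outputs are boxes, so if a long, thin product lies along the image diagonal, the model may be unable to perceive the mask region of the product itself.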
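The last limitation can be illustrated with simple geometry: the axis-aligned bounding box of a thin object lying along a diagonal is mostly empty space, so a box alone says little about where the object actually is. The sketch below computes the fraction of the bounding box that a rotated rectangle covers; the function name and the sample dimensions are illustrative assumptions.

```python
import math


def bbox_fill_ratio(length, thickness, angle_deg):
    """Fraction of the axis-aligned bounding box covered by a rotated rectangle."""
    angle = math.radians(angle_deg)
    bbox_w = length * abs(math.cos(angle)) + thickness * abs(math.sin(angle))
    bbox_h = length * abs(math.sin(angle)) + thickness * abs(math.cos(angle))
    return (length * thickness) / (bbox_w * bbox_h)


# A 100 x 10 object: axis-aligned versus rotated by 45 degrees.
print(round(bbox_fill_ratio(100, 10, 0), 3))
print(round(bbox_fill_ratio(100, 10, 45), 3))
```

For the 100 x 10 example, the object fills its bounding box completely when axis-aligned but only about 17% of it when rotated by 45 degrees.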

Model Results

Unconditional Generation Result of PubLayNet

Citation

If you find this work helpful for your research, please consider citing the following BibTeX entry.

@inproceedings{inoue2023layout,
  title={LayoutDM: Discrete Diffusion Model for Controllable Layout Generation},
  author={Naoto Inoue and Kotaro Kikuchi and Edgar Simo-Serra and Mayu Otani and Kota Yamaguchi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023},
  pages={10167-10176},
}
