LayoutDM Layout Generation (General Domain)

Anonymous user, July 31, 2024
Categories: ai, layoutDM, pytorch, layout generation, cv
Source: https://modelscope.cn/models/chenhyer/LayoutDM_layout_generation
License: Apache License 2.0

Project Details

Model Description

This is the official implementation of LayoutDM, a layout generation method.


Top: LayoutDM is trained to gradually generate a complete layout from a blank state in discrete state space. Bottom: During sampling, we can steer LayoutDM to perform various conditional generation tasks without additional training or external models.
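
To make "discrete state space" more concrete, the sketch below shows one simple way a continuous box coordinate in [0, 1] could be mapped to an integer token and back. It is only an illustration: uniform binning, the bin count of 32, and the helper names `quantize` and `dequantize` are assumptions made here and are not confirmed by this page.

```python
# Illustrative sketch only: uniform binning with a hypothetical bin count.
NUM_BINS = 32  # assumed value, not taken from this page


def quantize(value, num_bins=NUM_BINS):
    """Map a normalized coordinate in [0, 1] to an integer bin index."""
    value = min(max(value, 0.0), 1.0)  # clamp out-of-range inputs
    return min(int(value * num_bins), num_bins - 1)


def dequantize(index, num_bins=NUM_BINS):
    """Map a bin index back to the center of its bin."""
    return (index + 0.5) / num_bins


# A box given as (x_center, y_center, width, height), all normalized.
box = (0.5, 0.3, 0.25, 0.1)
tokens = [quantize(v) for v in box]
restored = [dequantize(t) for t in tokens]
```

Under this scheme every element becomes a short sequence of integers (its label plus four geometry tokens), which is the kind of representation a discrete diffusion model can fill in step by step from a blank state.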

Operating Environment

pip install modelscope https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_sparse-0.6.17%2Bpt20cu118-cp38-cp38-linux_x86_64.whl prdc==0.2 pytorch-fid==0.2.1
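
The `torch_sparse` wheel in the command above is tagged `cp38`, `pt20cu118`, and `linux_x86_64`, which suggests it targets CPython 3.8 on 64-bit Linux with a PyTorch 2.0 / CUDA 11.8 build. A minimal pre-flight check, with `wheel_compatible` being a hypothetical helper written for this page, might look like this:

```python
import platform
import sys


def wheel_compatible(py_version, system, machine):
    """Return True if the interpreter matches the cp38 / linux_x86_64 wheel tags.

    This only checks the Python version and platform; it does not verify
    the installed PyTorch or CUDA version.
    """
    return (
        tuple(py_version[:2]) == (3, 8)
        and system == "Linux"
        and machine == "x86_64"
    )


if __name__ == "__main__":
    ok = wheel_compatible(sys.version_info, platform.system(), platform.machine())
    print("wheel tags match this interpreter" if ok else "wheel tags do not match")
```

If the check fails, a wheel built for the local Python and PyTorch combination would be needed instead of the one pinned in the command.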

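Scope and Application Scenarios

Background-agnostic layout generation: arranging multiple kinds of elements on a blank canvas, for example UI screens, presentation slides, and magazine article layouts.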

How to Use

Code Example

from modelscope.pipelines import pipeline

# Run the pipeline
inputs = {
    'n_samples': 4,  # number of layouts to generate
}
inference = pipeline('layout-generation', model='chenhyer/LayoutDM_layout_generation', model_revision='v1.6.0')
pred = inference(inputs)


# Visualization for the PubLayNet model
# save_image comes from the LayoutDM code base (the `trainer` package must be importable)
from trainer.helpers.visualization import save_image
import seaborn as sns

SIZE = (360, 240)
labels = [
    "text",
    "title",
    "list",
    "table",
    "figure",
]


def make_colors(num_classes):
    # One RGB color per class, converted from floats in [0, 1] to ints in [0, 255]
    palette = sns.color_palette("husl", n_colors=num_classes)
    return [tuple(int(x * 255) for x in c) for c in palette]


save_kwargs = {
    "colors": make_colors(num_classes=len(labels)), "names": labels,
    "canvas_size": SIZE, "use_grid": True,
    "draw_label": True,  # draw each box's category name (text, table, ...) on the image
}
save_kwargs['out_path'] = 'pred_ucond_ms.png'  # output file for the unconditional generation result
save_image(pred["bbox"], pred["label"], pred["mask"], **save_kwargs)
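
For post-processing outside of `save_image`, it can help to turn one generated layout into pixel-space rectangles. The sketch below is a guess at how that could be done, not the pipeline's documented behavior. It assumes each box is a normalized `(x_center, y_center, width, height)` tuple, that the mask marks valid elements, and that the canvas size is given as `(height, width)`. None of these conventions is stated on this page, and `to_pixel_boxes` is a hypothetical helper.

```python
def to_pixel_boxes(bboxes, mask, canvas_size):
    """Convert normalized center-format boxes to pixel corner boxes.

    Assumptions (not confirmed by this page):
      - bboxes: iterable of (x_center, y_center, width, height) in [0, 1]
      - mask:   iterable of booleans, True for valid (non-padding) elements
      - canvas_size: (height, width) in pixels
    Returns a list of (x1, y1, x2, y2) integer tuples for valid elements.
    """
    height, width = canvas_size
    boxes = []
    for (xc, yc, w, h), keep in zip(bboxes, mask):
        if not keep:
            continue  # skip padding slots
        x1 = max(0.0, (xc - w / 2) * width)
        y1 = max(0.0, (yc - h / 2) * height)
        x2 = min(float(width), (xc + w / 2) * width)
        y2 = min(float(height), (yc + h / 2) * height)
        boxes.append((round(x1), round(y1), round(x2), round(y2)))
    return boxes


# Example with hand-written values (not real model output).
example = to_pixel_boxes(
    [(0.5, 0.5, 0.5, 0.5), (0.0, 0.0, 0.0, 0.0)],
    [True, False],
    (360, 240),
)
```

If the pipeline returns batched tensors, each sample would first need to be converted to plain Python lists (for example with `.tolist()`) before being passed to a helper like this.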

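Model Limitations and Possible Biases

  • The model uses only coordinate information and no image features of the background, so it cannot use the background to generate layouts that avoid occluding the main subject.
  • It cannot predict attributes such as text color.
  • Both inputs and outputs are boxes. If a product is long and thin and lies along the image diagonal, the model may be unable to perceive the region actually covered by the product's mask.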
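
The last limitation can be quantified with a little geometry. For a thin rectangle of length L and thickness t rotated by an angle θ, the axis-aligned bounding box has width L|cos θ| + t|sin θ| and height L|sin θ| + t|cos θ|, so the share of the box that the object actually covers drops quickly as it tilts. The helper below, `box_fill_ratio`, is a hypothetical illustration and not part of LayoutDM.

```python
import math


def box_fill_ratio(length, thickness, angle_deg):
    """Fraction of the axis-aligned bounding box covered by a rotated rectangle."""
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    bbox_width = length * c + thickness * s
    bbox_height = length * s + thickness * c
    return (length * thickness) / (bbox_width * bbox_height)


upright = box_fill_ratio(100, 10, 0)    # object fills its box completely
diagonal = box_fill_ratio(100, 10, 45)  # about 0.165: most of the box is background
```

For a 100 x 10 object at 45 degrees, roughly five sixths of the bounding box is background, which is why a box-only representation says little about where such an object really is.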

Model Results


Unconditional Generation Result of PubLayNet

Citation

If you find this work helpful for your research, please consider citing the following BibTeX entry.

@inproceedings{inoue2023layout,
  title={LayoutDM: Discrete Diffusion Model for Controllable Layout Generation},
  author={Naoto Inoue and Kotaro Kikuchi and Edgar Simo-Serra and Mayu Otani and Kota Yamaguchi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023},
  pages={10167-10176},
}