Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering
We introduce Glyph-ByT5-v2, a customized text encoder for accurate multilingual visual text rendering with improved aesthetics. Extending Glyph-SDXL, the multilingual version supports visual text rendering in 10 languages: English, Chinese, Japanese, Korean, French, German, Spanish, Italian, Portuguese and Russian. Combined with SDXL, the resulting Glyph-SDXL-v2 renders accurate multilingual visual text in design images.
Zeyu Liu, Weicong Liang, Yiming Zhao, Bohan Chen, Ji Li, Yuhui Yuan
Microsoft Research Asia; Tsinghua University; Peking University; University of Liverpool
Preprint
Model Sources
- Repository: [https://github.com/AIGText/Glyph-ByT5]
- Paper: [https://arxiv.org/abs/2406.10208]
- Project Page: [https://glyph-byt5-v2.github.io/]
Model Description
Please check our paper and project page for more details. Detailed usage instructions and inference code can be found in the repository listed under Model Sources.
Visualization
Quick Usage
python inference_v2.py configs/glyph_sdxl_v2_albedo.py checkpoints examples/xiaoman.json --out_folder work_dirs/xiaoman --device cuda --sampler dpm
More Configurations
We list some more useful configurations for easy usage:
Argument/Config | Place | Default | Description |
---|---|---|---|
cfg | argument | 5.0 | Classifier-free guidance scale |
sampler | argument | dpm | Sampler; supports dpm (DPM++ 2M Karras) and euler (EulerDiscreteScheduler) |
pretrained_model_name_or_path | config | stablediffusionapi/albedobase-xl-20 | Base model |
seed | annotation | None | Seed for inference |
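To make the argument-style options concrete, here is a minimal Python sketch that assembles the Quick Usage command with a different sampler and guidance scale. The positional paths, `--out_folder`, `--device` and `--sampler` come from the command above. The `--cfg` flag spelling is an assumption made by analogy with `--sampler`, and `build_inference_command` is a hypothetical helper, not part of the repository. Check the script's own help output before relying on it.

```python
import shlex

# Sampler names listed in the configuration table above.
SUPPORTED_SAMPLERS = ("dpm", "euler")


def build_inference_command(
    config="configs/glyph_sdxl_v2_albedo.py",
    checkpoint_dir="checkpoints",
    annotation="examples/xiaoman.json",
    out_folder="work_dirs/xiaoman",
    device="cuda",
    sampler="dpm",
    cfg=5.0,
):
    """Return the inference command as an argument list (hypothetical helper)."""
    if sampler not in SUPPORTED_SAMPLERS:
        raise ValueError(f"sampler must be one of {SUPPORTED_SAMPLERS}, got {sampler!r}")
    return [
        "python", "inference_v2.py", config, checkpoint_dir, annotation,
        "--out_folder", out_folder,
        "--device", device,
        "--sampler", sampler,
        # Assumption: the guidance scale is exposed as --cfg, mirroring --sampler.
        "--cfg", str(cfg),
    ]


if __name__ == "__main__":
    # Print a shell-ready command line; pass the list to subprocess.run to execute it.
    print(shlex.join(build_inference_command(sampler="euler", cfg=7.5)))
```

The base model and the seed are not command-line arguments according to the table: the former is set in the config file and the latter in the annotation file, so they are left out of this helper.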
Citation
If you find our work useful in your research, please consider citing:
@misc{liu2024glyphbyt5v2,
title={Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering},
author={Zeyu Liu and Weicong Liang and Yiming Zhao and Bohan Chen and Ji Li and Yuhui Yuan},
year={2024},
eprint={2406.10208},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
and
@misc{liu2024glyphbyt5,
title={Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering},
author={Zeyu Liu and Weicong Liang and Zhanhao Liang and Chong Luo and Ji Li and Gao Huang and Yuhui Yuan},
year={2024},
eprint={2403.09622},
archivePrefix={arXiv},
primaryClass={cs.CV}
}