glm-10b-chinese

Anonymous user · July 31, 2024

Technical information

Official website
https://www.zhipu.ai
Open-source repository
https://modelscope.cn/models/ZhipuAI/glm-10b-chinese

Project details

GLM is a General Language Model pretrained with an autoregressive blank-filling objective and can be finetuned on various natural language understanding and generation tasks.

Please refer to our paper for a detailed description of GLM:

GLM: General Language Model Pretraining with Autoregressive Blank Infilling (ACL 2022)

Zhengxiao Du*, Yujie Qian*, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang (*: equal contribution)

Find more examples in our Github repo.

Model description

glm-10b-chinese is pretrained on the WuDaoCorpora dataset. It has 48 transformer layers, with hidden size 4096 and 64 attention heads in each layer. The model is pretrained with autoregressive blank-filling objectives designed for natural language understanding, seq2seq, and language modeling.
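A rough back-of-envelope check of how those figures add up to "10B" parameters (this is an illustrative estimate, not from the card: it assumes a standard transformer block with QKV-plus-output attention projections of about 4·h² and a feed-forward network with 4× expansion of about 8·h² per layer; biases, layer norms, and the token embeddings supply the remainder):

```python
# Rough parameter-count estimate from the architecture stated above:
# 48 layers, hidden size 4096 (the 64 heads split the hidden dimension
# and do not change the total).
layers, hidden = 48, 4096
attention = 4 * hidden * hidden        # Q, K, V and output projections
ffn = 2 * hidden * (4 * hidden)        # two linear maps with 4x expansion
per_layer = attention + ffn            # ~= 12 * hidden^2
total = layers * per_layer
print(f"~{total / 1e9:.2f}B parameters before embeddings")  # ~9.66B
```

Adding embedding and layer-norm parameters brings this to roughly the advertised 10B.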

How to use

from modelscope import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("ZhipuAI/glm-10b-chinese", trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained("ZhipuAI/glm-10b-chinese", trust_remote_code=True)
model = model.half().cuda()

inputs = tokenizer("凯旋门位于意大利米兰市古城堡旁。1807年为纪念[MASK]而建,门高25米,顶上矗立两武士青铜古兵车铸像。", return_tensors="pt")
inputs = tokenizer.build_inputs_for_generation(inputs, max_gen_length=512)
inputs = {key: value.cuda() for key, value in inputs.items()}
outputs = model.generate(**inputs, max_length=512, eos_token_id=tokenizer.eop_token_id)
print(tokenizer.decode(outputs[0].tolist()))
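Loading the model with `model.half()` as in the snippet stores weights in float16, which roughly halves GPU memory versus float32. A quick estimate of the footprint for the weights alone (an approximation I am adding, not a figure from the card; activations and the KV cache need additional memory on top):

```python
# Back-of-envelope GPU memory for the weights of a ~10B-parameter model
# loaded in half precision (2 bytes per parameter).
params = 10e9
bytes_per_param = 2  # float16
gib = params * bytes_per_param / 2**30
print(f"~{gib:.1f} GiB for weights alone")  # ~18.6 GiB
```

So a GPU with well over 20 GiB of memory is advisable for inference at this precision.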

We use three different mask tokens for different tasks: [MASK] for short blank filling, [sMASK] for sentence filling, and [gMASK] for left-to-right generation. You can find examples about different masks from here.
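To make the three mask tokens concrete, here are illustrative prompt strings, one per token (the prompts themselves are examples I made up, not from the card; each would be passed to the tokenizer exactly like the prompt in the snippet above):

```python
# Example prompts for GLM's three mask tokens. The token placement drives
# what the model generates: the answer to a short blank, a full sentence,
# or an open-ended continuation.
short_blank = "中国的首都是[MASK]。"                    # [MASK]: short blank filling
sentence_fill = "今天天气很好。[sMASK]我们去了公园。"     # [sMASK]: sentence filling
free_generation = "人工智能的未来[gMASK]"               # [gMASK]: left-to-right generation

for prompt in (short_blank, sentence_fill, free_generation):
    print(prompt)
```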

Citation

Please cite our paper if you find this code useful for your research:

@inproceedings{DBLP:conf/acl/DuQLDQY022,
  author    = {Zhengxiao Du and
               Yujie Qian and
               Xiao Liu and
               Ming Ding and
               Jiezhong Qiu and
               Zhilin Yang and
               Jie Tang},
  title     = {{GLM:} General Language Model Pretraining with Autoregressive Blank Infilling},
  booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational
               Linguistics (Volume 1: Long Papers), {ACL} 2022, Dublin, Ireland,
               May 22-27, 2022},
  pages     = {320--335},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
}

