PoNet Pre-trained Model - Chinese - base


Technical Information

Open-source URL
https://modelscope.cn/models/iic/nlp_ponet_fill-mask_chinese-base
License
Apache License 2.0


Introduction to the PoNet Fill-Mask Model - Chinese - base

This model uses the PoNet architecture, pre-trained on Chinese Wikipedia data with the Masked Language Modeling (MLM) and Sentence Structural Objective (SSO) pre-training tasks. It can be used for fill-mask (cloze) tasks, and can also serve as an initialization model for fine-tuning on downstream natural language understanding tasks.

Model Description

PoNet is a model with linear computational complexity (O(N)) that replaces self-attention in the Transformer with pooling networks to mix token representations. Specifically, it applies pooling at three granularities (local, segment, and global) and fuses the results to capture contextual information at different levels. Its structure is shown in the figure below.

(Figure: PoNet model architecture)
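As a rough illustration of the idea, the following PyTorch sketch mixes tokens with max-pooling at the three granularities and fuses them additively. This is a simplified approximation for illustration only, not the official PoNet implementation; the pooling type, window size (local_window), segment count (num_segments), and additive fusion are all assumptions.

import torch
import torch.nn.functional as F

def pooling_token_mixing(x, local_window=3, num_segments=4):
    """Mix tokens in O(N) via local/segment/global pooling. x: (B, L, H)."""
    batch, seq_len, hidden = x.shape

    # Global granularity: one vector summarizing the whole sequence.
    global_feat = x.max(dim=1, keepdim=True).values              # (B, 1, H)

    # Segment granularity: pool within coarse chunks, then broadcast back.
    seg_len = (seq_len + num_segments - 1) // num_segments
    x_pad = F.pad(x, (0, 0, 0, seg_len * num_segments - seq_len))
    seg_feat = x_pad.reshape(batch, num_segments, seg_len, hidden).max(dim=2).values
    seg_feat = seg_feat.repeat_interleave(seg_len, dim=1)[:, :seq_len]

    # Local granularity: sliding-window max over neighboring tokens.
    local_feat = F.max_pool1d(x.transpose(1, 2), kernel_size=local_window,
                              stride=1, padding=local_window // 2).transpose(1, 2)

    # Fuse the three granularities with the original tokens (simple sum).
    return x + local_feat + seg_feat + global_feat

x = torch.randn(2, 128, 768)       # (batch, seq_len, hidden)
mixed = pooling_token_mixing(x)    # same shape; cost is linear in seq_len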

Experiments show that PoNet outperforms the Transformer by 2.28 points in accuracy on the Long Range Arena (LRA) long-sequence benchmark, runs 9x faster on GPU, and uses only 1/10 of the GPU memory. Experiments also demonstrate PoNet's transfer-learning ability: PoNet-Base reaches 95.7% of BERT-Base's accuracy on the GLUE benchmark. For details, see the paper PoNet: Pooling Network for Efficient Token Mixing in Long Sequences.

Intended Use and Scope

This model is mainly used to generate fill-mask (cloze) completions. Users can try it on various input documents. See the code example below for the specific invocation.

How to Use

After installing ModelScope-lib, you can use the nlp_ponet_fill-mask_chinese-base model as follows.

Code Example

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Build a fill-mask pipeline backed by the PoNet checkpoint.
pipeline_ins = pipeline(Tasks.fill_mask, model='damo/nlp_ponet_fill-mask_chinese-base')
input_text = "人民文学出版社[MASK]1952[MASK],出版《[MASK][MASK]演义》、《[MASK]游记》、《水浒传》、《[MASK]楼梦》,合为“[MASK]大名著”。"
print(pipeline_ins(input_text))

Model Limitations and Possible Bias

  • The model's training data is limited, so results may exhibit some bias.
  • The current version has been tested under PyTorch 1.11 and PyTorch 1.12; other environments have not been tested.

Training Data

The data comes from https://dumps.wikimedia.org/

Training Procedure

The model is trained on unsupervised Chinese Wikipedia data via the MLM and SSO tasks.

Preprocessing

The training data is preprocessed as follows. For the MLM task, the masking probability is set to 15%: 80% of the masked positions are replaced with [MASK], 10% are replaced with randomly sampled words, and the remaining 10% are left unchanged. For the SSO task, a sequence containing multiple paragraphs is truncated into two subsequences at a random position; with probability 1/3 one subsequence is replaced by another randomly selected subsequence, with probability 1/3 the two subsequences are swapped, and with probability 1/3 they are left unchanged. These three cases are assigned three different labels for a ternary classification.
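A minimal sketch of both rules follows. MASK_ID, VOCAB_SIZE, and the label encoding are placeholders for illustration, not the model's actual vocabulary or preprocessing code.

import random

MASK_ID = 103        # placeholder id for the [MASK] token (assumption)
VOCAB_SIZE = 21128   # placeholder vocabulary size (assumption)

def apply_mlm_masking(token_ids, mask_prob=0.15):
    """Select 15% of positions; apply the 80/10/10 mask/random/keep rule."""
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < mask_prob:
            labels.append(tok)                  # predict the original token here
            r = random.random()
            if r < 0.8:
                inputs.append(MASK_ID)          # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(random.randrange(VOCAB_SIZE))  # 10%: random token
            else:
                inputs.append(tok)              # 10%: keep unchanged
        else:
            inputs.append(tok)
            labels.append(-100)                 # ignored by the MLM loss
    return inputs, labels

def make_sso_example(seq_a, seq_b, random_seq):
    """Build one SSO pair: replace, swap, or keep, each with probability 1/3."""
    r = random.random()
    if r < 1/3:
        return seq_a, random_seq, 0   # label 0: one subsequence replaced
    elif r < 2/3:
        return seq_b, seq_a, 1        # label 1: the two subsequences swapped
    else:
        return seq_a, seq_b, 2        # label 2: unchanged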

Training Details

Training on Chinese Wikipedia uses the Adam optimizer with an initial learning rate of 1e-4 and a batch size of 384.
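For illustration, a minimal PyTorch sketch of that setup; the model here is a stand-in module, and since the card does not specify a learning-rate schedule, none is shown.

import torch

model = torch.nn.Linear(768, 768)  # stand-in for the PoNet encoder (assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # initial LR 1e-4

# Training would then draw batches of 384 sequences from the Wikipedia corpus,
# e.g. torch.utils.data.DataLoader(dataset, batch_size=384, shuffle=True).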

Evaluation and Results

After fine-tuning on downstream tasks, the development-set results on CAIL and CLUE are as follows:

Dataset   CAIL   AFQMC  CMNLI  CSL    IFLYTEK  OCNLI  TNEWS  WSC
Accuracy  61.93  70.25  72.90  72.97  58.21    68.14  55.04  64.47

The development-set results on the downstream MUG tasks, Topic Segmentation and Topic-level and Session-level Extractive Summarization, are as follows:

Task                Positive F1
Topic Segmentation  0.251

Task              Ave. R1  Ave. R2  Ave. RL  Max R1  Max R2  Max RL
Session-Level ES  57.08    29.90    38.36    62.20   37.34   46.98
Topic-Level ES    52.86    35.80    46.09    66.67   54.05   63.14

More details: https://github.com/alibaba-damo-academy/SpokenNLP

Related Papers and Citation

If our model is helpful to you, please cite our paper:

@inproceedings{DBLP:journals/corr/abs-2110-02442,
  author    = {Chao{-}Hong Tan and
               Qian Chen and
               Wen Wang and
               Qinglin Zhang and
               Siqi Zheng and
               Zhen{-}Hua Ling},
  title     = {{PoNet}: Pooling Network for Efficient Token Mixing in Long Sequences},
  booktitle = {10th International Conference on Learning Representations, {ICLR} 2022,
               Virtual Event, April 25-29, 2022},
  publisher = {OpenReview.net},
  year      = {2022},
  url       = {https://openreview.net/forum?id=9jID9JjicF},
}

