PEG: Towards Robust Text Retrieval with Progressive Learning

Model Details

We propose the PEG model (a Progressively Learned Textual Embedding), which progressively adjusts the weights of samples contributing to the loss within an extremely large batch, based on the difficulty levels of negative samples. We have amassed an extensive collection of over 110 million data samples, spanning a wide range of fields such as general knowledge, finance, tourism, medicine, and more. Our technical report is available at Paper.
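The exact weighting schedule is given in the technical report; as a rough illustration of the idea only, the sketch below reweights in-batch negatives by a difficulty score derived from their similarity to the query. The function name, schedule, and temperature are illustrative assumptions, not the released training code.

import torch
import torch.nn.functional as F

def progressive_contrastive_loss(query_emb, doc_emb, temperature=0.05, progress=0.5):
    # query_emb, doc_emb: (batch, dim); row i of doc_emb is the positive for query i.
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    sim = q @ d.T / temperature                  # in-batch similarity logits
    labels = torch.arange(sim.size(0), device=sim.device)

    # Treat a negative's similarity to the query as its difficulty; as training
    # progresses (progress goes 0 -> 1), harder negatives receive larger weights.
    # This schedule is a hypothetical stand-in for the one in the paper.
    with torch.no_grad():
        difficulty = sim.softmax(dim=-1)
        weights = 1.0 + progress * difficulty
        weights.fill_diagonal_(1.0)              # the positive keeps weight 1

    # Weighted InfoNCE: adding log-weights to the logits multiplies each
    # negative's exp-term by its weight inside the softmax denominator.
    return F.cross_entropy(sim + weights.log(), labels)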
Usage
Install modelscope:

pip install modelscope
Then load the model and predict:

from modelscope import AutoModel, AutoTokenizer
import torch

# Load the model and tokenizer from the ModelScope Hub
tokenizer = AutoTokenizer.from_pretrained('TownsWu/PEG')
model = AutoModel.from_pretrained('TownsWu/PEG')

# Example queries in Chinese ("How to change the bank card linked to Huabei" /
# "Change the bank card linked to Huabei")
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']

# Tokenize sentences
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings and take the [CLS] token as the sentence embedding
with torch.no_grad():
    last_hidden_state = model(**inputs, return_dict=True).last_hidden_state
    embeddings = last_hidden_state[:, 0]
print("embeddings:")
print(embeddings)
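The two example sentences are near-paraphrases, so their embeddings should score highly under cosine similarity, the usual comparison for retrieval embeddings. A minimal check, assuming the snippet above has just run:

import torch.nn.functional as F

# Cosine similarity between the two sentence embeddings; paraphrases
# should score close to 1.0
score = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"cosine similarity: {score.item():.4f}")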
Contact

If you have any question or suggestion related to this project, feel free to open an issue or pull request. You can also email Tong Wu (townswu@tencent.com).
Citation

If you find our work helpful for your research, please consider citing the following BibTeX entry:
@article{wu2023towards,
title={Towards Robust Text Retrieval with Progressive Learning},
author={Wu, Tong and Qin, Yulei and Zhang, Enwei and Xu, Zihan and Gao, Yuting and Li, Ke and Sun, Xing},
journal={arXiv preprint arXiv:2311.11691},
year={2023}
}