Anonymous user, July 31, 2024

Technical information

Source repository
https://modelscope.cn/models/GeneZC/MiniMA-3B
License
Apache License 2.0

Project details

MiniMA-3B

arXiv | GitHub | HuggingFace-MiniMA | HuggingFace-MiniChat | ModelScope-MiniMA | ModelScope-MiniChat

❗ Must comply with the LICENSE of LLaMA2, since the model is derived from LLaMA2.

A language model distilled from an adapted version of LLaMA2-7B, following "Towards the Law of Capacity Gap in Distilling Language Models".

Establishing a new compute-performance Pareto frontier.


The following is an example code snippet for using MiniMA-3B:

import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

# MiniMA
tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniMA-3B", use_fast=False)
# GPU.
model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
# CPU.
# model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float16).eval()

# Few-shot prompt: three solved questions followed by one open question.
prompt = "Question: Sherrie tells the truth. Vernell says Sherrie tells the truth. Alexis says Vernell lies. Michaela says Alexis tells the truth. Elanor says Michaela tells the truth. Does Elanor tell the truth?\nAnswer: No\n\nQuestion: Kristian lies. Sherrie says Kristian lies. Delbert says Sherrie lies. Jerry says Delbert tells the truth. Shalonda says Jerry tells the truth. Does Shalonda tell the truth?\nAnswer: No\n\nQuestion: Vina tells the truth. Helene says Vina lies. Kandi says Helene tells the truth. Jamey says Kandi lies. Ka says Jamey lies. Does Ka tell the truth?\nAnswer: No\n\nQuestion: Christie tells the truth. Ka says Christie tells the truth. Delbert says Ka lies. Leda says Delbert tells the truth. Lorine says Leda tells the truth. Does Lorine tell the truth?\nAnswer:"
input_ids = tokenizer([prompt]).input_ids
output_ids = model.generate(
    # Move the prompt to the model's device so the call works for both the GPU and CPU setups.
    torch.as_tensor(input_ids).to(model.device),
    do_sample=True,
    temperature=0.7,
    max_new_tokens=1024,
)
# Drop the prompt tokens and keep only the newly generated ones.
output_ids = output_ids[0][len(input_ids[0]):]
output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
# output: "No"
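
The prompt in the snippet above is written out by hand as one long string. As a minimal sketch (not part of the original model card), the same layout can be assembled from question/answer pairs, and the generated continuation can be trimmed to its first line. The helper names `build_few_shot_prompt` and `extract_answer` and the shortened example questions are hypothetical, introduced only for illustration; the trimming step assumes that a base model sampled with a large `max_new_tokens` may keep writing further question blocks after the answer.

```python
from typing import List, Optional, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Join solved (question, answer) pairs and one open question.

    Each block has a "Question:" line followed by an "Answer:" line, blocks
    are separated by a blank line, and the last block ends with a bare
    "Answer:" for the model to complete.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(blocks)


def extract_answer(generated: str) -> Optional[str]:
    """Return the first non-empty line of the continuation, or None if empty."""
    for line in generated.splitlines():
        line = line.strip()
        if line:
            return line
    return None


# Hypothetical, shortened examples in the style of the prompt above.
examples = [
    ("Kristian lies. Sherrie says Kristian lies. Does Sherrie tell the truth?", "Yes"),
]
prompt = build_few_shot_prompt(
    examples,
    "Vina tells the truth. Helene says Vina lies. Does Helene tell the truth?",
)
```

The resulting `prompt` can be passed to the tokenizer exactly as in the snippet above, and `extract_answer(output)` can be applied to the decoded text if only the first answer line is wanted.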

Bibtex

@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}

