MiniMA-3B: open-source AI project (程序员客栈)

Open-source repository
https://modelscope.cn/models/GeneZC/MiniMA-3B
License
Apache License 2.0

MiniMA-3B

❗ Must comply with the LICENSE of LLaMA2, since it is derived from LLaMA2.

A language model distilled from an adapted version of LLaMA2-7B, following "Towards the Law of Capacity Gap in Distilling Language Models".
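This page does not describe the training objective, so the following is only a generic illustration of what "distilled" usually means: the student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the toy logits are all assumptions made for illustration, and the paper's actual recipe may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token logits over a 4-token vocabulary (not real model outputs).
teacher = [2.0, 1.0, 0.1, -1.0]
student = [1.5, 1.2, 0.0, -0.5]
loss = distillation_loss(teacher, student)
print(f"distillation loss: {loss:.4f}")
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.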

Establishing a new compute-performance Pareto frontier.


The following is an example code snippet for using MiniMA-3B:

import torch

from transformers import AutoModelForCausalLM, AutoTokenizer

# MiniMA
tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniMA-3B", use_fast=False)
# GPU.
model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
# CPU.
# model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float16).eval()

# Few-shot prompt: three solved examples followed by one open question.
prompt = "Question: Sherrie tells the truth. Vernell says Sherrie tells the truth. Alexis says Vernell lies. Michaela says Alexis tells the truth. Elanor says Michaela tells the truth. Does Elanor tell the truth?\nAnswer: No\n\nQuestion: Kristian lies. Sherrie says Kristian lies. Delbert says Sherrie lies. Jerry says Delbert tells the truth. Shalonda says Jerry tells the truth. Does Shalonda tell the truth?\nAnswer: No\n\nQuestion: Vina tells the truth. Helene says Vina lies. Kandi says Helene tells the truth. Jamey says Kandi lies. Ka says Jamey lies. Does Ka tell the truth?\nAnswer: No\n\nQuestion: Christie tells the truth. Ka says Christie tells the truth. Delbert says Ka lies. Leda says Delbert tells the truth. Lorine says Leda tells the truth. Does Lorine tell the truth?\nAnswer:"
input_ids = tokenizer([prompt]).input_ids
output_ids = model.generate(
    # Move the inputs to the model's device so the same call works on GPU and CPU.
    torch.as_tensor(input_ids).to(model.device),
    do_sample=True,
    temperature=0.7,
    max_new_tokens=1024,
)
# Drop the prompt tokens and keep only the newly generated ones.
output_ids = output_ids[0][len(input_ids[0]):]
output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
# output: "No" (sampling is enabled, so the exact text may vary between runs)
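
The prompt in the snippet follows a simple few-shot layout: solved "Question / Answer" pairs separated by blank lines, then one unanswered question. As a minimal sketch of how such a prompt could be assembled, and how the prompt tokens are sliced off the generated sequence, the helpers below may be useful. The names `build_few_shot_prompt` and `strip_prompt_tokens`, and the toy examples, are hypothetical and not part of the original model card.

```python
def build_few_shot_prompt(examples, question):
    """Join solved (question, answer) pairs and one open question."""
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def strip_prompt_tokens(output_ids, prompt_ids):
    """Keep only the token ids generated after the prompt."""
    return output_ids[len(prompt_ids):]


# Hypothetical toy examples, used only to show the layout.
examples = [("Is 2 even?", "Yes"), ("Is 3 even?", "No")]
prompt = build_few_shot_prompt(examples, "Is 10 even?")
print(prompt)
```

Ending the prompt with a bare "Answer:" is what invites the model to continue with the answer to the final question.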

Bibtex

@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}
