arXiv | GitHub | HuggingFace-MiniMA | HuggingFace-MiniChat | ModelScope-MiniMA | ModelScope-MiniChat

❗ Must comply with LICENSE of LLaMA2 since it is derived from LLaMA2.

A language model distilled from an adapted version of LLaMA2-7B following "Towards the Law of Capacity Gap in Distilling Language Models", establishing a new compute-performance Pareto frontier.

The following is an example code snippet to use MiniMA-3B:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# MiniMA
tokenizer = AutoTokenizer.from_pretrained("GeneZC/MiniMA-3B", use_fast=False)
# GPU.
model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="auto", torch_dtype=torch.float16).eval()
# CPU.
# model = AutoModelForCausalLM.from_pretrained("GeneZC/MiniMA-3B", use_cache=True, device_map="cpu", torch_dtype=torch.float16).eval()
prompt = "Question: Sherrie tells the truth. Vernell says Sherrie tells the truth. Alexis says Vernell lies. Michaela says Alexis tells the truth. Elanor says Michaela tells the truth. Does Elanor tell the truth?\nAnswer: No\n\nQuestion: Kristian lies. Sherrie says Kristian lies. Delbert says Sherrie lies. Jerry says Delbert tells the truth. Shalonda says Jerry tells the truth. Does Shalonda tell the truth?\nAnswer: No\n\nQuestion: Vina tells the truth. Helene says Vina lies. Kandi says Helene tells the truth. Jamey says Kandi lies. Ka says Jamey lies. Does Ka tell the truth?\nAnswer: No\n\nQuestion: Christie tells the truth. Ka says Christie tells the truth. Delbert says Ka lies. Leda says Delbert tells the truth. Lorine says Leda tells the truth. Does Lorine tell the truth?\nAnswer:"
input_ids = tokenizer([prompt]).input_ids
output_ids = model.generate(
    torch.as_tensor(input_ids).cuda(),
    do_sample=True,
    temperature=0.7,
    max_new_tokens=1024,
)
output_ids = output_ids[0][len(input_ids[0]):]
output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
# output: "No"
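As an aside, the few-shot prompt above is a plain string whose examples are joined with literal `\n` separators; it can also be assembled programmatically. The sketch below shows one way to do this; `build_prompt` and its inputs are hypothetical illustrations, not part of the MiniMA repository.

```python
# Sketch: assemble a few-shot "web of lies"-style prompt like the one above.
# build_prompt is a hypothetical helper, not part of the MiniMA repository.

def build_prompt(solved, query):
    """Join solved (question, answer) pairs, then append the open question."""
    shots = "\n\n".join(f"Question: {q}\nAnswer: {a}" for q, a in solved)
    return f"{shots}\n\nQuestion: {query}\nAnswer:"

solved = [
    ("Kristian lies. Sherrie says Kristian lies. "
     "Does Sherrie tell the truth?", "Yes"),
]
prompt = build_prompt(
    solved,
    "Christie tells the truth. Delbert says Christie lies. "
    "Does Delbert tell the truth?",
)
print(prompt)
```

The resulting string ends with a bare `Answer:` so the model's continuation is exactly the answer to the final question.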
Bibtex
@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyun and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}