Model Card for Zephyr 7B Alpha
Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-α is the first model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful. However, this means that the model is likely to generate problematic text when prompted to do so, and it should only be used for educational and research purposes.
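For readers unfamiliar with DPO: it tunes the chat policy directly on preference pairs, without training a separate reward model. As a brief orientation (standard DPO notation, not taken from this card), the objective is

$$
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where y_w and y_l are the preferred and rejected completions for prompt x, π_ref is the frozen starting model, and β controls how far the tuned policy may drift from it.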
Model description
Model Sources
Intended uses & limitations
The model was initially fine-tuned on a variant of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT. We then further aligned the model with 🤗 TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts and model completions that are ranked by GPT-4 (a sketch of this alignment step is shown after the inference example below). As a result, the model can be used for chat, and you can check out our demo to test its capabilities.
Here's how you can run the model using the pipeline() function from 🤗 Transformers:
import torch
from transformers import pipeline
pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-alpha", torch_dtype=torch.bfloat16, device_map="auto")
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
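For completeness, here is a minimal sketch of the DPO alignment step described under "Intended uses & limitations", using TRL's DPOTrainer. It is illustrative only: the dataset variant (the binarized UltraFeedback preference set), the preprocessing, and all hyperparameters are assumptions rather than the exact Zephyr recipe, and the constructor shown follows the older TRL 0.7-style API (recent TRL releases move most of these options into a DPOConfig).

```python
# Illustrative DPO alignment sketch with TRL's DPOTrainer (TRL 0.7-style API).
# The dataset variant, preprocessing, and hyperparameters below are assumptions,
# not the exact recipe used to train Zephyr-7B-alpha.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_id = "mistralai/Mistral-7B-v0.1"  # in practice, the UltraChat SFT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer has no pad token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
# A frozen copy of the starting model serves as the DPO reference policy.
ref_model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# DPOTrainer expects plain-text "prompt", "chosen", and "rejected" columns.
# HuggingFaceH4/ultrafeedback_binarized is a preference-formatted variant of
# openbmb/UltraFeedback; keeping only the final assistant turn is a simplification.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")
dataset = dataset.map(
    lambda ex: {
        "prompt": ex["prompt"],
        "chosen": ex["chosen"][-1]["content"],
        "rejected": ex["rejected"][-1]["content"],
    }
)

training_args = TrainingArguments(
    output_dir="zephyr-7b-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=5e-7,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = DPOTrainer(
    model,
    ref_model,
    args=training_args,
    beta=0.1,  # strength of the implicit KL penalty in the DPO loss
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=1024,
    max_prompt_length=512,
)
trainer.train()
trainer.save_model("zephyr-7b-dpo")
```

In practice the starting checkpoint would be the UltraChat SFT model rather than the raw base model, and a 7B model at this precision would typically be trained across several GPUs with DeepSpeed or a similar sharding setup.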
Bias, Risks, and Limitations
Zephyr-7B-α has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). It is also unknown what the size and composition of the corpus used to train the base model (mistralai/Mistral-7B-v0.1) were; however, it is likely to have included a mix of Web data and technical sources like books and code. See the Falcon 180B model card for an example of this.
Training and evaluation data
Zephyr 7B Alpha achieves the following results on the evaluation set (see the Training results table below):
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
Training results
| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.5602 | 0.05 | 100 | 0.5589 | -0.3359 | -0.8168 | 0.7188 | 0.4809 | -306.2607 | -293.7161 | -2.6554 | -2.6797 |
| 0.4852 | 0.1 | 200 | 0.5136 | -0.5310 | -1.4994 | 0.8125 | 0.9684 | -319.9124 | -297.6181 | -2.5762 | -2.5957 |
| 0.5212 | 0.15 | 300 | 0.5168 | -0.1686 | -1.1760 | 0.7812 | 1.0074 | -313.4444 | -290.3699 | -2.6865 | -2.7125 |
| 0.5496 | 0.21 | 400 | 0.4835 | -0.1617 | -1.7170 | 0.8281 | 1.5552 | -324.2635 | -290.2326 | -2.7947 | -2.8218 |
| 0.5209 | 0.26 | 500 | 0.5054 | -0.4778 | -1.6604 | 0.7344 | 1.1826 | -323.1325 | -296.5546 | -2.8388 | -2.8667 |
| 0.4617 | 0.31 | 600 | 0.4910 | -0.3738 | -1.5180 | 0.7656 | 1.1442 | -320.2848 | -294.4741 | -2.8234 | -2.8521 |
| 0.4452 | 0.36 | 700 | 0.4838 | -0.4591 | -1.6576 | 0.7031 | 1.1986 | -323.0770 | -296.1796 | -2.7401 | -2.7653 |
| 0.4674 | 0.41 | 800 | 0.5077 | -0.5692 | -1.8659 | 0.7656 | 1.2967 | -327.2416 | -298.3818 | -2.6740 | -2.6945 |
| 0.4656 | 0.46 | 900 | 0.4927 | -0.5279 | -1.6614 | 0.7656 | 1.1335 | -323.1518 | -297.5553 | -2.7817 | -2.8015 |
| 0.4102 | 0.52 | 1000 | 0.4772 | -0.5767 | -2.0667 | 0.7656 | 1.4900 | -331.2578 | -298.5311 | -2.7160 | -2.7455 |
| 0.4663 | 0.57 | 1100 | 0.4740 | -0.8038 | -2.1018 | 0.7656 | 1.2980 | -331.9604 | -303.0741 | -2.6994 | -2.7257 |
| 0.4737 | 0.62 | 1200 | 0.4716 | -0.3783 | -1.7015 | 0.7969 | 1.3232 | -323.9545 | -294.5634 | -2.6842 | -2.7135 |
| 0.4259 | 0.67 | 1300 | 0.4866 | -0.6239 | -1.9703 | 0.7812 | 1.3464 | -329.3312 | -299.4761 | -2.7046 | -2.7356 |
| 0.4935 | 0.72 | 1400 | 0.4747 | -0.5626 | -1.7600 | 0.7812 | 1.1974 | -325.1243 | -298.2491 | -2.7153 | -2.7444 |
| 0.4211 | 0.77 | 1500 | 0.4645 | -0.6099 | -1.9993 | 0.7656 | 1.3894 | -329.9109 | -299.1959 | -2.6944 | -2.7236 |
| 0.4931 | 0.83 | 1600 | 0.4684 | -0.6798 | -2.1082 | 0.7656 | 1.4285 | -332.0890 | -300.5934 | -2.7006 | -2.7305 |
| 0.5029 | 0.88 | 1700 | 0.4595 | -0.5063 | -1.8951 | 0.7812 | 1.3889 | -327.8267 | -297.1233 | -2.7108 | -2.7403 |
| 0.4965 | 0.93 | 1800 | 0.4613 | -0.5561 | -1.9079 | 0.7812 | 1.3518 | -328.0831 | -298.1203 | -2.7226 | -2.7523 |
| 0.4337 | 0.98 | 1900 | 0.4608 | -0.5066 | -1.8718 | 0.7656 | 1.3652 | -327.3599 | -297.1296 | -2.7175 | -2.7469 |
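As an aid to reading the reward columns above (this follows TRL's standard DPO logging and is an interpretation rather than something stated in the card), the "reward" of a completion y for a prompt x is the implicit quantity

$$
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)}
$$

so Rewards/chosen and Rewards/rejected are the mean implicit rewards of the chosen and rejected completions on the validation set, Rewards/margins is their difference, and Rewards/accuracies is the fraction of pairs whose chosen reward exceeds the rejected one. Logps/* and Logits/* report the corresponding mean log-probabilities and final-layer logits.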
Framework versions