zephyr-7b-alpha

Anonymous user, July 31, 2024

Technical information

Open-source repository
https://modelscope.cn/models/keepitsimple/zephyr-7b-alpha
License
MIT

Project details


Model Card for Zephyr 7B Alpha

Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-α is the first model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). We found that removing the in-built alignment of these datasets boosted performance on MT Bench and made the model more helpful. However, this means that the model is likely to generate problematic text when prompted to do so and should only be used for educational and research purposes.

Model description

  • Model type: A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
  • Language(s) (NLP): Primarily English
  • License: MIT
  • Finetuned from model: mistralai/Mistral-7B-v0.1

Model Sources

  • Repository: https://github.com/huggingface/alignment-handbook
  • Demo: https://huggingface.co/spaces/HuggingFaceH4/zephyr-chat

Intended uses & limitations

The model was initially fine-tuned on a variant of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT. We then further aligned the model with 🤗 TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts and model completions that are ranked by GPT-4. As a result, the model can be used for chat and you can check out our demo to test its capabilities.
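To make the preference-alignment step more concrete, here is a minimal sketch of the per-pair DPO objective in plain Python. It is not the TRL implementation: the function name `dpo_loss`, the example log-probabilities, and `beta=0.1` are assumptions for illustration (this card does not state the beta that was used).

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sketch of the DPO loss for one (chosen, rejected) pair.

    beta is an assumed value; the model card does not report it.
    """
    # Implicit rewards: scaled log-probability ratio of policy vs. reference.
    reward_chosen = beta * (policy_chosen_logp - ref_chosen_logp)
    reward_rejected = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)), written in a numerically stable softplus form.
    loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return loss, reward_chosen, reward_rejected

# Hypothetical log-probabilities: the policy has moved toward the chosen
# completion and away from the rejected one, relative to the reference.
loss, r_c, r_r = dpo_loss(-290.0, -330.0, -295.0, -315.0)
print(loss, r_c, r_r)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2; the loss falls below that as the chosen completion gains reward over the rejected one.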

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-alpha", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
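The printed output above shows the prompt layout that the chat template produces: a role tag, the message text, an end-of-sequence token, and a trailing assistant tag. The sketch below approximates that layout in plain Python so it can be inspected without downloading the model. The helper `format_zephyr_prompt` is hypothetical and inferred only from the printed output; the tokenizer's own template remains the authoritative source.

```python
def format_zephyr_prompt(messages, add_generation_prompt=True, eos_token="</s>"):
    """Approximate the Zephyr chat layout shown in the example output.

    Hypothetical helper for illustration; prefer
    tokenizer.apply_chat_template when the tokenizer is available.
    """
    parts = []
    for message in messages:
        # Each turn: "<|role|>", newline, content, end-of-sequence token.
        parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = format_zephyr_prompt(messages)
print(prompt)
```

Building the string by hand is mainly useful for debugging or for inspecting what the model actually receives.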

Bias, Risks, and Limitations

Zephyr-7B-α has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). The size and composition of the corpus used to train the base model (mistralai/Mistral-7B-v0.1) are also unknown; however, it is likely to have included a mix of Web data and technical sources like books and code. See the Falcon 180B model card for an example of this.

Training and evaluation data

Zephyr 7B Alpha achieves the following results on the evaluation set:

  • Loss: 0.4605
  • Rewards/chosen: -0.5053
  • Rewards/rejected: -1.8752
  • Rewards/accuracies: 0.7812
  • Rewards/margins: 1.3699
  • Logps/rejected: -327.4286
  • Logps/chosen: -297.1040
  • Logits/rejected: -2.7153
  • Logits/chosen: -2.7447
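As a quick sanity check on how these metrics relate, the reported reward margin is the chosen reward minus the rejected reward. The sketch below checks that arithmetic against the numbers listed above. It also shows one plausible reading of the accuracy metric, as the fraction of pairs where the chosen reward exceeds the rejected one; that reading and the per-pair values are assumptions for illustration, not data from this run.

```python
# Values copied from the evaluation results listed above.
reported = {
    "rewards/chosen": -0.5053,
    "rewards/rejected": -1.8752,
    "rewards/margins": 1.3699,
}

# The margin is the difference between the two mean rewards.
margin = reported["rewards/chosen"] - reported["rewards/rejected"]
print(round(margin, 4))

def reward_accuracy(chosen_rewards, rejected_rewards):
    """Fraction of pairs where the chosen reward beats the rejected one."""
    wins = sum(c > r for c, r in zip(chosen_rewards, rejected_rewards))
    return wins / len(chosen_rewards)

# Hypothetical per-pair rewards, only to show the calculation.
print(reward_accuracy([0.2, -0.5, -1.0, 0.1], [-0.3, -0.1, -2.0, -0.4]))
```

Because the margin is a plain difference, it also holds for batch means; the loss does not have this property, so it cannot be recovered from the mean margin alone.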

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-07
  • train_batch_size: 2
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 16
  • total_train_batch_size: 32
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
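Two of these settings are easier to see worked out. The total batch sizes follow from the per-device batch size times the number of devices, and a linear scheduler with a warmup ratio of 0.1 ramps the learning rate up over the first 10% of steps before decaying it linearly to zero. The sketch below illustrates both. The function `linear_schedule_lr` and the `total_steps` value are assumptions for illustration, not values taken from the training run.

```python
import math

# Per-device batch sizes and device count from the list above.
train_batch_size, eval_batch_size, num_devices = 2, 4, 16
total_train_batch_size = train_batch_size * num_devices  # 32, as reported
total_eval_batch_size = eval_batch_size * num_devices    # 64, as reported

def linear_schedule_lr(step, total_steps, peak_lr=5e-07, warmup_ratio=0.1):
    """Sketch of linear warmup followed by linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

total_steps = 2000  # hypothetical; the card does not state the exact count
for step in (0, 100, 1000, 2000):
    print(step, linear_schedule_lr(step, total_steps))
```

The train total matches the reported 32, which is consistent with no gradient accumulation.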

Training results

Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen
0.5602 | 0.05 | 100 | 0.5589 | -0.3359 | -0.8168 | 0.7188 | 0.4809 | -306.2607 | -293.7161 | -2.6554 | -2.6797
0.4852 | 0.1 | 200 | 0.5136 | -0.5310 | -1.4994 | 0.8125 | 0.9684 | -319.9124 | -297.6181 | -2.5762 | -2.5957
0.5212 | 0.15 | 300 | 0.5168 | -0.1686 | -1.1760 | 0.7812 | 1.0074 | -313.4444 | -290.3699 | -2.6865 | -2.7125
0.5496 | 0.21 | 400 | 0.4835 | -0.1617 | -1.7170 | 0.8281 | 1.5552 | -324.2635 | -290.2326 | -2.7947 | -2.8218
0.5209 | 0.26 | 500 | 0.5054 | -0.4778 | -1.6604 | 0.7344 | 1.1826 | -323.1325 | -296.5546 | -2.8388 | -2.8667
0.4617 | 0.31 | 600 | 0.4910 | -0.3738 | -1.5180 | 0.7656 | 1.1442 | -320.2848 | -294.4741 | -2.8234 | -2.8521
0.4452 | 0.36 | 700 | 0.4838 | -0.4591 | -1.6576 | 0.7031 | 1.1986 | -323.0770 | -296.1796 | -2.7401 | -2.7653
0.4674 | 0.41 | 800 | 0.5077 | -0.5692 | -1.8659 | 0.7656 | 1.2967 | -327.2416 | -298.3818 | -2.6740 | -2.6945
0.4656 | 0.46 | 900 | 0.4927 | -0.5279 | -1.6614 | 0.7656 | 1.1335 | -323.1518 | -297.5553 | -2.7817 | -2.8015
0.4102 | 0.52 | 1000 | 0.4772 | -0.5767 | -2.0667 | 0.7656 | 1.4900 | -331.2578 | -298.5311 | -2.7160 | -2.7455
0.4663 | 0.57 | 1100 | 0.4740 | -0.8038 | -2.1018 | 0.7656 | 1.2980 | -331.9604 | -303.0741 | -2.6994 | -2.7257
0.4737 | 0.62 | 1200 | 0.4716 | -0.3783 | -1.7015 | 0.7969 | 1.3232 | -323.9545 | -294.5634 | -2.6842 | -2.7135
0.4259 | 0.67 | 1300 | 0.4866 | -0.6239 | -1.9703 | 0.7812 | 1.3464 | -329.3312 | -299.4761 | -2.7046 | -2.7356
0.4935 | 0.72 | 1400 | 0.4747 | -0.5626 | -1.7600 | 0.7812 | 1.1974 | -325.1243 | -298.2491 | -2.7153 | -2.7444
0.4211 | 0.77 | 1500 | 0.4645 | -0.6099 | -1.9993 | 0.7656 | 1.3894 | -329.9109 | -299.1959 | -2.6944 | -2.7236
0.4931 | 0.83 | 1600 | 0.4684 | -0.6798 | -2.1082 | 0.7656 | 1.4285 | -332.0890 | -300.5934 | -2.7006 | -2.7305
0.5029 | 0.88 | 1700 | 0.4595 | -0.5063 | -1.8951 | 0.7812 | 1.3889 | -327.8267 | -297.1233 | -2.7108 | -2.7403
0.4965 | 0.93 | 1800 | 0.4613 | -0.5561 | -1.9079 | 0.7812 | 1.3518 | -328.0831 | -298.1203 | -2.7226 | -2.7523
0.4337 | 0.98 | 1900 | 0.4608 | -0.5066 | -1.8718 | 0.7656 | 1.3652 | -327.3599 | -297.1296 | -2.7175 | -2.7469

Framework versions

  • Transformers 4.34.0
  • PyTorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.14.0
