stablelm-tuned-alpha-7b

Technical Information

Open-source repository
https://modelscope.cn/models/AI-ModelScope/stablelm-tuned-alpha-7b
License
Apache License 2.0

Details

StableLM-Tuned-Alpha

Model Description

StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.

Usage

Get started chatting with StableLM-Tuned-Alpha by using the following code snippet:

from modelscope.utils.constant import Tasks
from modelscope.pipelines import pipeline

# Build a text-generation pipeline for the tuned 7B checkpoint (runs on GPU).
pipe = pipeline(task=Tasks.text_generation, model='AI-ModelScope/stablelm-tuned-alpha-7b', model_revision='v1.0.2', device='cuda')


system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"
result = pipe(prompt)
print(result)

StableLM-Tuned should be prompted using a format of <|SYSTEM|>...<|USER|>...<|ASSISTANT|>... The system prompt is:

<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
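
As a minimal illustration of this format for multi-turn chat, the sketch below reuses the pipe and system_prompt objects from the usage snippet above; the helper function and the example turns are assumptions, not part of the original card.

# Illustrative helper (an assumption, not from the original card): extend the
# prompt with alternating <|USER|>/<|ASSISTANT|> segments, ending with an open
# <|ASSISTANT|> tag so the model generates the next reply.
def build_prompt(system_prompt, history, next_user_message):
    prompt = system_prompt
    for user_msg, assistant_msg in history:
        prompt += f"<|USER|>{user_msg}<|ASSISTANT|>{assistant_msg}"
    prompt += f"<|USER|>{next_user_message}<|ASSISTANT|>"
    return prompt

history = [("What's your mood today?", "I'm doing well, thank you for asking!")]
prompt = build_prompt(system_prompt, history, "Can you write a short poem about the sea?")
print(pipe(prompt))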

Model Details

  • Developed by: Stability AI
  • Model type: StableLM-Tuned-Alpha models are auto-regressive language models based on the NeoX transformer architecture.
  • Language(s): English
  • Library: HuggingFace Transformers (see the loading sketch after this list)
  • License: Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca.
  • Contact: For questions and comments about the model, please email lm@stability.ai
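
Since the card lists HuggingFace Transformers as the library, the checkpoint can presumably also be loaded with it directly. The sketch below is an assumption: the Hugging Face Hub ID stabilityai/stablelm-tuned-alpha-7b is not stated on this page.

# A minimal sketch, assuming the checkpoint is also published on the Hugging Face
# Hub under stabilityai/stablelm-tuned-alpha-7b (this ID is not given on this page).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-tuned-alpha-7b")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-tuned-alpha-7b",
    torch_dtype=torch.float16,  # FP16, matching the training precision noted below
).to("cuda")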

Training

Parameters  Hidden Size  Layers  Heads  Sequence Length
3B          4096         16      32     4096
7B          6144         16      48     4096
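
As an illustration only (an assumption, not a published configuration file), the 7B row maps roughly onto a Hugging Face GPTNeoXConfig, since the models follow the NeoX architecture; fields not listed in the table, such as vocabulary size and intermediate size, are left at the library defaults here.

# A minimal sketch mapping the 7B row of the table onto GPTNeoXConfig fields.
# Only the values shown in the table are set; everything else stays at defaults.
from transformers import GPTNeoXConfig

config_7b = GPTNeoXConfig(
    hidden_size=6144,              # Hidden Size
    num_hidden_layers=16,          # Layers
    num_attention_heads=48,        # Heads
    max_position_embeddings=4096,  # Sequence Length
)
print(config_7b)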

Training Datasets

The StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets: Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine; GPT4All Prompt Generations, which consists of 400k prompts and responses generated by GPT-4; Anthropic HH, made up of preferences about the helpfulness and harmlessness of AI assistants; DataBricks Dolly, comprising 15k instruction/response pairs generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization; and ShareGPT Vicuna (English subset), a dataset of conversations retrieved from ShareGPT.

Training Procedure

Models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (FP16), and optimized with AdamW. We outline the following hyperparameters:

Parameters  Batch Size  Learning Rate  Warm-up  Weight Decay  Betas
3B          256         2e-5           50       0.01          (0.9, 0.99)
7B          128         2e-5           100      0.01          (0.9, 0.99)
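
As a rough sketch of how the 7B row maps onto a standard PyTorch/Transformers setup (illustrative assumptions only: the placeholder module below stands in for the 7B network, and the card does not specify the schedule shape after warm-up):

# Illustrative only: the 7B hyperparameters from the table above expressed as an
# AdamW optimizer plus a warm-up schedule. The tiny Linear module is a placeholder
# so the snippet runs; keeping the rate constant after warm-up is an assumption.
import torch
from transformers import get_constant_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # placeholder for the 7B model

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,             # Learning Rate
    betas=(0.9, 0.99),   # Betas
    weight_decay=0.01,   # Weight Decay
)
scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps=100)  # Warm-up (7B)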

Use and Limitations

Intended Use

These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license.

Limitations and Bias

Although the aforementioned datasets help steer the base language models toward "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask users to be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly.

Acknowledgements

This work would not have been possible without the helpful hand of Dakota Mahan (@dmayhem93).

Citations

@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@misc{vicuna2023,
    title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality},
    url = {https://vicuna.lmsys.org},
    author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.},
    month = {March},
    year = {2023}
}
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}
