StableLM-Tuned-Alpha
Model Description
StableLM-Tuned-Alpha is a suite of 3B and 7B parameter decoder-only language models built on top of the StableLM-Base-Alpha models and further fine-tuned on various chat and instruction-following datasets.
Usage
Get started chatting with StableLM-Tuned-Alpha by using the following code snippet:

```python
from modelscope.utils.constant import Tasks
from modelscope.pipelines import pipeline

pipe = pipeline(task=Tasks.text_generation, model='AI-ModelScope/stablelm-tuned-alpha-7b', model_revision='v1.0.2', device='cuda')

system_prompt = """<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
"""

prompt = f"{system_prompt}<|USER|>What's your mood today?<|ASSISTANT|>"
result = pipe(prompt)
print(result)
```
StableLM-Tuned-Alpha should be used with prompts formatted as <|SYSTEM|>...<|USER|>...<|ASSISTANT|>. The system prompt is:

```
<|SYSTEM|># StableLM Tuned (Alpha version)
- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.
- StableLM is excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- StableLM is more than just an information source, StableLM is also able to write poetry, short stories, and make jokes.
- StableLM will refuse to participate in anything that could harm a human.
```
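As a minimal illustration of this prompt format (not part of the original card), the sketch below assembles a multi-turn prompt by alternating <|USER|> and <|ASSISTANT|> segments after the system prompt; the helper name `build_prompt` and the example turns are hypothetical.

```python
# Hypothetical helper: assemble a StableLM-Tuned prompt from alternating user/assistant turns.
system_prompt = (
    "<|SYSTEM|># StableLM Tuned (Alpha version)\n"
    "- StableLM is a helpful and harmless open-source AI language model developed by StabilityAI.\n"
)

def build_prompt(system_prompt, turns):
    """turns is a list of (user_message, assistant_reply_or_None) tuples."""
    prompt = system_prompt
    for user_msg, assistant_msg in turns:
        prompt += f"<|USER|>{user_msg}<|ASSISTANT|>"
        if assistant_msg is not None:
            prompt += assistant_msg  # keep earlier assistant replies as conversation history
    return prompt

# The last turn leaves the assistant slot open so the model generates the reply.
prompt = build_prompt(system_prompt, [
    ("What's your mood today?", "I'm doing well, thank you!"),
    ("Can you write a short poem about that?", None),
])
print(prompt)
```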
Model Details
Fine-tuned checkpoints (StableLM-Tuned-Alpha) are licensed under the Non-Commercial Creative Commons license (CC BY-NC-SA-4.0), in line with the original non-commercial license specified by Stanford Alpaca. For questions and comments about the model, please email lm@stability.ai.
Training
| Parameters | Hidden Size | Layers | Heads | Sequence Length |
| --- | --- | --- | --- | --- |
| 3B | 4096 | 16 | 32 | 4096 |
| 7B | 6144 | 16 | 48 | 4096 |
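As a rough cross-check of the table above (not part of the original card), the architecture values can be read from a loaded model config with Hugging Face Transformers; the repo id below and the assumption of a GPT-NeoX-style config are assumptions, and a locally downloaded ModelScope snapshot path can be substituted.

```python
# Sketch: read the architecture hyperparameters from the model config.
# Assumes the checkpoint is also published as 'stabilityai/stablelm-tuned-alpha-7b'
# on the Hugging Face Hub and exposes a GPT-NeoX-style configuration.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("stabilityai/stablelm-tuned-alpha-7b")
print(config.hidden_size)               # expected: 6144 for the 7B model
print(config.num_hidden_layers)         # expected: 16
print(config.num_attention_heads)       # expected: 48
print(config.max_position_embeddings)   # expected: 4096
```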
Training Dataset
The StableLM-Tuned-Alpha models are fine-tuned on a combination of five datasets:
- Alpaca: a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine;
- GPT4All Prompt Generations: comprising 400k prompts and responses generated by GPT-4;
- Anthropic HH: made up of preferences about AI assistant helpfulness and harmlessness;
- DataBricks Dolly: comprising 15k instruction/response pairs generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization;
- ShareGPT Vicuna (English subset): a dataset of conversations retrieved from ShareGPT.
Training Procedure
The models are learned via supervised fine-tuning on the aforementioned datasets, trained in mixed precision (FP16), and optimized with AdamW. We outline the following hyperparameters:
| Parameters | Batch Size | Learning Rate | Warm-up | Weight Decay | Betas |
| --- | --- | --- | --- | --- | --- |
| 3B | 256 | 2e-5 | 50 | 0.01 | (0.9, 0.99) |
| 7B | 128 | 2e-5 | 100 | 0.01 | (0.9, 0.99) |
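As an illustrative sketch only (not the actual training code), the 7B row above maps onto a PyTorch AdamW optimizer with a linear warm-up roughly as follows; the placeholder module and the exact shape of the warm-up schedule are assumptions.

```python
# Sketch: AdamW configured with the hyperparameters listed above (7B row).
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(4096, 4096)  # placeholder standing in for the language model

optimizer = AdamW(
    model.parameters(),
    lr=2e-5,             # Learning Rate
    betas=(0.9, 0.99),   # Betas
    weight_decay=0.01,   # Weight Decay
)

warmup_steps = 100  # Warm-up (7B row; the 3B row uses 50)
# Linear warm-up from 0 to the base learning rate over `warmup_steps` optimizer steps.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps))
```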
Use and Limitations
Intended Use
These models are intended to be used by the open-source community in chat-like applications, in adherence with the CC BY-NC-SA-4.0 license.
Limitations and bias
Although the aforementioned datasets help to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use responsibly.
Acknowledgements
This work would not have been possible without the kind help of Dakota Mahan (@dmayhem93).
Citations
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
@misc{vicuna2023,
  title = {Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality},
  url = {https://vicuna.lmsys.org},
  author = {Chiang, Wei-Lin and Li, Zhuohan and Lin, Zi and Sheng, Ying and Wu, Zhanghao and Zhang, Hao and Zheng, Lianmin and Zhuang, Siyuan and Zhuang, Yonghao and Gonzalez, Joseph E. and Stoica, Ion and Xing, Eric P.},
  month = {March},
  year = {2023}
}
@misc{gpt4all,
  author = {Yuvanesh Anand and Zach Nussbaum and Brandon Duderstadt and Benjamin Schmidt and Andriy Mulyar},
  title = {GPT4All: Training an Assistant-style Chatbot with Large Scale Data Distillation from GPT-3.5-Turbo},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nomic-ai/gpt4all}},
}