Advancing Open-source Language Models with Mixed-Quality Data
## Usage
⚠️ Llama-3-Instruct often fails to follow the few-shot templates. To use this model, we highly recommend installing the OpenChat package by following the installation guide in our repository and using the OpenChat OpenAI-compatible API server by running the serving command from the table below. The server is optimized for high-throughput deployment using vLLM and can run on a consumer GPU with 24GB RAM. To enable tensor parallelism, append `--tensor-parallel-size N` to the serving command (a combined example follows the table).

Once started, the server listens at `localhost:18888` for requests and is compatible with the OpenAI ChatCompletion API specifications. Please refer to the example request below for reference. Additionally, you can use the OpenChat Web UI for a user-friendly experience.

If you want to deploy the server as an online service, you can use `--api-keys sk-KEY1 sk-KEY2 ...` to specify allowed API keys and `--disable-log-requests --disable-log-stats --log-file openchat.log` for logging only to a file. For security purposes, we recommend using an HTTPS gateway in front of the server.
| Model | Size | Context | Weights | Serving |
|-------|------|---------|---------|---------|
| OpenChat-3.6-20240522 | 8B | 8192 | Huggingface | `python -m ochat.serving.openai_api_server --model openchat/openchat-3.6-8b-20240522` |
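For example, an online deployment that combines the flags described above might look like the following (the flag values here are illustrative, not defaults):

```bash
# Serve OpenChat 3.6 across 2 GPUs, require an API key, and log only to a file.
python -m ochat.serving.openai_api_server \
    --model openchat/openchat-3.6-8b-20240522 \
    --tensor-parallel-size 2 \
    --api-keys sk-KEY1 \
    --disable-log-requests --disable-log-stats --log-file openchat.log
```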
Example request:
```bash
curl http://localhost:18888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openchat_3.6",
    "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
  }'
```
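Because the server implements the OpenAI ChatCompletion specification, the same request can be made from Python with the official `openai` client. A minimal sketch (the `api_key` value is only checked if the server was started with `--api-keys`):

```python
from openai import OpenAI

# Point the official OpenAI client at the local OpenChat server.
client = OpenAI(base_url="http://localhost:18888/v1", api_key="sk-KEY1")

response = client.chat.completions.create(
    model="openchat_3.6",
    messages=[{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}],
)
print(response.choices[0].message.content)
```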
## Conversation templates
```
GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hi<|end_of_turn|>GPT4 Correct User: How are you today?<|end_of_turn|>GPT4 Correct Assistant:
```

⚠️ Remember to set `<|end_of_turn|>` as the end of generation token.

The default template is also available as the integrated `tokenizer.chat_template`, which can be used instead of manually specifying the template:

```python
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi"},
    {"role": "user", "content": "How are you today?"}
]
tokens = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```
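To verify that the rendered prompt matches the template string above, you can ask `apply_chat_template` for text instead of token ids; a quick sketch, assuming the tokenizer loaded in the next section:

```python
# Render the conversation to a string and compare it with the
# GPT4 Correct template shown above.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
print(prompt)
```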
## Inference using Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "openchat/openchat-3.6-8b-20240522"

# Load the tokenizer and the model in bfloat16, letting Accelerate
# place the weights on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Explain how large language models work in detail."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.5,
    max_new_tokens=1024,
)

# Strip the prompt tokens and decode only the newly generated text.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
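If the model's generation config does not already stop at `<|end_of_turn|>` (see the note in the templates section), you can pass the token id explicitly. This is a hedged variant of the call above, not part of the original example:

```python
# Stop generation at the end-of-turn token instead of relying on
# the tokenizer's default eos token.
eot_id = tokenizer.convert_tokens_to_ids("<|end_of_turn|>")
outputs = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.5,
    max_new_tokens=1024,
    eos_token_id=eot_id,
)
```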
## Limitations
## Contact

We look forward to hearing from you and collaborating on this exciting project!
## Citation
```
@article{wang2023openchat,
  title={OpenChat: Advancing Open-source Language Models with Mixed-Quality Data},
  author={Wang, Guan and Cheng, Sijie and Zhan, Xianyuan and Li, Xiangang and Song, Sen and Liu, Yang},
  journal={arXiv preprint arXiv:2309.11235},
  year={2023}
}
```