Model Card for Mixtral-8x7B

The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks we tested. For full details of this model please read our release blog post.

Warning
This repo contains weights that are compatible with vLLM serving of the model as well as the Hugging Face transformers library. It is based on the original Mixtral torrent release, but the file format and parameter names are different. Please note that the model cannot (yet) be instantiated with HF.
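Since the card advertises vLLM-compatible weights but does not show a serving example, here is a minimal, illustrative sketch (not from the original card). It assumes vLLM is installed and that two GPUs with sufficient memory are available; tensor_parallel_size=2 is an assumed value, not a requirement stated by the card.

# Minimal vLLM serving sketch (illustrative; assumes vLLM is installed
# and two GPUs are available — tensor_parallel_size=2 is an assumption).
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mixtral-8x7B-v0.1", tensor_parallel_size=2)
sampling = SamplingParams(max_tokens=20)
outputs = llm.generate(["Hello my name is"], sampling)
print(outputs[0].outputs[0].text)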
Run the model
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_id = "AI-ModelScope/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map='auto' lets accelerate place the model across available devices
model = AutoModelForCausalLM.from_pretrained(model_id, device_map='auto')

text = "Hello my name is"
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
By default, transformers will load the model in full precision. Therefore you might be interested in further reducing the memory requirements to run the model through the optimizations offered in the HF ecosystem:

In half-precision
Note: float16 precision only works on GPU devices.
+ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(0)

text = "Hello my name is"
+ inputs = tokenizer(text, return_tensors="pt").to(0)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
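On Ampere or newer GPUs, bfloat16 is a common alternative to float16 with the same memory footprint and better numerical range. This variant is a sketch added for illustration, not part of the original card:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 halves memory like float16 but keeps float32's exponent range
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to(0)

text = "Hello my name is"
inputs = tokenizer(text, return_tensors="pt").to(0)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))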
Lower precision (8-bit & 4-bit) using bitsandbytes
+ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

+ model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)

text = "Hello my name is"
+ inputs = tokenizer(text, return_tensors="pt").to(0)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
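The heading above also mentions 8-bit loading, but the card only shows the 4-bit call. The analogous 8-bit sketch below uses the load_in_8bit flag from transformers and assumes bitsandbytes and accelerate are installed:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# load_in_8bit quantizes the linear layers with bitsandbytes
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)

text = "Hello my name is"
inputs = tokenizer(text, return_tensors="pt").to(0)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))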
Load the model with Flash Attention 2
+ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Flash Attention 2 only supports fp16/bf16, so load in half precision
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, use_flash_attention_2=True)

text = "Hello my name is"
+ inputs = tokenizer(text, return_tensors="pt").to(0)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
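In more recent transformers releases, the use_flash_attention_2 flag is superseded by the attn_implementation argument; an equivalent call, assuming transformers >= 4.36 and the flash-attn package installed:

# Newer spelling of the same option (transformers >= 4.36)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
)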
Notice
Mixtral-8x7B is a pretrained base model and therefore does not have any moderation mechanisms.
The Mistral AI Team
Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.