Mixtral-8x7B-Instruct-v0.1-GGUF

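This repository holds the GGUF files for Mixtral, Mistral's sparse mixture-of-experts network with eight experts. It has about 46B parameters but runs at roughly the inference speed of a 13B model, since only two experts are active per token. Well worth having! The Q3_K_M quantized version ran fine in my last test on an M1 Pro with 32 GB of memory.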
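A rough size estimate shows why a quantized file is needed on a 32 GB machine: file size is approximately parameters × bits per weight / 8. The sketch below uses the 46B figure from the text above. The bits-per-weight values (16 for f16, 4.5 for q4_0) are assumptions about the storage format, and the estimate ignores metadata and any tensors kept at higher precision.

```shell
# estimate_gb PARAMS_IN_BILLIONS BITS_PER_WEIGHT
# Hypothetical helper: prints an approximate model file size in GB (10^9 bytes).
estimate_gb() {
    awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_gb 46 16    # f16:  prints 92.0
estimate_gb 46 4.5   # q4_0: prints 25.9
```

Under these assumptions the f16 file is far larger than 32 GB, while a 3-bit or 4-bit quantization leaves some headroom.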

The model will be uploaded later.

1. How to download

# First install git
# Then install Git LFS
git lfs install
GIT_LFS_SKIP_SMUDGE=1 git clone https://www.modelscope.cn/limoncc/Mixtral-8x7B-Instruct-v0.1-GGUF.git
cd Mixtral-8x7B-Instruct-v0.1-GGUF
git lfs pull
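Because the clone above skips the LFS download and `git lfs pull` fetches the real files afterwards, it can be worth checking that each `.gguf` file is an actual model and not a leftover LFS pointer. A minimal sketch, assuming `head -c` is available: a GGUF file starts with the four ASCII bytes `GGUF`, while an LFS pointer is a small text file starting with `version`. The function names are hypothetical.

```shell
# is_gguf FILE: succeeds if FILE starts with the 4-byte magic "GGUF".
is_gguf() {
    [ "$(head -c 4 "$1" 2>/dev/null | tr -d '\0')" = "GGUF" ]
}

# check_gguf_dir [DIR]: reports every *.gguf file in DIR (default: current
# directory) and returns non-zero if any of them lacks the GGUF magic.
check_gguf_dir() {
    dir=${1:-.}
    status=0
    for f in "$dir"/*.gguf; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if is_gguf "$f"; then
            echo "ok: $f"
        else
            echo "not GGUF (possibly an un-pulled LFS pointer): $f"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_gguf_dir .` inside the cloned directory after `git lfs pull` should list every model file as `ok`.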

If you want to produce other quantized versions yourself, download the Mixtral-8x7B-Instruct-v0.1 file "ggml-model-f16.gguf" shared on Quark Netdisk.
Link: https://pan.quark.cn/s/32f1ec958a74
Extraction code: hTBY
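Once the f16 file is downloaded, the conversion can be sketched with llama.cpp's `quantize` tool, which takes an input file, an output file, and a quantization type. The wrapper below is a hypothetical helper, not part of llama.cpp. It assumes the tool was built to `./bin/quantize` (override with `QUANTIZE_BIN`) and that the output follows the `ggml-model-<type>.gguf` naming used elsewhere on this page.

```shell
# Path to the quantize binary; an assumption, adjust to your build layout.
QUANTIZE_BIN=${QUANTIZE_BIN:-./bin/quantize}

# quantize_model SRC_F16_GGUF TYPE
# Writes ggml-model-<type in lowercase>.gguf next to the source file.
quantize_model() {
    src=$1
    qtype=$2
    out="$(dirname "$src")/ggml-model-$(printf '%s' "$qtype" | tr 'A-Z' 'a-z').gguf"
    [ -f "$src" ] || { echo "missing input: $src" >&2; return 1; }
    "$QUANTIZE_BIN" "$src" "$out" "$qtype"
}

# Example (not executed here):
#   quantize_model ./models/mixtral-8x7b-32k/ggml-model-f16.gguf Q3_K_M
```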

2. How to use

Key point: switch to the mixtral branch.

git clone https://gitee.com/limoncc/llama.cpp.git
cd llama.cpp
git checkout mixtral
mkdir build
cd build
cmake ..
cmake --build . --config Release

./bin/main -m ./models/mixtral-8x7b-32k/ggml-model-q4_0.gguf \
--temp 0.7 --repeat_penalty 1.1 -n -1 \
-p "I believe the meaning of life is"

Chat format

<s> 
[INST] Instruction [/INST]
 Model answer</s> 
[INST] Follow-up instruction [/INST]
 Model answer</s>
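The layout above can be assembled programmatically. The following is a minimal sketch, not an official template: `build_prompt` is a hypothetical helper that takes alternating user and assistant turns and joins them on a single line. The exact spacing between tokens is an assumption.

```shell
# build_prompt USER1 [ASSISTANT1 USER2 [ASSISTANT2 USER3 ...]]
# Odd arguments are user instructions, even arguments are model answers.
build_prompt() {
    prompt="<s>"
    while [ "$#" -gt 0 ]; do
        prompt="${prompt}[INST] $1 [/INST]"      # user turn
        shift
        if [ "$#" -gt 0 ]; then
            prompt="${prompt} $1</s>"            # completed model turn
            shift
        fi
    done
    printf '%s' "$prompt"
}
```

The result can be passed to the `-p` option of the command shown earlier, for example `-p "$(build_prompt "Why is the sky blue?")"`. Whether the literal `<s>` should be kept depends on whether your runtime already prepends a BOS token; that is left open here.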

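If you are not yet familiar with chat formats, see my article:
Chat Format, the Topic We Have to Talk About (LLM CPU Deployment Series 03)

If you do not know what GGUF is, see this article:
Unveiling the Mystery of GGUF (LLM CPU Deployment Series 02)

If you also want to learn about llama.cpp's quantization methods, follow my LLM CPU Deployment Series:
LLM CPU Deployment Series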
