MetaMath-Mistral-Pro

Open-source repository: https://modelscope.cn/models/AI-ModelScope/MetaMath-Mistral-Pro
License: Apache License 2.0

See our paper at https://arxiv.org/abs/2401.02415

View the project page: https://github.com/TencentARC/LLaMA-Pro

Model Details

MetaMath-Mistral-Pro is a full fine-tune of the Mistral-Pro base model on the MetaMathQA dataset.

Model Usage

The model is trained to use the following format (note the newlines):

<|user|>
Your message here!
<|assistant|>

For best results, format all inputs in this manner. Make sure to include a newline after <|assistant|>; omitting it can noticeably affect generation quality.
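The format above can be produced with a small helper. This is a minimal sketch, not part of the official release: the function name `build_prompt` and the decision to strip surrounding whitespace from the message are assumptions made for illustration. Only the two tags and the newline placement come from the format described above.

```python
# Tags taken from the prompt format described in this model card.
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(message: str) -> str:
    """Wrap a user message in the <|user|> / <|assistant|> format.

    Hypothetical helper. Stripping the message is an assumption made
    here so that stray whitespace does not add extra blank lines.
    """
    # Each tag sits on its own line, and the prompt ends with a newline
    # after <|assistant|>, as the model card asks.
    return f"{USER_TAG}\n{message.strip()}\n{ASSISTANT_TAG}\n"


if __name__ == "__main__":
    prompt = build_prompt("What is 12 * 7?")
    # repr() makes the newlines visible when printed.
    print(repr(prompt))
```

The returned string can then be tokenized and passed to whichever generation API you use to load the model; that step is left out here because it depends on your setup.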

Experiments

Model                   GSM8k Pass@1   MATH Pass@1
MPT-7B                  6.8            3.0
Falcon-7B               6.8            2.3
LLaMA-1-7B              11.0           2.9
LLaMA-2-7B              14.6           2.5
MPT-30B                 15.2           3.1
LLaMA-1-13B             17.8           3.9
GPT-Neo-2.7B            19.5           --
Falcon-40B              19.6           2.5
Baichuan-chat-13B       23.9           --
Vicuna-v1.3-13B         27.6           --
LLaMA-2-13B             28.7           3.9
InternLM-7B             31.2           --
ChatGLM-2-6B            32.4           --
GPT-J-6B                34.9           --
LLaMA-1-33B             35.6           3.9
LLaMA-2-34B             42.2           6.24
RFT-7B                  50.3           --
LLaMA-1-65B             50.9           10.6
Qwen-7B                 51.6           --
WizardMath-7B           54.9           10.7
LLaMA-2-70B             56.8           13.5
WizardMath-13B          63.9           14.0
MAmmoTH-7B (COT)        50.5           10.4
MAmmoTH-7B (POT+COT)    53.6           31.5
Arithmo-Mistral-7B      74.7           25.3
MetaMath-7B             66.5           19.8
MetaMath-13B            72.3           22.4
MetaMath-Mistral-7B     77.7           28.2
MetaMath-Llemma-7B      69.2           30.0
MetaMath-Mistral-Pro    78.4           30.3

Citation

@article{wu2024llama,
  title={Llama pro: Progressive llama with block expansion},
  author={Wu, Chengyue and Gan, Yukang and Ge, Yixiao and Lu, Zeyu and Wang, Jiahao and Feng, Ye and Luo, Ping and Shan, Ying},
  journal={arXiv preprint arXiv:2401.02415},
  year={2024}
}