internlm2-math-7b

Technical Information

Official website
https://www.shlab.org.cn/
Open-source repository
https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b
License
other

Project Details

InternLM-Math

State-of-the-art bilingual open-sourced math reasoning LLMs. A **solver**, **prover**, **verifier**, **augmentor**.

[Github](https://github.com/InternLM/InternLM-Math) [Demo](https://huggingface.co/spaces/internlm/internlm2-math-7b) [Checkpoints](https://huggingface.co/internlm/internlm2-math-7b) [![OpenXLab](https://cdn-static.openxlab.org.cn/header/openxlab_models.svg)](https://openxlab.org.cn/models/detail/OpenLMLab/InternLM2-Math-7B) [ModelScope](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm2-math-7b/summary)

News

  • [2024.01.29] We have added checkpoints on ModelScope. The tech report is on the way!
  • [2024.01.26] We have added checkpoints on OpenXLab, which makes downloading easier for Chinese users!

Introduction

  • 7B and 20B Chinese and English math LMs with better-than-ChatGPT performance. InternLM2-Math models are continually pretrained from InternLM2-Base with ~100B high-quality math-related tokens and SFT-ed with ~2M bilingual math supervised samples. We apply minhash and exact number matching to decontaminate possible test set leakage (see the sketch after this list).
  • Lean is added as a supported language for math problem solving and math theorem proving. We are exploring how to combine Lean 3 with InternLM-Math for verifiable math reasoning. InternLM-Math can generate Lean code for simple math reasoning tasks like GSM8K, or suggest possible proof tactics based on Lean states.
  • The models can also be viewed as reward models, supporting Outcome/Process/Lean reward modeling. We supervise InternLM2-Math with various types of reward modeling data, so that InternLM2-Math can also verify chain-of-thought processes. We also add the ability to convert a chain-of-thought process into Lean 3 code.
  • A math LM augment helper and code interpreter. InternLM2-Math can help augment math reasoning problems and solve them using the code interpreter, which lets you generate synthetic data more quickly!
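
The minhash-plus-exact-number decontamination mentioned in the first bullet can be pictured with a small sketch. This is only an illustration, not the released pipeline: the `datasketch` library, the 0.8 Jaccard threshold, and whitespace tokenization below are all assumptions.

```python
# Minimal decontamination sketch, NOT the released pipeline.
# Assumptions: datasketch for MinHash/LSH, a 0.8 Jaccard threshold,
# and whitespace tokenization are all illustrative choices.
import re
from datasketch import MinHash, MinHashLSH

def minhash_of(text: str, num_perm: int = 128) -> MinHash:
    m = MinHash(num_perm=num_perm)
    for token in text.lower().split():
        m.update(token.encode("utf-8"))
    return m

def numbers_of(text: str) -> set:
    # Exact number match: compare the set of numbers appearing in each sample.
    return set(re.findall(r"\d+\.?\d*", text))

test_set = ["Janet has 3 apples and buys 5 more. How many apples now?"]
lsh = MinHashLSH(threshold=0.8, num_perm=128)
for i, sample in enumerate(test_set):
    lsh.insert(f"test-{i}", minhash_of(sample))

def is_contaminated(train_sample: str) -> bool:
    near_duplicate = bool(lsh.query(minhash_of(train_sample)))
    number_overlap = any(numbers_of(train_sample) == numbers_of(t) for t in test_set)
    return near_duplicate or number_overlap

print(is_contaminated("Janet has 3 apples and buys 5 more. How many apples now?"))  # True
```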

[Figure: Hungarian national high school exam performance comparison]

Models

InternLM2-Math-Base-7B and InternLM2-Math-Base-20B are pretrained checkpoints. InternLM2-Math-7B and InternLM2-Math-20B are SFT checkpoints.

| Model | Model Type | Transformers(HF) | ModelScope | Release Date |
| --- | --- | --- | --- | --- |
| InternLM2-Math-Base-7B | Base | internlm/internlm2-math-base-7b | internlm2-math-base-7b | 2024-01-23 |
| InternLM2-Math-Base-20B | Base | internlm/internlm2-math-base-20b | internlm2-math-base-20b | 2024-01-23 |
| InternLM2-Math-7B | Chat | internlm/internlm2-math-7b | internlm2-math-7b | 2024-01-23 |
| InternLM2-Math-20B | Chat | internlm/internlm2-math-20b | internlm2-math-20b | 2024-01-23 |

Performance

Pretrain Performance

We evaluate pretrained checkpoints with greedy decoding and few-shot CoT. Details of pretraining will be introduced in the tech report. (A minimal sketch of this evaluation setup follows the table.)

| Model | GSM8K | MATH |
| --- | --- | --- |
| Llama2-7B | 11.8 | 3.2 |
| Llemma-7B | 36.4 | 18.0 |
| InternLM2-Base-7B | 36.5 | 8.6 |
| InternLM2-Math-Base-7B | 49.2 | 21.5 |
| Minerva-8B | 16.2 | 14.1 |
| InternLM2-Base-20B | 54.6 | 13.7 |
| InternLM2-Math-Base-20B | 63.7 | 27.3 |
| Llemma-34B | 51.5 | 25.0 |
| Minerva-62B | 52.4 | 27.6 |
| Minerva-540B | 58.8 | 33.6 |
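
For readers unfamiliar with the setup, a minimal few-shot CoT evaluation loop might look like the sketch below. The exemplar, prompt template, and answer extraction are illustrative assumptions, not the harness behind the table above.

```python
# Few-shot CoT evaluation sketch: greedy decoding, then extract the final number.
# The exemplar, prompt format, and answer parsing here are assumptions.
import re
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "internlm/internlm2-math-base-7b"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)

few_shot = (
    "Question: Tom has 2 boxes with 6 pens each. How many pens does he have?\n"
    "Answer: Each box has 6 pens, so 2 boxes have 2 * 6 = 12 pens. The answer is 12.\n\n"
)
question = "Question: A book costs 8 dollars. How much do 5 books cost?\nAnswer:"
inputs = tokenizer(few_shot + question, return_tensors="pt").to(model.device)
# do_sample=False -> greedy decoding, as used for the reported numbers.
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
completion = tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
numbers = re.findall(r"-?\d+\.?\d*", completion)
print(numbers[-1] if numbers else None)  # predicted final answer
```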

SFT Performance

All results are based on greedy decoding with CoT. We notice that performance on the Hungary exam varies considerably across our checkpoints, while the other benchmarks are very stable; this may be due to the small number of Hungary exam problems.

| Model | Model Type | GSM8K | MATH | Hungary |
| --- | --- | --- | --- | --- |
| Qwen-7B-Chat | General | 51.7 | 11.6 | - |
| DeepSeek-7B-Chat | General | 63.0 | 15.8 | 28.5 |
| InternLM2-Chat-7B | General | 70.7 | 23.0 | - |
| ChatGLM3-6B | General | 53.8 | 20.4 | 32 |
| MetaMath-Mistral-7B | Mathematics | 77.7 | 28.2 | 29 |
| MetaMath-Llemma-7B | Mathematics | 69.2 | 30.0 | - |
| InternLM2-Math-7B | Mathematics | 78.1 | 34.6 | 55 |
| InternLM2-Chat-20B | General | 79.6 | 31.9 | - |
| MetaMath-Llemma-34B | Mathematics | 75.8 | 34.8 | - |
| InternLM2-Math-20B | Mathematics | 82.6 | 37.7 | 66 |
| Qwen-72B | General | 78.9 | 35.2 | 52 |
| DeepSeek-67B | General | 84.1 | 32.6 | 58 |
| ChatGPT (GPT-3.5) | General | 80.8 | 34.1 | 41 |
| GPT4 (First version) | General | 92.0 | 42.5 | 68 |

Inference

```python
import torch
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM

# Download the checkpoint from ModelScope (cached locally after the first run).
model_dir = snapshot_download("Shanghai_AI_Laboratory/internlm2-math-7b")
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is
# loaded in float32 and might cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
model = model.eval()
response, history = model.chat(tokenizer, "1+1=", history=[], meta_instruction="")
print(response)
```
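
Since `model.chat` returns the updated history, a follow-up turn can reuse it for multi-turn dialogue. The follow-up question below is just an illustration:

```python
# Multi-turn follow-up reusing the history returned above.
response, history = model.chat(tokenizer, "Now explain why 1+1=2 step by step.",
                               history=history, meta_instruction="")
print(response)
```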

Special usages

We list some instructions used in our SFT below; you can use them to prompt the model. You can prompt the model in other ways, but the following are recommended. InternLM2-Math may combine the following abilities, but this is not guaranteed.

Translate a proof problem to Lean:

[example image: l2lean3]

Using Lean 3 to solve a GSM8K problem:

[example image: gsm8k_lean]

Generate a problem based on Lean 3 code:

[example image: lean_problem]

Play the 24-point game:

[example image: 24]

Augment a harder math problem:

[example image: augment_hard]
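
To make the "Lean 3 for GSM8K" ability above concrete: the model is expected to emit small Lean 3 programs whose evaluation produces the answer. The snippet below is a hypothetical, hand-written example of that output style, not actual model output:

```lean
-- Hypothetical GSM8K-style Lean 3 output (hand-written illustration).
-- Problem: "A book costs 8 dollars. How much do 5 books cost?"
def total_cost : nat := 5 * 8

#eval total_cost                  -- prints 40
example : total_cost = 40 := rfl  -- the arithmetic checks by reflexivity
```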

| Description | Query |
| --- | --- |
| Solving question via chain-of-thought | {Question} |
| Solving question via Lean 3 | {Question}\nSolve this via Lean 3 |
| Outcome reward model | Given a question and an answer, check is it correct?\nQuestion:{Question}\nAnswer:{COT} |
| Process reward model | Given a question and an answer, check the correctness of each step.\nQuestion:{Question}\nAnswer:{COT} |
| Reward model | Given a question and two answers, which one is better? \nQuestion:{Question}\nAnswer 1:{COT}\nAnswer 2:{COT} |
| Convert chain-of-thought to Lean 3 | Convert this answer into Lean 3. Question:{Question}\nAnswer:{COT} |
| Convert Lean 3 to chain-of-thought | Convert this Lean 3 code into a natural language problem with answers:\n{LEAN Code} |
| Translate question and chain-of-thought answer to a proof statement | Convert this question and answer into a proof format.\nQuestion:{Question}\nAnswer:{COT} |
| Translate proof problem to Lean 3 | Convert this natural language statement into a Lean 3 theorem statement:{Theorem} |
| Translate Lean 3 to proof problem | Convert this Lean 3 theorem statement into natural language:{STATEMENT} |
| Suggest a tactic based on Lean state | Given the Lean 3 tactic state, suggest a next tactic:\n{LEAN State} |
| Rephrase problem | Describe this problem in another way. {Question} |
| Augment problem | Please augment a new problem based on: {Question} |
| Augment a harder problem | Increase the complexity of the problem: {Question} |
| Change specific numbers | Change specific numbers: {Question} |
| Introduce fractions or percentages | Introduce fractions or percentages: {Question} |
| Code interpreter | lagent |
| In-context learning | Question:{Question}\nAnswer:{COT}\n…Question:{Question}\nAnswer:{COT} |
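
As a usage example, the outcome reward model query from the table can be sent through the same chat interface as in the Inference section. The question/answer pair below is made up:

```python
# Build the outcome reward model query from the table above and ask the
# model to verify a (made-up) question/answer pair.
question = "A book costs 8 dollars. How much do 5 books cost?"
cot_answer = "5 books cost 5 * 8 = 40 dollars. The answer is 40."
query = ("Given a question and an answer, check is it correct?\n"
         f"Question:{question}\nAnswer:{cot_answer}")
response, _ = model.chat(tokenizer, query, history=[], meta_instruction="")
print(response)
```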

Fine-tune and others

Please refer to InternLM.

Known issues

Our model is still under development and will be upgraded. There are some possible issues with InternLM-Math. If you find that some abilities do not perform well, feel free to open an issue.

  • Skips calculation steps.
  • Performs badly on Chinese fill-in-the-blank problems and English multiple-choice problems due to the SFT data composition.
  • Tends to invoke the code interpreter when facing Chinese problems, due to the SFT data composition.
  • The reward model mode can be better leveraged with assigned token probabilities (see the sketch after this list).
  • Code-switches between languages due to the SFT data composition.
  • Some Lean abilities are only adapted to GSM8K-like problems (e.g. converting chain-of-thought to Lean 3), and Lean-related performance is not guaranteed.
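
One way to read "assigned token probabilities" is to score the verifier's judgment by the probability the model assigns to a yes/no-style token, rather than by sampled text. The sketch below only illustrates that idea; the actual judgment tokens and scoring scheme for InternLM2-Math's reward mode are not documented here, so treat the candidate words as assumptions. It reuses the `model` and `tokenizer` loaded in the Inference section.

```python
# Illustrative only: score an outcome-reward judgment by next-token probability.
# The prompt follows the outcome reward model query from the table above; the
# candidate words ("right"/"wrong") are assumptions, not a documented vocabulary.
import torch

prompt = ("Given a question and an answer, check is it correct?\n"
          "Question:1+1=?\nAnswer:1+1=2")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # distribution over the next token
probs = torch.softmax(logits.float(), dim=-1)

for word in ("right", "wrong"):
    # Assumes each word maps onto a single leading token.
    token_id = tokenizer.encode(word, add_special_tokens=False)[0]
    print(word, probs[token_id].item())     # compare assigned probabilities
```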

Citation and Tech Report

To be appended.
