math-codet5p-770m-py

Repository: https://modelscope.cn/models/zhuxunyu/math-codet5p-770m-py
License: Apache License 2.0

Model Card for math-codet5p-770m-py

We distill the math reasoning ability of the large language model gpt-3.5-turbo into the small open-source code language model Salesforce/codet5p-770m-py; the resulting math-codet5p-770m-py achieves 44.88% accuracy on the GSM8K test set.

Model Description

  • Developed by: Xunyu Zhu
  • Model type: encoder-decoder
  • Language(s) (NLP): python
  • License: apache-2.0
  • Finetuned from model: Salesforce/codet5p-770m-py

Uses

Direct Use

This model can be easily loaded with the AutoModelForSeq2SeqLM class and uses the same tokenizer as the original Salesforce/codet5p-770m-py. Given a question, the prompt "\nProgram: Let’s design executable python program (return ans) to solve the question." must be appended to the input to instruct the model to generate a reasoning program.

import func_timeout
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def safe_execute(code_string: str, keys=None):
    def execute(x):
        try:
            # Run the generated program in an isolated namespace so the
            # variables it defines (e.g. `ans`) can be retrieved reliably.
            locals_ = {}
            exec(x, locals_)
            if keys is None:
                return locals_.get('ans', None)
            return [locals_.get(k, None) for k in keys]
        except Exception:
            return None
    try:
        # Abort if the generated program runs longer than 5 seconds.
        ans = func_timeout.func_timeout(5, execute, args=(code_string,))
    except func_timeout.FunctionTimedOut:
        ans = None
    return ans

checkpoint = "zhuxunyu/math-codet5p-770m-py"
device = "cuda"  # for GPU usage, or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).to(device)

question = "Question: Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?\nProgram: Let\u2019s design executable python program (return ans) to solve the question."
input = tokenizer(question, max_length=256, padding="max_length", truncation=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**input, max_length=256)

# `generate` returns a batch of sequences; decode the first one.
generation = tokenizer.decode(output[0], skip_special_tokens=True)
ans = safe_execute(generation)
print(float(ans))

Training Details

Training Data

We prompt gpt-3.5-turbo to generate reasoning programs that solve the questions in the GSM8K training set, collecting 4 reasoning programs per question. The questions and their corresponding reasoning programs are then paired to build a training dataset, which we use to fine-tune the LM.
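The pairing step described above can be sketched as follows. The helper name `build_training_examples` and the sample programs are illustrative assumptions, not the authors' exact script; only the prompt suffix comes from the model card.

```python
# Sketch: turn one GSM8K question plus its sampled programs into
# (source, target) fine-tuning pairs. Illustrative, not the authors' code.
PROMPT_SUFFIX = ("\nProgram: Let\u2019s design executable python program "
                 "(return ans) to solve the question.")

def build_training_examples(question, programs):
    # Each of the 4 sampled programs becomes one training example that
    # shares the same instruction-augmented source text.
    source = "Question: " + question + PROMPT_SUFFIX
    return [(source, prog) for prog in programs]

# Hypothetical programs sampled from gpt-3.5-turbo for one question.
programs = [
    "ans = (16 - 3 - 4) * 2",
    "eggs_left = 16 - 3 - 4\nans = eggs_left * 2",
    "sold = 16 - (3 + 4)\nans = sold * 2",
    "ans = 2 * (16 - 7)",
]
examples = build_training_examples("Janet's ducks lay 16 eggs per day. ...", programs)
print(len(examples))  # 4 (source, target) pairs for this question
```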

Evaluation

Testing Data

The testing data is the GSM8K test set.

Results

math-codet5p-770m-py achieves 44.88% accuracy on the GSM8K test set.
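Execution-based accuracy of this kind is typically computed by executing each generated program with `safe_execute` and comparing the resulting `ans` to the gold numeric answer. A minimal sketch, where the helper names, tolerance, and sample values are assumptions:

```python
def is_correct(pred, gold, tol=1e-4):
    # Execution-based scoring: a prediction counts as correct when the
    # executed program's `ans` matches the gold numeric answer.
    if pred is None:  # program crashed or timed out
        return False
    try:
        return abs(float(pred) - float(gold)) < tol
    except (TypeError, ValueError):
        return False

def accuracy(predictions, golds):
    correct = sum(is_correct(p, g) for p, g in zip(predictions, golds))
    return correct / len(golds)

# Hypothetical mini-batch: executed answers vs. gold GSM8K answers.
preds = [18.0, None, 7.0, 70000.0]
golds = [18, 10, 7, 70000]
print(accuracy(preds, golds))  # 0.75
```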

Citation

BibTeX:

@misc{zhu2023mathcodet5plus,
  title={math-codet5p-770m-py},
  author={Xunyu Zhu and Jian Li and Yong Liu and Can Ma and Weiping Wang},
  year={2023}
}