Paraformer Speech Recognition - Chinese - General - 16k - Offline - Lightweight

Anonymous user, July 31, 2024
Categories: ai, pytorch
Open-source repository: https://modelscope.cn/models/crazyant/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-onnx

Project details

Highlights

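This model is the quantized ONNX export of Paraformer Speech Recognition - Chinese - General - 16k - Offline and can be used directly for production deployment. A one-click deployment tutorial is available (click here).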

Model conversion and test script

Test data: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/testaudio/asrexample_zh.pcm
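
The test file is raw PCM with no header, so tools that expect a WAV container may not read it directly. The sketch below wraps raw samples in a WAV header using Python's standard `wave` module. It assumes 16-bit, mono samples at 16 kHz; only the 16 kHz rate is implied by the model name, so the other two values are assumptions to check against the actual file. `pcm_to_wav` is a hypothetical helper name, not part of any library mentioned here.

```python
import wave


def pcm_to_wav(pcm_path, wav_path, sample_rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM samples in a WAV container.

    Assumed defaults (not confirmed by the source): mono, 16-bit samples.
    Returns the number of audio frames written.
    """
    with open(pcm_path, "rb") as f:
        pcm_bytes = f.read()

    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)

    return len(pcm_bytes) // (channels * sample_width)
```

If the converted audio sounds sped up, slowed down, or like noise, one of the assumed parameters is probably wrong for the file.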

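from funasr_onnx import Paraformer

# Model name on ModelScope, or a local directory downloaded from ModelScope
model_dir = "damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1"
# quantize=True loads model_quant.onnx instead of model.onnx
model = Paraformer(model_dir, batch_size=1, quantize=True)

# Placeholder: replace with the path to your local 16 kHz test audio
wav_path = "path/to/asr_example_zh.wav"

result = model(wav_path)
print(result)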
  • model_dir: the model name on ModelScope, or a local path to a model downloaded from ModelScope. A local path must contain model.onnx, config.yaml, and am.mvn
  • batch_size: 1 (default), the batch size during inference
  • device_id: -1 (default), run inference on CPU. To run inference on GPU, set it to the GPU ID (make sure onnxruntime-gpu is installed)
  • quantize: False (default), load model.onnx from model_dir. If set to True, load model_quant.onnx from model_dir
  • intra_op_num_threads: 4 (default), the number of threads used for intra-op parallelism on CPU
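
To make the `model_dir` and `quantize` notes concrete, the following sketch checks a local model directory before loading it. The file names are taken from the parameter list above; that a quantized setup needs `model_quant.onnx` alongside the same `config.yaml` and `am.mvn` is an assumption. `missing_model_files` is a hypothetical helper and is not part of `funasr_onnx`.

```python
from pathlib import Path


def missing_model_files(model_dir, quantize=False):
    """Return the expected files that are absent from a local model directory."""
    # quantize=True loads model_quant.onnx; otherwise model.onnx is loaded
    onnx_name = "model_quant.onnx" if quantize else "model.onnx"
    # Assumption: config.yaml and am.mvn are needed in both cases
    expected = [onnx_name, "config.yaml", "am.mvn"]
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from, say, `missing_model_files("./model", quantize=True)` suggests the directory matches what the list describes; any names returned point to files to download or export first.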

Reference tutorial: https://alibaba-damo-academy.github.io/FunASR/en/runtime/python/onnxruntime/README.html

Related paper and citation

@inproceedings{gao2022paraformer,
  title={Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition},
  author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
  booktitle={INTERSPEECH},
  year={2022}
}