singing_voice_conversion

Anonymous user, July 31, 2024
Category: AI
Source: https://modelscope.cn/models/AI-ModelScope/singing_voice_conversion
License: MIT


Amphion Singing Voice Conversion Pretrained Models

Quick Start

We provide a pretrained DiffWaveNetSVC checkpoint for you to try. Specifically, it is trained on real-world vocalist data (total duration: 6.16 hours) from the following 15 professional singers:

Singer | Language | Training Duration (mins)
David Tao 陶喆 | Chinese | 45.51
Eason Chan 陈奕迅 | Chinese | 43.36
Feng Wang 汪峰 | Chinese | 41.08
Jian Li 李健 | Chinese | 38.90
John Mayer | English | 30.83
Adele | English | 27.23
Ying Na 那英 | Chinese | 27.02
Yijie Shi 石倚洁 | Chinese | 24.93
Jacky Cheung 张学友 | Chinese | 18.31
Taylor Swift | English | 18.31
Faye Wong 王菲 | English | 16.78
Michael Jackson | English | 15.13
Tsai Chin 蔡琴 | Chinese | 10.12
Bruno Mars | English | 6.29
Beyonce | English | 6.06
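
As a quick sanity check, the per-singer durations can be summed to confirm the stated total of 6.16 hours. The sketch below simply hard-codes the fifteen values from the table and uses awk for the floating-point arithmetic; it is not part of the Amphion tooling.

```shell
# Per-singer training durations in minutes, copied from the table above.
durations="45.51 43.36 41.08 38.90 30.83 27.23 27.02 24.93 18.31 18.31 16.78 15.13 10.12 6.29 6.06"

# Sum the minutes and convert to hours, rounded to two decimals.
total_hours=$(echo "$durations" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%.2f", s / 60 }')

# prints: Total training duration: 6.16 hours
echo "Total training duration: ${total_hours} hours"
```

The minutes add up to 369.86, which is about 6.16 hours.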

To make these singers sing the songs you want to listen to, just run the following commands:

Step1: Download the checkpoint

git lfs install
git clone https://huggingface.co/amphion/singing_voice_conversion
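
If Git LFS is not active when the repository is cloned, large files are checked out as small text pointer stubs instead of the real weights. The following is a minimal sketch for spotting that situation; the helper name find_lfs_pointers is hypothetical, and it relies only on the fact that an LFS pointer file is under 1 KiB and begins with the line "version https://git-lfs.github.com/spec/v1".

```shell
# Print every file under the given directory that is still a Git LFS pointer
# (a small text stub) rather than the real downloaded content.
find_lfs_pointers() {
    dir="$1"
    # Pointer files are tiny, so only files under 1024 bytes are inspected.
    find "$dir" -type f -size -1024c ! -path '*/.git/*' | while IFS= read -r f; do
        if head -n 1 "$f" | grep -q '^version https://git-lfs.github.com/spec/v1'; then
            echo "$f"
        fi
    done
}

# Usage, run from the directory where the checkpoint was cloned:
# find_lfs_pointers singing_voice_conversion
```

If the function prints any paths, running `git lfs pull` inside the cloned repository should fetch the actual files.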

Step2: Clone Amphion's source code from GitHub

git clone https://github.com/open-mmlab/Amphion.git

Step3: Specify the checkpoint's path

Use a symbolic link to point to the checkpoint downloaded in the first step:

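cd Amphion
mkdir -p ckpts/svc
# A relative symlink target is resolved from the link's own directory (ckpts/svc), hence ../../../
ln -s ../../../singing_voice_conversion/vocalist_l1_contentvec+whisper ckpts/svc/vocalist_l1_contentvec+whisper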
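
A relative symlink target is resolved relative to the directory that contains the link, not the current working directory, so it is easy to end up with a dangling link. A minimal check, assuming a POSIX shell with readlink available (the helper name check_ckpt_link is hypothetical):

```shell
# Return success only if the path is a symlink whose target actually exists.
check_ckpt_link() {
    link="$1"
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi
    # -e follows the symlink, so it fails for a dangling link.
    if [ ! -e "$link" ]; then
        echo "dangling symlink: $link -> $(readlink "$link")" >&2
        return 1
    fi
    echo "ok: $link"
}

# Usage, run from inside the Amphion directory:
# check_ckpt_link "ckpts/svc/vocalist_l1_contentvec+whisper"
```

If the link is reported as dangling, recreating it with an absolute target path is one way to avoid counting directory levels.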

Step4: Conversion

You can follow the recipe in egs/svc/MultipleContentsSVC to run the conversion. For example, to make Taylor Swift sing the songs in [Your Audios Folder], run:

sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
    --config "ckpts/svc/vocalist_l1_contentvec+whisper/args.json" \
    --infer_expt_dir "ckpts/svc/vocalist_l1_contentvec+whisper" \
    --infer_output_dir "ckpts/svc/vocalist_l1_contentvec+whisper/result" \
    --infer_source_audio_dir [Your Audios Folder] \
    --infer_target_speaker "vocalist_l1_TaylorSwift" \
    --infer_key_shift "autoshift"
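
For repeated conversions it can help to wrap the command above in a small function. The sketch below is only an illustration: run_svc, CKPT_DIR, and DRY_RUN are hypothetical names introduced here, while the flags and paths are exactly those of the command above. It refuses to start if the source audio directory does not exist, and with DRY_RUN=1 it prints the command instead of executing it.

```shell
# Checkpoint directory as linked under the Amphion directory.
CKPT_DIR="ckpts/svc/vocalist_l1_contentvec+whisper"

# Hypothetical helper: run_svc <source audio dir> <target speaker>
run_svc() {
    audio_dir="$1"
    speaker="$2"
    if [ ! -d "$audio_dir" ]; then
        echo "source audio directory not found: $audio_dir" >&2
        return 1
    fi
    # Build the same command line as above, with the two arguments filled in.
    set -- sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
        --config "$CKPT_DIR/args.json" \
        --infer_expt_dir "$CKPT_DIR" \
        --infer_output_dir "$CKPT_DIR/result" \
        --infer_source_audio_dir "$audio_dir" \
        --infer_target_speaker "$speaker" \
        --infer_key_shift "autoshift"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"   # print the command without running it
    else
        "$@"
    fi
}

# Usage, run from inside the Amphion directory:
# run_svc /path/to/audios vocalist_l1_TaylorSwift
```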

Note: The supported infer_target_speaker values can be seen here.

Citations

@article{zhang2023leveraging,
  title={Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion},
  author={Zhang, Xueyao and Gu, Yicheng and Chen, Haopeng and Fang, Zihao and Zou, Lexiao and Xue, Liumeng and Wu, Zhizheng},
  journal={Machine Learning for Audio Workshop, NeurIPS 2023},
  year={2023}
}