Amphion Singing Voice Conversion Pretrained Models

Quick Start

We provide a pretrained DiffWaveNetSVC checkpoint for you to try. It was trained on real-world vocalist data (total duration: 6.16 hours) from the following 15 professional singers:

Singer	Language	Training Duration (mins)
David Tao 陶喆	Chinese	45.51
Eason Chan 陈奕迅	Chinese	43.36
Feng Wang 汪峰	Chinese	41.08
Jian Li 李健	Chinese	38.90
John Mayer	English	30.83
Adele	English	27.23
Ying Na 那英	Chinese	27.02
Yijie Shi 石倚洁	Chinese	24.93
Jacky Cheung 张学友	Chinese	18.31
Taylor Swift	English	18.31
Faye Wong 王菲	English	16.78
Michael Jackson	English	15.13
Tsai Chin 蔡琴	Chinese	10.12
Bruno Mars	English	6.29
Beyonce	English	6.06
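
As a quick sanity check, the per-singer durations in the table can be summed and converted to hours. The sketch below hard-codes the fifteen values from the table and assumes awk is available on the system; the variable name total_hours is only illustrative.

```shell
# Sum the per-singer training durations (in minutes) copied from the table
# above, then convert the total to hours with two decimal places.
total_hours=$(awk '{ sum += $1 } END { printf "%.2f", sum / 60 }' <<'EOF'
45.51
43.36
41.08
38.90
30.83
27.23
27.02
24.93
18.31
18.31
16.78
15.13
10.12
6.29
6.06
EOF
)
echo "$total_hours"   # prints 6.16
```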

To make these singers sing the songs you want to listen to, just run the following commands:

Step1: Download the checkpoint

git lfs install
git clone https://huggingface.co/amphion/singing_voice_conversion
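
If Git LFS is not active when the repository is cloned, large files can be left as small text pointer files instead of the real checkpoint data. The sketch below looks for such pointers by their first line, which names the Git LFS spec. The helper name find_lfs_pointers is hypothetical and not part of Amphion or Git LFS.

```shell
# Hypothetical helper: list files under a directory that are still Git LFS
# pointer files. No output means no pointer files were found.
find_lfs_pointers() {
    # Skip the .git directory; pointer files are tiny, so only inspect
    # files of at most 1 KiB and match the LFS spec line at the start of a line.
    find "$1" -path '*/.git' -prune -o -type f -size -2k \
        -exec grep -l '^version https://git-lfs.github.com/spec/v1' {} +
}
```

For example, running `find_lfs_pointers singing_voice_conversion` after Step1 should print nothing if every LFS file was downloaded. If it lists files, running `git lfs pull` inside the cloned repository fetches the real contents.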

Step2: Clone the Amphion source code from GitHub

git clone https://github.com/open-mmlab/Amphion.git

Step3: Specify the checkpoint's path

Create a symbolic link to the checkpoint downloaded in the first step:

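cd Amphion
# -p also creates ckpts if it does not exist yet
mkdir -p ckpts/svc
# Use an absolute target: a relative target would be resolved from ckpts/svc, not from the current directory
ln -s "$PWD/../singing_voice_conversion/vocalist_l1_contentvec+whisper" ckpts/svc/vocalist_l1_contentvec+whisper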
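
A symbolic link is created successfully even when its target does not exist, so it may be worth confirming that the link resolves before running the conversion. The sketch below checks that the path is a symlink and that args.json (the config file used in the conversion command) is reachable through it. The helper name check_ckpt_link is hypothetical.

```shell
# Hypothetical helper: verify that a checkpoint symlink resolves and that
# args.json can be reached through it.
check_ckpt_link() {
    if [ ! -L "$1" ]; then
        echo "not a symlink: $1"
        return 1
    fi
    # -e follows the link, so this test fails for a dangling symlink.
    if [ ! -e "$1/args.json" ]; then
        echo "dangling or incomplete: $1"
        return 1
    fi
    echo "ok: $1"
}
```

From inside the Amphion directory, `check_ckpt_link ckpts/svc/vocalist_l1_contentvec+whisper` prints an "ok" line when the link is usable and returns a non-zero status otherwise.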

Step4: Conversion

Follow the MultipleContentsSVC recipe to run the conversion. For example, to make Taylor Swift sing the songs in [Your Audios Folder], run:

sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
    --config "ckpts/svc/vocalist_l1_contentvec+whisper/args.json" \
    --infer_expt_dir "ckpts/svc/vocalist_l1_contentvec+whisper" \
    --infer_output_dir "ckpts/svc/vocalist_l1_contentvec+whisper/result" \
    --infer_source_audio_dir [Your Audios Folder] \
    --infer_target_speaker "vocalist_l1_TaylorSwift" \
    --infer_key_shift "autoshift"

Note: The supported infer_target_speaker values can be seen here.
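
The conversion command in Step4 can be wrapped in a small shell function so that the target speaker and the source folder become arguments. This is only a convenience sketch: the function name svc_convert and the DRY_RUN variable are hypothetical, and the flags are copied unchanged from the command above. Only vocalist_l1_TaylorSwift is a speaker value confirmed by this document.

```shell
# Hypothetical wrapper around the Step4 command; not part of Amphion.
# Usage: svc_convert <target_speaker> <source_audio_dir>
# Set DRY_RUN=1 to print the command instead of running it.
svc_convert() {
    speaker="$1"      # e.g. vocalist_l1_TaylorSwift
    audio_dir="$2"    # folder containing the source audio
    ckpt="ckpts/svc/vocalist_l1_contentvec+whisper"
    if [ ! -d "$audio_dir" ]; then
        echo "source audio folder not found: $audio_dir" >&2
        return 1
    fi
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
        --config "$ckpt/args.json" \
        --infer_expt_dir "$ckpt" \
        --infer_output_dir "$ckpt/result" \
        --infer_source_audio_dir "$audio_dir" \
        --infer_target_speaker "$speaker" \
        --infer_key_shift "autoshift"
}
```

Run it from inside the Amphion directory, for example `svc_convert vocalist_l1_TaylorSwift /path/to/audios`, where the folder path is a placeholder for your own audio folder.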

Citations

@article{zhang2023leveraging,
  title={Leveraging Content-based Features from Multiple Acoustic Models for Singing Voice Conversion},
  author={Zhang, Xueyao and Gu, Yicheng and Chen, Haopeng and Fang, Zihao and Zou, Lexiao and Xue, Liumeng and Wu, Zhizheng},
  journal={Machine Learning for Audio Workshop, NeurIPS 2023},
  year={2023}
}
