tacotronv2_wavernn_chinese: Chinese speech synthesis with TacotronV2 + WaveRNN (TensorFlow + PyTorch) - source code

Uploader: 42115513 | Upload time: 2021-09-14 09:42:54 | File size: 158.96MB | File type: ZIP
TacotronV2 + WaveRNN. Updated 2020-10-03: added a fine-tuning branch.

- An acoustic model is trained on an open-source Chinese speech dataset (female voice) to convert Chinese text into acoustic features (Mel spectrograms).
- In GTA (ground-truth aligned) mode, the trained TacotronV2 synthesizes the Mel features for the Chinese text of the Biaobei (标贝) speech dataset; these serve as the vocoder's training data.
- At synthesis time, TacotronV2 and WaveRNN together produce high-quality, highly natural Chinese speech.
- Picking the speech data of any single speaker and fine-tuning part of TacotronV2's parameters yields speaker adaptation (see the fine-tuning sketch below).
- The TacotronV2 Chinese TTS service is deployed with TensorFlow Serving + Flask (see the gRPC client sketch below).
- Because location-sensitive attention models long sentences poorly (skipped or repeated words), the attention variants shipped under tacotron/models/ (GMM, Graves, and forward attention) were tried; they effectively fix long-sentence modeling and speed up convergence (see the attention formula below).
- tensorflow-gpu version: 1.14.0.
- Synthesized samples for judging the TTS quality are in the demo/ directory.
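The speaker-adaptation step only updates a subset of TacotronV2's variables. Below is a minimal, self-contained TF 1.14 sketch of that pattern, restricting the optimizer's var_list to one variable scope; the toy 'encoder'/'decoder' scopes, shapes, and loss are placeholders, not the repository's actual graph.

    import numpy as np
    import tensorflow as tf  # tensorflow-gpu 1.14, graph mode

    # Toy stand-in for a pretrained model: variables under 'encoder' stay frozen,
    # only the variables under 'decoder' are updated during fine-tuning.
    with tf.variable_scope('encoder'):
        enc_w = tf.get_variable('w', shape=[4, 4])
    with tf.variable_scope('decoder'):
        dec_w = tf.get_variable('w', shape=[4, 1])

    x = tf.placeholder(tf.float32, [None, 4])
    y = tf.placeholder(tf.float32, [None, 1])
    pred = tf.matmul(tf.matmul(x, enc_w), dec_w)
    loss = tf.reduce_mean(tf.square(pred - y))

    # The key idea: pass only the variables to adapt in var_list.
    finetune_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='decoder')
    train_op = tf.train.AdamOptimizer(1e-4).minimize(loss, var_list=finetune_vars)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # A real fine-tuning run would restore the pretrained checkpoint instead, e.g.
        # tf.train.Saver().restore(sess, 'logs-Tacotron-2/taco_pretrained/tacotron_model.ckpt-206500')
        sess.run(train_op, {x: np.random.rand(8, 4), y: np.random.rand(8, 1)})

Only the gradients of dec_w are applied; enc_w keeps its pretrained value, which is what makes adapting to a small single-speaker dataset feasible.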
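website/app/tensorflow_grpc.py talks to TensorFlow Serving over gRPC from the Flask app. The following is a generic Predict-client sketch, not the repository's actual code: the model name 'tacotron2', the signature name, and the tensor keys 'inputs', 'input_lengths', and 'mel_outputs' are placeholders that would have to match whatever tacotron_model_export.py actually exports.

    import grpc
    import numpy as np
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    def synthesize_mel(pinyin_ids, host='localhost:8500'):
        """Send one input-ID sequence to TensorFlow Serving and return the Mel spectrogram."""
        channel = grpc.insecure_channel(host)
        stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

        request = predict_pb2.PredictRequest()
        request.model_spec.name = 'tacotron2'                  # placeholder model name
        request.model_spec.signature_name = 'serving_default'  # placeholder signature
        request.inputs['inputs'].CopyFrom(                     # placeholder tensor key
            tf.make_tensor_proto(np.asarray([pinyin_ids], dtype=np.int32)))
        request.inputs['input_lengths'].CopyFrom(              # placeholder tensor key
            tf.make_tensor_proto(np.asarray([len(pinyin_ids)], dtype=np.int32)))

        response = stub.Predict(request, timeout=30.0)
        return tf.make_ndarray(response.outputs['mel_outputs'])  # placeholder output key

The Flask view would then hand the returned Mel frames to WaveRNN (or Griffin-Lim) to produce the waveform.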
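On why the alternative attention modules under tacotron/models/ (gmm_attention.py, graves_attention.py, forward_attention.py) help with long sentences: Graves-style GMM attention, in its usual formulation (the exact parameterization in the repository may differ), computes the alignment at decoder step i over encoder positions j as

\[
\alpha_i(j) = \sum_{k=1}^{K} w_{i,k}\,
  \exp\!\left(-\frac{(j-\mu_{i,k})^{2}}{2\sigma_{i,k}^{2}}\right),
\qquad
\mu_{i,k} = \mu_{i-1,k} + \exp(\hat{\delta}_{i,k}),
\]

where the mixture weights $w_{i,k}$, step sizes $\hat{\delta}_{i,k}$, and widths $\sigma_{i,k}$ are predicted from the decoder state. Since $\exp(\hat{\delta}_{i,k}) > 0$, each mixture mean can only move forward along the input, so the alignment cannot jump back or stall; this monotonic bias removes the skipped and repeated syllables that soft location-sensitive attention shows on long inputs, and it also speeds up convergence because the model does not have to learn monotonicity from scratch.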


Resource details

[{"title":"( 94 个子文件 158.96MB ) tacotronv2_wavernn_chinese:tacotronV2 + wavernn 实现中文语音合成(Tensorflow + pytorch)-源码","children":[{"title":"tacotronv2_wavernn_chinese-master","children":[{"title":"logs_wavernn","children":[{"title":"checkpoints","children":[{"title":"latest_optim.pyt <span style='color:#111;'> 36.21MB </span>","children":null,"spread":false},{"title":"latest_weights.pyt <span style='color:#111;'> 18.14MB </span>","children":null,"spread":false}],"spread":true}],"spread":true},{"title":".gitignore <span style='color:#111;'> 1.88KB </span>","children":null,"spread":false},{"title":"images","children":[{"title":"website.png <span style='color:#111;'> 162.71KB </span>","children":null,"spread":false},{"title":"post_result.png <span style='color:#111;'> 102.00KB </span>","children":null,"spread":false}],"spread":true},{"title":"requirements.txt <span style='color:#111;'> 318B </span>","children":null,"spread":false},{"title":"wavernn_hparams.py <span style='color:#111;'> 2.49KB </span>","children":null,"spread":false},{"title":"website","children":[{"title":"app","children":[{"title":"plot.py <span style='color:#111;'> 684B </span>","children":null,"spread":false},{"title":"templates","children":[{"title":"index.html <span style='color:#111;'> 4.28KB </span>","children":null,"spread":false}],"spread":true},{"title":"__init__.py <span style='color:#111;'> 70B </span>","children":null,"spread":false},{"title":"views.py <span style='color:#111;'> 3.76KB </span>","children":null,"spread":false},{"title":"tensorflow_grpc.py <span style='color:#111;'> 1021B </span>","children":null,"spread":false},{"title":"text_to_pyin.py <span style='color:#111;'> 6.71KB </span>","children":null,"spread":false},{"title":"audio.py <span style='color:#111;'> 2.77KB </span>","children":null,"spread":false},{"title":"text.py <span style='color:#111;'> 2.39KB </span>","children":null,"spread":false}],"spread":true},{"title":"run.py <span style='color:#111;'> 71B </span>","children":null,"spread":false},{"title":"README.md <span style='color:#111;'> 1.11KB </span>","children":null,"spread":false}],"spread":true},{"title":"logs-Tacotron-2","children":[{"title":"taco_pretrained","children":[{"title":"tacotron_model.ckpt-206500.meta <span style='color:#111;'> 4.31MB </span>","children":null,"spread":false},{"title":"tacotron_model.ckpt-206500.index <span style='color:#111;'> 9.78KB </span>","children":null,"spread":false},{"title":"tacotron_model.ckpt-206500.data-00000-of-00001 <span style='color:#111;'> 59.09MB </span>","children":null,"spread":false},{"title":"checkpoint <span style='color:#111;'> 109B </span>","children":null,"spread":false}],"spread":true}],"spread":true},{"title":".DS_Store <span style='color:#111;'> 6.00KB </span>","children":null,"spread":false},{"title":"tacotron_synthesize.py <span style='color:#111;'> 9.84KB </span>","children":null,"spread":false},{"title":"tacotron_hparams.py <span style='color:#111;'> 18.58KB </span>","children":null,"spread":false},{"title":"tacotron_preprocess.py <span style='color:#111;'> 4.52KB </span>","children":null,"spread":false},{"title":"README.md <span style='color:#111;'> 8.12KB </span>","children":null,"spread":false},{"title":"tacotron_model_export.py <span style='color:#111;'> 2.30KB </span>","children":null,"spread":false},{"title":"demo","children":[{"title":"04-forward-griffin_lim-speaker-adaptive.wav <span style='color:#111;'> 18.27MB </span>","children":null,"spread":false},{"title":"04-graves-griffin_lim.wav <span style='color:#111;'> 20.68MB 
</span>","children":null,"spread":false},{"title":"03-forward-griffin_lim-speaker-adaptive.wav <span style='color:#111;'> 177.83KB </span>","children":null,"spread":false},{"title":"03-graves-griffin_lim.wav <span style='color:#111;'> 179.97KB </span>","children":null,"spread":false},{"title":"03-forward-wavernn.wav <span style='color:#111;'> 345.96KB </span>","children":null,"spread":false},{"title":"04-forward-griffin_lim.wav <span style='color:#111;'> 17.24MB </span>","children":null,"spread":false},{"title":"03-forward-griffin_lim.wav <span style='color:#111;'> 172.99KB </span>","children":null,"spread":false},{"title":"demo.html <span style='color:#111;'> 386B </span>","children":null,"spread":false},{"title":"05-forward-griffin_lim.wav <span style='color:#111;'> 441.55KB </span>","children":null,"spread":false},{"title":"01-graves-griffin_lim.wav <span style='color:#111;'> 710.64KB </span>","children":null,"spread":false},{"title":"02-forward-griffin_lim-speaker-adaptive.wav <span style='color:#111;'> 895.94KB </span>","children":null,"spread":false},{"title":"D8_766.mp3 <span style='color:#111;'> 31.75KB </span>","children":null,"spread":false},{"title":"02-graves-griffin_lim.wav <span style='color:#111;'> 1.00MB </span>","children":null,"spread":false},{"title":"README.md <span style='color:#111;'> 7.67KB </span>","children":null,"spread":false},{"title":"01-forward-griffin_lim-speaker-adaptive.wav <span style='color:#111;'> 669.82KB </span>","children":null,"spread":false},{"title":"05-forward-wavernn.wav <span style='color:#111;'> 883.06KB </span>","children":null,"spread":false},{"title":"01-forward-wavernn.wav <span style='color:#111;'> 1.29MB </span>","children":null,"spread":false},{"title":"02-forward-griffin_lim.wav <span style='color:#111;'> 845.45KB </span>","children":null,"spread":false},{"title":"02-forward-wavernn.wav <span style='color:#111;'> 1.65MB </span>","children":null,"spread":false},{"title":"01-forward-griffin_lim.wav <span style='color:#111;'> 662.84KB </span>","children":null,"spread":false}],"spread":false},{"title":"wavernn","children":[{"title":"models","children":[{"title":"fatchord_version.py <span style='color:#111;'> 14.94KB </span>","children":null,"spread":false},{"title":"deepmind_version.py <span style='color:#111;'> 7.18KB </span>","children":null,"spread":false}],"spread":false},{"title":"utils","children":[{"title":"checkpoints.py <span style='color:#111;'> 4.87KB </span>","children":null,"spread":false},{"title":"__init__.py <span style='color:#111;'> 3.88KB </span>","children":null,"spread":false},{"title":"dsp.py <span style='color:#111;'> 2.69KB </span>","children":null,"spread":false},{"title":"distribution.py <span style='color:#111;'> 4.70KB </span>","children":null,"spread":false},{"title":"paths.py <span style='color:#111;'> 1.16KB </span>","children":null,"spread":false},{"title":"dataset.py <span style='color:#111;'> 4.05KB </span>","children":null,"spread":false},{"title":"display.py <span style='color:#111;'> 2.83KB </span>","children":null,"spread":false},{"title":"files.py <span style='color:#111;'> 224B </span>","children":null,"spread":false}],"spread":false}],"spread":false},{"title":"wavernn_train.py <span style='color:#111;'> 5.59KB </span>","children":null,"spread":false},{"title":"tacotron_train.py <span style='color:#111;'> 2.75KB </span>","children":null,"spread":false},{"title":"tacotron","children":[{"title":"datasets","children":[{"title":"preprocessor.py <span style='color:#111;'> 4.90KB 
</span>","children":null,"spread":false},{"title":"audio.py <span style='color:#111;'> 12.11KB </span>","children":null,"spread":false}],"spread":false},{"title":"models","children":[{"title":"tacotron_gmm.py <span style='color:#111;'> 14.77KB </span>","children":null,"spread":false},{"title":"graves_attention.py <span style='color:#111;'> 4.99KB </span>","children":null,"spread":false},{"title":"__init__.py <span style='color:#111;'> 174B </span>","children":null,"spread":false},{"title":"tacotron.py <span style='color:#111;'> 15.83KB </span>","children":null,"spread":false},{"title":"gmm_attention.py <span style='color:#111;'> 2.63KB </span>","children":null,"spread":false},{"title":"location_sensitive_attention.py <span style='color:#111;'> 10.10KB </span>","children":null,"spread":false},{"title":"Architecture_wrappers_gmm.py <span style='color:#111;'> 9.59KB </span>","children":null,"spread":false},{"title":"custom_decoder.py <span style='color:#111;'> 4.78KB </span>","children":null,"spread":false},{"title":"Architecture_wrappers.py <span style='color:#111;'> 9.79KB </span>","children":null,"spread":false},{"title":"forward_attention.py <span style='color:#111;'> 10.99KB </span>","children":null,"spread":false},{"title":"attention.py <span style='color:#111;'> 11.04KB </span>","children":null,"spread":false},{"title":"modules.py <span style='color:#111;'> 18.02KB </span>","children":null,"spread":false},{"title":"helpers.py <span style='color:#111;'> 8.05KB </span>","children":null,"spread":false}],"spread":false},{"title":"synthesize.py <span style='color:#111;'> 5.29KB </span>","children":null,"spread":false},{"title":"utils","children":[{"title":"plot.py <span style='color:#111;'> 2.68KB </span>","children":null,"spread":false},{"title":"__init__.py <span style='color:#111;'> 444B </span>","children":null,"spread":false},{"title":"symbols.py <span style='color:#111;'> 1.14KB </span>","children":null,"spread":false},{"title":"cleaners.py <span style='color:#111;'> 2.37KB </span>","children":null,"spread":false},{"title":"numbers.py <span style='color:#111;'> 2.07KB </span>","children":null,"spread":false},{"title":"cmudict.py <span style='color:#111;'> 1.88KB </span>","children":null,"spread":false},{"title":"text.py <span style='color:#111;'> 2.56KB </span>","children":null,"spread":false},{"title":"infolog.py <span style='color:#111;'> 1.23KB </span>","children":null,"spread":false}],"spread":false},{"title":"pinyin","children":[{"title":"parse_text_to_pyin.py <span style='color:#111;'> 6.73KB </span>","children":null,"spread":false},{"title":"pinyin.txt <span style='color:#111;'> 897.52KB </span>","children":null,"spread":false},{"title":"large_pinyin.txt <span style='color:#111;'> 8.37MB </span>","children":null,"spread":false}],"spread":false},{"title":"synthesizer.py <span style='color:#111;'> 6.08KB </span>","children":null,"spread":false},{"title":"feeder.py <span style='color:#111;'> 7.05KB </span>","children":null,"spread":false},{"title":"train.py <span style='color:#111;'> 10.57KB </span>","children":null,"spread":false}],"spread":false},{"title":"wavernn_gen.py <span style='color:#111;'> 5.16KB </span>","children":null,"spread":false},{"title":"wavernn_preprocess.py <span style='color:#111;'> 6.39KB </span>","children":null,"spread":false},{"title":"index.html <span style='color:#111;'> 13.65KB </span>","children":null,"spread":false},{"title":"train.txt <span style='color:#111;'> 1.93MB </span>","children":null,"spread":false},{"title":"read_checkpoint.py <span 
style='color:#111;'> 531B </span>","children":null,"spread":false}],"spread":false}],"spread":true}]
