Transformer-TTS: A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"

Uploader: 42099906 | Upload time: 2022-07-14 15:19:35 | File size: 1.51MB | File type: ZIP
Transformer-TTS, a PyTorch implementation. Compared with well-known seq2seq models such as Tacotron, this model trains roughly 3 to 4 times faster, and the quality of the synthesized speech is nearly the same; experiments confirmed that each training step takes about 0.5 seconds. Instead of a WaveNet vocoder, the post-network is learned with Tacotron's CBHG module, and the Griffin-Lim algorithm converts the predicted spectrogram back into a raw waveform.

Requirements: install Python 3 and pytorch == 0.4.0, then install the dependencies with pip install -r requirements.txt.

Data: the LJSpeech dataset, which consists of paired text transcripts and wav files, was used. The complete dataset (13,100 pairs) is available for download. The preprocessing code was adapted from two existing repositories.

Pretrained models: pretrained checkpoints can be downloaded (160K steps for the AR model, 100K steps for the Postnet); place them in the checkpoint/ directory.

Attention plots: diagonal alignment appears after about 15k steps. The attention plots below are shown at 16…
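The spectrogram-to-waveform step described above can be reproduced with librosa's built-in Griffin-Lim implementation. A minimal sketch, assuming a linear-frequency magnitude spectrogram as input and illustrative STFT parameters; the repo's actual settings live in hyperparams.py and may differ:

```python
# Minimal Griffin-Lim sketch: invert a magnitude spectrogram to a waveform.
# hop_length/win_length/n_iter below are illustrative assumptions, not the
# repo's actual values (those are defined in hyperparams.py).
import librosa
import numpy as np
import soundfile as sf

def spectrogram_to_wav(mag: np.ndarray, hop_length: int = 256,
                       win_length: int = 1024, n_iter: int = 60) -> np.ndarray:
    """Iteratively estimate phase and reconstruct audio from `mag`,
    a (freq_bins, frames) linear magnitude spectrogram."""
    return librosa.griffinlim(mag, n_iter=n_iter,
                              hop_length=hop_length, win_length=win_length)

# Hypothetical usage: `mag` would be the Postnet's predicted spectrogram.
# wav = spectrogram_to_wav(mag)
# sf.write("sample.wav", wav, 22050)  # LJSpeech audio is sampled at 22,050 Hz
```

More iterations (n_iter) generally yield cleaner phase estimates at the cost of synthesis time.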

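For the text/wav pairing described in the Data section, LJSpeech-1.1 ships a metadata.csv whose pipe-separated rows map a file id to its raw and normalized transcript. A minimal loading sketch under that standard layout; the repo's own pipeline is in prepare_data.py and preprocess.py:

```python
# Minimal sketch of reading the 13,100 (wav, transcript) pairs from
# LJSpeech-1.1. The directory layout is the dataset's standard one;
# this is not the repo's own loader.
import csv
from pathlib import Path

def load_ljspeech_pairs(root: str):
    root = Path(root)
    pairs = []
    # Rows look like: LJ001-0001|raw text|normalized text
    with open(root / "metadata.csv", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for file_id, _raw, normalized in reader:
            pairs.append((root / "wavs" / f"{file_id}.wav", normalized))
    return pairs  # expect 13,100 pairs

# pairs = load_ljspeech_pairs("LJSpeech-1.1")
```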

Resource details

[{"title":"( 63 个子文件 1.51MB ) Transformer-TTS:Pytorch实现的“基于变压器网络的神经语音合成”","children":[{"title":"Transformer-TTS-master","children":[{"title":".gitignore <span style='color:#111;'> 119B </span>","children":null,"spread":false},{"title":"text","children":[{"title":"__init__.py <span style='color:#111;'> 2.15KB </span>","children":null,"spread":false},{"title":"symbols.py <span style='color:#111;'> 702B </span>","children":null,"spread":false},{"title":"cleaners.py <span style='color:#111;'> 2.36KB </span>","children":null,"spread":false},{"title":"numbers.py <span style='color:#111;'> 2.09KB </span>","children":null,"spread":false},{"title":"cmudict.py <span style='color:#111;'> 1.91KB </span>","children":null,"spread":false}],"spread":true},{"title":"requirements.txt <span style='color:#111;'> 106B </span>","children":null,"spread":false},{"title":"hyperparams.py <span style='color:#111;'> 742B </span>","children":null,"spread":false},{"title":"synthesis.py <span style='color:#111;'> 2.10KB </span>","children":null,"spread":false},{"title":"train_postnet.py <span style='color:#111;'> 2.00KB </span>","children":null,"spread":false},{"title":"network.py <span style='color:#111;'> 6.29KB </span>","children":null,"spread":false},{"title":"samples","children":[{"title":"test.wav <span style='color:#111;'> 428.67KB </span>","children":null,"spread":false}],"spread":true},{"title":"LICENSE <span style='color:#111;'> 1.04KB </span>","children":null,"spread":false},{"title":"train_transformer.py <span style='color:#111;'> 3.79KB </span>","children":null,"spread":false},{"title":"utils.py <span style='color:#111;'> 4.12KB </span>","children":null,"spread":false},{"title":"png","children":[{"title":"mel_original.png <span style='color:#111;'> 54.67KB </span>","children":null,"spread":false},{"title":"attention_encoder","children":[{"title":"attention_enc_0_2.png <span style='color:#111;'> 395B </span>","children":null,"spread":false},{"title":"attention_enc_0_3.png <span style='color:#111;'> 529B </span>","children":null,"spread":false},{"title":"attention_enc_1_3.png <span style='color:#111;'> 5.88KB </span>","children":null,"spread":false},{"title":"attention_enc_1_0.png <span style='color:#111;'> 780B </span>","children":null,"spread":false},{"title":"attention_enc_0_0.png <span style='color:#111;'> 658B </span>","children":null,"spread":false},{"title":"attention_enc_2_1.png <span style='color:#111;'> 4.10KB </span>","children":null,"spread":false},{"title":"attention_enc_2_3.png <span style='color:#111;'> 1.19KB </span>","children":null,"spread":false},{"title":"attention_enc_2_0.png <span style='color:#111;'> 6.39KB </span>","children":null,"spread":false},{"title":"attention_enc_1_2.png <span style='color:#111;'> 6.13KB </span>","children":null,"spread":false},{"title":"attention_enc_2_2.png <span style='color:#111;'> 4.75KB </span>","children":null,"spread":false},{"title":"attention_enc_1_1.png <span style='color:#111;'> 6.49KB </span>","children":null,"spread":false},{"title":"attention_enc_0_1.png <span style='color:#111;'> 758B </span>","children":null,"spread":false}],"spread":false},{"title":"mel_pred.png <span style='color:#111;'> 49.92KB </span>","children":null,"spread":false},{"title":"attention","children":[{"title":"attention_0_2.png <span style='color:#111;'> 28.26KB </span>","children":null,"spread":false},{"title":"attention_1_0.png <span style='color:#111;'> 4.62KB </span>","children":null,"spread":false},{"title":"attention_0_0.png <span style='color:#111;'> 23.06KB 
</span>","children":null,"spread":false},{"title":"attention_0_3.png <span style='color:#111;'> 14.14KB </span>","children":null,"spread":false},{"title":"attention_2_1.png <span style='color:#111;'> 15.48KB </span>","children":null,"spread":false},{"title":"attention_2_3.png <span style='color:#111;'> 18.86KB </span>","children":null,"spread":false},{"title":"attention_0_1.png <span style='color:#111;'> 6.28KB </span>","children":null,"spread":false},{"title":"attention_2_2.png <span style='color:#111;'> 6.19KB </span>","children":null,"spread":false},{"title":"attention_1_2.png <span style='color:#111;'> 4.45KB </span>","children":null,"spread":false},{"title":"attention_1_1.png <span style='color:#111;'> 16.70KB </span>","children":null,"spread":false},{"title":"attention_1_3.png <span style='color:#111;'> 17.08KB </span>","children":null,"spread":false},{"title":"attention_2_0.png <span style='color:#111;'> 13.11KB </span>","children":null,"spread":false}],"spread":false},{"title":"training_loss.png <span style='color:#111;'> 113.36KB </span>","children":null,"spread":false},{"title":"attention_decoder","children":[{"title":"attention_dec_1_0.png <span style='color:#111;'> 14.64KB </span>","children":null,"spread":false},{"title":"attention_dec_0_1.png <span style='color:#111;'> 5.77KB </span>","children":null,"spread":false},{"title":"attention_dec_1_2.png <span style='color:#111;'> 19.01KB </span>","children":null,"spread":false},{"title":"attention_dec_0_3.png <span style='color:#111;'> 15.43KB </span>","children":null,"spread":false},{"title":"attention_dec_2_0.png <span style='color:#111;'> 16.10KB </span>","children":null,"spread":false},{"title":"attention_dec_0_2.png <span style='color:#111;'> 10.54KB </span>","children":null,"spread":false},{"title":"attention_dec_2_1.png <span style='color:#111;'> 17.53KB </span>","children":null,"spread":false},{"title":"attention_dec_1_3.png <span style='color:#111;'> 18.53KB </span>","children":null,"spread":false},{"title":"attention_dec_2_2.png <span style='color:#111;'> 16.63KB </span>","children":null,"spread":false},{"title":"attention_dec_0_0.png <span style='color:#111;'> 16.89KB </span>","children":null,"spread":false},{"title":"attention_dec_1_1.png <span style='color:#111;'> 15.70KB </span>","children":null,"spread":false},{"title":"attention_dec_2_3.png <span style='color:#111;'> 19.65KB </span>","children":null,"spread":false}],"spread":false},{"title":"model.png <span style='color:#111;'> 137.17KB </span>","children":null,"spread":false},{"title":"attention_encoder.gif <span style='color:#111;'> 33.84KB </span>","children":null,"spread":false},{"title":"attention.gif <span style='color:#111;'> 167.15KB </span>","children":null,"spread":false},{"title":"alphas.png <span style='color:#111;'> 98.99KB </span>","children":null,"spread":false},{"title":"attention_decoder.gif <span style='color:#111;'> 326.38KB </span>","children":null,"spread":false}],"spread":false},{"title":"README.md <span style='color:#111;'> 4.73KB </span>","children":null,"spread":false},{"title":"prepare_data.py <span style='color:#111;'> 1.32KB </span>","children":null,"spread":false},{"title":"module.py <span style='color:#111;'> 15.30KB </span>","children":null,"spread":false},{"title":"preprocess.py <span style='color:#111;'> 5.41KB </span>","children":null,"spread":false}],"spread":false}],"spread":true}]

