Korean-FastSpeech2-Pytorch: An Implementation of FastSpeech2 for Korean

Uploader: 42163404 | Upload time: 2022-12-10 23:01:40 | File size: 571KB | File type: ZIP
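Korean FastSpeech 2 - Pytorch Implementation

Introduction: With recent advances in deep-learning-based speech synthesis, non-autoregressive speech synthesis models have been proposed to overcome the slow synthesis speed of autoregressive models. FastSpeech2 is one such non-autoregressive model. It is trained on per-phoneme duration information obtained from phoneme-utterance alignments extracted with the Montreal Forced Aligner (M. McAuliffe et al., 2017), and it learns to predict the duration of each phoneme. The phoneme-utterance alignment is determined from the predicted durations, and the speech corresponding to each phoneme is generated on that basis. Training FastSpeech2 therefore requires phoneme-utterance alignment information from MFA. This project is an implementation of Microsoft's FastSpeech2. The source code is based on ming024's code and is implemented using extracted durations.

The project provides the following contributions: source code adapted to the kss dataset; text-utterance duration information for the kss dataset, extracted with the Montreal Forced Aligner (TextGrid); FastS… trained on the kss dataset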
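The role of the aligner durations described above can be illustrated with a minimal sketch. It is not taken from this repository: the function names, the phoneme symbols, and the sampling rate and hop length are all assumptions chosen for illustration, and plain Python lists stand in for the tensors a real model would use. The first function turns aligned phoneme intervals (in seconds) into per-phoneme frame counts, and the second repeats each phoneme entry by its frame count, which is the idea behind the length regulation step in FastSpeech-style models.

```python
from typing import List, Sequence, Tuple


def intervals_to_frame_durations(
    intervals: Sequence[Tuple[str, float, float]],
    sampling_rate: int = 22050,  # assumed value, not read from hparams.py
    hop_length: int = 256,       # assumed value, not read from hparams.py
) -> Tuple[List[str], List[int]]:
    """Convert (phoneme, start_sec, end_sec) intervals to frame counts.

    Interval boundaries are rounded to frame indices first, so the
    durations sum to the frame index of the last boundary.
    """
    phonemes: List[str] = []
    durations: List[int] = []
    for phoneme, start, end in intervals:
        start_frame = round(start * sampling_rate / hop_length)
        end_frame = round(end * sampling_rate / hop_length)
        phonemes.append(phoneme)
        durations.append(end_frame - start_frame)
    return phonemes, durations


def length_regulate(items: Sequence, durations: Sequence[int]) -> list:
    """Repeat each item by its duration to reach frame-level length."""
    expanded = []
    for item, duration in zip(items, durations):
        expanded.extend([item] * duration)
    return expanded


if __name__ == "__main__":
    # Hypothetical alignment: two phonemes, 100 frames per second.
    example = [("a", 0.0, 0.1), ("b", 0.1, 0.25)]
    phonemes, durations = intervals_to_frame_durations(
        example, sampling_rate=16000, hop_length=160
    )
    print(durations)                                   # [10, 15]
    print(len(length_regulate(phonemes, durations)))   # 25
```

During training the frame counts would come from the aligner output, while at synthesis time they would come from the model's duration predictor; the expansion step is the same in both cases.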


Resource details

(39 files, 571KB) Korean-FastSpeech2-Pytorch: An Implementation of FastSpeech2 for Korean

Korean-FastSpeech2-Pytorch-master/
    fastspeech2.py          2.19KB
    train.py                10.03KB
    synthesize.py           5.60KB
    hparams.py              1.70KB
    text/
        __init__.py         2.26KB
        num.py              2.07KB
        cleaners.py         2.66KB
        korean.py           767B
        symbols.py          439B
    utils.py                7.22KB
    loss.py                 1.50KB
    requirements.txt        1.78KB
    preprocessed/
        kss/
            mel_stat.npy    1.38KB
            f0_stat.npy     144B
            energy_stat.npy 144B
    dataset.py              4.79KB
    optimizer.py            1.00KB
    modules.py              5.91KB
    LICENSE                 1.05KB
    assets/
        model.png           331.34KB
        tensorboard.png     70.78KB
        melspectrogram.png  158.79KB
    README.md               6.55KB
    vocoder/
        vocgan_generator.py 9.25KB
    prepare_align.py        205B
    data/
        kss.py              4.52KB
    transformer/
        Layers.py           4.15KB
        __init__.py         137B
        Modules.py          598B
        Constants.py        108B
        Models.py           4.65KB
        SubLayers.py        2.87KB
    evaluate.py             7.82KB
    .gitignore              1.83KB
    preprocess.py           2.33KB
    audio/
        stft.py             6.05KB
        audio_processing.py 2.61KB
        __init__.py         67B
        tools.py            2.41KB
