MATLAB code for extracting mean signal features - PIT-LSTM-Speech-Separation: a TensorFlow implementation of PIT for speech separation

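Uploader: 38735119 | Uploaded: 2022-12-21 11:33:49 | File size: 5.37MB | File type: ZIP

Two-speaker LSTM/BLSTM-based PIT for speech separation

Progress in separating and recognizing multi-talker mixed speech, often called the "cocktail party problem", has been less impressive than in other areas of speech processing. Human listeners can easily pick out the individual sources in an acoustic mixture, but the same task appears to be extremely hard for computers, especially when the mixed speech is recorded with a single microphone.

1. Performance

Note: the training and validation sets contain two-speaker mixtures generated by randomly selecting speakers and utterances from the WSJ0 set and mixing them at signal-to-noise ratios (SNRs) chosen uniformly between -2.5 dB and 2.5 dB.

For LSTM, results on mixtures of different gender combinations: [result figure not included]

For BLSTM, results on mixtures of different gender combinations: [result figure not included]

According to these results, mixed-gender audio is separated better than same-gender audio, and BLSTM outperforms LSTM.

2. Evaluation metrics

SDR: signal-to-distortion ratio
SAR: signal-to-artifact ratio
SIR: signal-to-interference ratio
STOI: short-time objective intelligibility
ESTOI: extended short-time objective intelligibility
PESQ: perceptual evaluation of speech quality

3. Dependencies

matlab (my test version: R2016b, 64-bit)
tensorflow (my test version: 1.4.0)
anac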
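The SNR-based mixing described in the note on the training and validation sets can be illustrated with a minimal sketch. It is not the repository's code: the helper names `mean_power`, `mix_at_snr`, and `measure_snr_db` are hypothetical, the two signals are synthetic stand-ins for WSJ0 utterances, and level is measured as plain mean power, whereas the archive's MATLAB scripts may measure speech level differently (the archive includes `activlev.m`).

```python
import math
import random


def mean_power(signal):
    """Average of the squared samples."""
    return sum(x * x for x in signal) / len(signal)


def measure_snr_db(target, interferer):
    """Power ratio of target to interferer, in dB."""
    return 10 * math.log10(mean_power(target) / mean_power(interferer))


def mix_at_snr(target, interferer, snr_db):
    """Scale `interferer` so the target-to-interferer ratio is `snr_db`, then add.

    Returns the mixture and the scaled interferer (both truncated to the
    shorter of the two inputs).
    """
    n = min(len(target), len(interferer))
    target, interferer = target[:n], interferer[:n]
    # Interferer power must become P_target / 10^(snr_db / 10).
    gain = math.sqrt(
        mean_power(target) / (mean_power(interferer) * 10 ** (snr_db / 10))
    )
    scaled = [gain * x for x in interferer]
    mixture = [t + s for t, s in zip(target, scaled)]
    return mixture, scaled


rng = random.Random(0)
# Synthetic stand-ins for two utterances (assumption: 1 s at 8 kHz).
s1 = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
s2 = [rng.uniform(-1.0, 1.0) for _ in range(8000)]

# SNR drawn uniformly between -2.5 dB and 2.5 dB, as in the note above.
chosen_snr = rng.uniform(-2.5, 2.5)
mixture, s2_scaled = mix_at_snr(s1, s2, chosen_snr)
```

Scaling only the second signal keeps the first as the reference, so the measured ratio between `s1` and `s2_scaled` matches the drawn SNR up to floating-point rounding.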

Resource details

Archive contents (63 files, 5.37MB):

PIT-LSTM-Speech-Separation-master/
    run_lstm.py (16.23KB)
    tfrecords_io.py (5.06KB)
    signal_processing.py (8.02KB)
    gen_tfrecords.py (4.20KB)
    make_wav_list.py (1015B)
    wsj0-train-spkrinfo.txt (864B)
    utils.py (4.14KB)
    README.md (5.99KB)
    blstm.py (8.89KB)
    run.sh (5.23KB)
    1. create-speaker-mixtures-V1/
        mix_2_spk_tt.txt (230.49KB)
        mix_3_spk_cv.txt (530.56KB)
        mix_2_spk_cv.txt (374.36KB)
        mix_3_spk_tt.txt (327.25KB)
        create_wav_2speakers.m (7.51KB)
        create_wav_3speakers.m (9.25KB)
        mix_2_spk_tr.txt (1.46MB)
        readme.txt (672B)
        mix_3_spk_tr.txt (2.07MB)
    2. create-speaker-mixtures-V2/
        mix_2_spk_tt.txt (230.49KB)
        mix_3_spk_cv.txt (530.56KB)
        mix_2_spk_cv.txt (374.36KB)
        mix_3_spk_tt.txt (327.25KB)
        create_wav_2speakers.m (8.95KB)
        activlev.m (16.29KB)
        maxfilt.m (4.70KB)
        create_wav_3speakers.m (9.25KB)
        mix_2_spk_tr.txt (1.46MB)
        readme.txt (781B)
        mix_3_spk_tr.txt (2.07MB)
    3. SPHFile2Wav/
        SA1.WAV (110.20KB)
        SPH2Wav.py (435B)
        README.md (366B)
        converted.wav (109.24KB)
    4. introduction_to_mask/
        masks.png (21.02KB)
        SA2.wav (95.24KB)
        SA1.wav (104.64KB)
        recoverd2.png (19.83KB)
        Introduction to Ideal Binary Mask.ipynb (185.85KB)
        recoverd1.png (18.90KB)
        mixed.wav (125.04KB)
        MPM14-Time-Frequency-Masking.pdf (1.08MB)
        mixturesignals.png (24.97KB)
        spectrograms.png (52.45KB)
    5. step_to_CASA_DL/
        train_test_model.py (7.83KB)
        evaluation_metric.py (2.19KB)
        ProjectReport-Speech Separation in Supervised Setting.pdf (158.17KB)
        speech_preprocess.py (13.39KB)
    6. separated_result_LSTM/
        two_women_2.wav (83.79KB)
        two_women_1.wav (83.79KB)
        two_men_2.wav (96.79KB)
        one_man_one_woman_2.wav (86.79KB)
        one_man_one_woman_1.wav (86.79KB)
        two_men_1.wav (96.79KB)
    7. separated_result_BLSTM/
        two_women_2.wav (83.79KB)
        two_women_1.wav (83.79KB)
        two_men_2.wav (96.79KB)
        one_man_one_woman_2.wav (86.79KB)
        one_man_one_woman_1.wav (86.79KB)
        two_men_1.wav (96.79KB)
    8.Result picture/
        BLSTM-result.png (47.89KB)
        spectrogram.PNG (821.59KB)
        LSTM-result.png (46.13KB)
