punctuation-prediction: Supporting tools for punctuation and boundary detection on ASR output (source code)

Uploader: 42133753 | Uploaded: 2021-09-15 10:17:48 | File size: 79KB | File type: ZIP
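Punctuation prediction

Supporting tools for punctuation prediction on ASR output. Three models are provided or pointed to: a BERT-based Transformer and a seq2seq Transformer (both using PyTorch), and a bidirectional RNN (Punctuator 2) in TensorFlow 2. In addition, the process folder contains code for preprocessing text for use with these models.

The BERT-based Transformer is a token-classification Transformer from HuggingFace, used here for punctuation prediction. The sequence-to-sequence Transformer is from Fairseq and is based on the Transformer described in the paper "Attention Is All You Need". For the Transformers, all we provide is: 1) data preprocessing scripts that put the data into the format these models require for the punctuation prediction task, and 2) run files in which these models are trained for punctuation prediction.

Requirements and installation

  • Python version >= 3.6
  • NVIDIA GPU and NCCL
  • For the HuggingFace BERT-based token classifier and the Fairseq sequence-to-sequence …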
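To illustrate the token-classification framing of punctuation prediction mentioned in the description, here is a minimal sketch of how punctuated text could be turned into per-word labels and back. The label names (`COMMA`, `PERIOD`, `QUESTIONMARK`, `O`) and both function names are assumptions made for this example; the repository's own preprocessing scripts may use a different label set and file format.

```python
import re

# Hypothetical label set for illustration; the repository's labels may differ.
PUNCT_TO_LABEL = {",": "COMMA", ".": "PERIOD", "?": "QUESTIONMARK"}


def text_to_token_labels(text):
    """Turn punctuated text into (lowercased word, label) pairs.

    Each word is labeled with the punctuation mark that directly follows it,
    or "O" when no supported mark follows.
    """
    pairs = []
    # Split the text into words and the supported punctuation marks.
    for token in re.findall(r"[^\s,.?]+|[,.?]", text):
        if token in PUNCT_TO_LABEL:
            # Attach the mark to the preceding word, if it has no label yet.
            if pairs and pairs[-1][1] == "O":
                pairs[-1] = (pairs[-1][0], PUNCT_TO_LABEL[token])
        else:
            # Lowercase to mimic unpunctuated, uncased ASR output.
            pairs.append((token.lower(), "O"))
    return pairs


def restore_punctuation(pairs):
    """Rebuild text from (word, label) pairs, e.g. from model predictions."""
    label_to_punct = {label: mark for mark, label in PUNCT_TO_LABEL.items()}
    return " ".join(word + label_to_punct.get(label, "") for word, label in pairs)


if __name__ == "__main__":
    example = text_to_token_labels("Hello, how are you? Fine.")
    print(example)
    print(restore_punctuation(example))
```

In this framing, a token classifier only has to predict one label per input word, which is why the training data must be aligned word by word with its labels before training.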

Resource details

punctuation-prediction (53 files, 79KB)
  punctuation-prediction-master/
    utils/
      seqeval_error_calculator.py  1.75KB
      error_calculator.py  8.83KB
    .gitmodules  180B
    seq2seq/
      ptenv.yml  2.65KB
      wer-test-seq2seq.sh  4.16KB
      run-seq2seq.sbatch  645B
      README.md  608B
      generate.sh  1.28KB
      prepare-data-fairseqNMT.sh  2.08KB
      run-seq2seq.sh  1.43KB
      fairseq-punctuate.py  1.48KB
    transformers/
    process/
      wer_assist.py  2.97KB
      introduce_wer.py  634B
      write_to_file.py  1.93KB
      process_text.py  1.32KB
      rmh_data_cleaning.sh  13.24KB
      README.md  1.82KB
      rmh_subset_specific.ipynb  7.09KB
      preprocess_en_lower.py  2.03KB
      preprocess_truecase.py  2.02KB
      europarl_cleaning.sh  5.45KB
    LICENSE  1.05KB
    README.md  1.97KB
    punctuator2tf2/
      data.py  9.44KB
      punctuator.py  3.42KB
      main.py  5.08KB
      requirements.txt  1.12KB
      models.py  8.03KB
      README.md  2.77KB
      play_with_model.py  2.70KB
      error_calculator.py  5.41KB
    tests/
      tests.py  965B
      __init__.py  1B
    .gitignore  741B
    punctuation_package/
      MANIFEST.in  99B
      .DS_Store  8.00KB
      setup.py  1.16KB
      README.md  4.41KB
      punctuator/
        .DS_Store  6.00KB
        main.py  1.13KB
        example_input.txt  1.54KB
        path_config.json  29B
        __init__.py  154B
        models.py  8.80KB
        LICENSE  1.12KB
        api.py  11.18KB
    BERTbased/
      utils_punctuation.py  15.67KB
      run.sh  5.21KB
      predict.py  5.08KB
      wer-test.sh  3.70KB
      README.md  2.01KB
      run_punctuation.py  11.60KB
      predict_for_scoring.py  5.37KB
