========dgk_lost_conv========
Chinese conversation corpus, usable as training data for chatbots.

Results:
  dgk_shooter_z.conv    110MB  (word-segmented)
  dgk_shooter_min.conv         (segmented by character)
  lost.conv             1.7MB
  fanzxl.conv           2.3MB
  fk24.conv             4.5MB
  haosys.conv           1.3MB
  juemds.conv           793KB
  laoyj.conv            1.5MB
  prisonb.conv          543KB

Internal method:
  asstosrt -s utf-8
  ass ----asstosrt----> srt
  srt ----cvgen.py----> .conv

Special case, shooter73g:
  enter shooterwp and extract mirror.x under rawbase
  run sel.sh
  in the root directory: fixco
2023-11-09 11:39:30 126.44MB Python
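The listing does not show what a .conv file looks like inside, so the following is only a minimal sketch for loading one. It assumes a layout in which a line containing just `E` starts a new conversation and each `M <text>` line is one utterance, with `/` between tokens in the segmented files. That layout, and the function name `parse_conv`, are assumptions for illustration and are not confirmed by the description above.

```python
from typing import Iterable, List


def parse_conv(lines: Iterable[str]) -> List[List[str]]:
    """Group utterances into conversations.

    Assumed layout (not stated in the listing): "E" opens a conversation,
    "M <text>" is one utterance, and "/" separates tokens.
    """
    conversations: List[List[str]] = []
    current: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line == "E":
            # A new conversation begins; store the previous one if it has content.
            if current:
                conversations.append(current)
            current = []
        elif line.startswith("M "):
            # Drop the "M " prefix and the assumed "/" token separators.
            current.append(line[2:].replace("/", ""))
    if current:
        conversations.append(current)
    return conversations


if __name__ == "__main__":
    sample = ["E", "M 你/好/", "M 你/好/吗/", "E", "M 再/见/"]
    for conversation in parse_conv(sample):
        print(conversation)
```

For a real file, the same function could be fed an open file handle, for example `open("lost.conv", encoding="utf-8")`. Note that removing every `/` would also strip slashes that are part of an utterance, so keep the separators if the token boundaries matter.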
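The European Parliament Proceedings Parallel Corpus 1996-2011 is a corpus for statistical machine translation. The Europarl parallel corpus is drawn from the proceedings of the European Parliament and includes versions in 21 European languages: Romance (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic (Latvian, Lithuanian), and Greek. The dataset was first released in 2005 by the School of Informatics at the University of Edinburgh, Scotland, with Philipp Koehn as the main publisher. Version 7 was released in 2012. The related paper is "Europarl: A Parallel Corpus for Statistical Machine Translation".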
2023-03-16 22:52:05 39KB machine translation corpus
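As a rough illustration of how a parallel corpus like this is typically consumed, the sketch below pairs up sentences from two line-aligned text streams, where line i of one stream is the translation of line i of the other. The line-aligned two-file layout, the file names in the comment, and the helper name `aligned_pairs` are assumptions for illustration; the listing does not describe how this particular download is packaged.

```python
from typing import Iterable, Iterator, Tuple


def aligned_pairs(
    src_lines: Iterable[str],
    tgt_lines: Iterable[str],
    skip_empty: bool = True,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned streams."""
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        # An empty line on either side usually means there is no usable pair.
        if skip_empty and (not src or not tgt):
            continue
        yield src, tgt


if __name__ == "__main__":
    # In-memory stand-ins for two aligned files (hypothetical names such as
    # "europarl.fr-en.en" and "europarl.fr-en.fr" would be opened instead).
    english = ["Resumption of the session\n", "\n", "Thank you.\n"]
    french = ["Reprise de la session\n", "\n", "Merci.\n"]
    for en, fr in aligned_pairs(english, french):
        print(en, "|||", fr)
```

Because `zip` stops at the shorter input, a length mismatch between the two files is silently truncated; checking that both files have the same number of lines beforehand is a sensible safeguard.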
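Chinese names corpus (Chinese-Names-Corpus). A by-product of the hobby project "萌名 NameMoe" (a corpus-based naming tool). A mobile web beta of 萌名 is available; feel free to try it. Updated irregularly. Entries are only removed, never added. Suitable for Chinese word segmentation and person-name recognition. Please do not repackage and upload this corpus to other sites to earn points; if you have already uploaded it, please cooperate by deleting it. Thank you!
Common Chinese names (Chinese_Names_Corpus): size 1.2 million. Source: extracted from a name corpus on the scale of hundreds of millions of entries. Cleaning: cleaned, though a few bad cases remain. A name generator has been added.
Ancient Chinese names (Ancient_Names_Corpus): size 250,000. Source: compiled from multiple name dictionaries. Cleaning: cleaned.
Chinese family names (Chinese_Family_Name): size 1,000. Source: extracted from a name corpus on the scale of hundreds of millions of entries. Cleaning: cleaned.
Chinese forms of address (Chinese_Relationship): size 5,000, roots of address terms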
2023-02-23 16:26:55 17.62MB corpus names dataset dict
Speech corpus (语音语料库_part_1), TRAIN DR1. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:48 40.05MB audio dataset
Speech corpus (语音语料库_part_2), TRAIN DR2. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:48 80.61MB audio dataset
Speech corpus (语音语料库_part_3), TRAIN DR3. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:47 80.92MB audio dataset
Speech corpus (语音语料库_part_4), TRAIN DR4. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:47 74.42MB audio dataset
Speech corpus (语音语料库_part_5), TRAIN DR5. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:46 80.38MB audio dataset
Speech corpus (语音语料库_part_6), TRAIN DR6. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:45 38.72MB audio dataset
Speech corpus (语音语料库_part_7), TRAIN DR7. TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years.
2022-12-08 11:28:44 81.13MB audio dataset