Machine learning: news text classification project based on BERT and Naive Bayes (source code + course assignment + dataset).zip

Uploader: 55305220 | Uploaded: 2022-06-12 11:06:40 | File size: 67.38MB | File type: ZIP
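Machine learning course project: news text classification based on BERT and Naive Bayes (source code + course assignment + dataset). The project source is complete and straightforward to run, so it is suitable for beginners.

The data folder contains both the original dataset and the processed datasets. The .csv files are the original data; the other two files were produced by the code in News_prediction.ipynb and are loaded directly by the BERT and Naive Bayes training functions.

The result folder contains the output files produced after training the Naive Bayes and BERT models.

Task: classifying fake news on the internet into three classes: real news, fake news, and no judgment needed.
Data: 40,000 training texts and 10,000 test texts.
Feature engineering: regular expressions and Jieba.
Naive Bayes: TF-IDF features. Reported score: 87.4%.
BERT: cn-wmm pretrained word vectors, 5 epochs. Reported score: 91.4%.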
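The Naive Bayes route described above (regex cleaning, tokenization, TF-IDF weighting, classification) can be sketched roughly as follows. This is a minimal standard-library illustration, not the project's code: the cleaning patterns are hypothetical, the tokenizer is a character-level stand-in for Jieba, the label names `real`, `fake`, and `no_judgment` are invented shorthand for the three classes, and the six training sentences are made up for the example.

```python
import math
import re
from collections import Counter

# Hypothetical cleaning rules; the project's actual regular expressions are not shown.
URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&_%#-]+")
NON_TEXT_RE = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9]+")


def clean(text):
    """Remove URLs, then collapse punctuation and whitespace into single spaces."""
    return NON_TEXT_RE.sub(" ", URL_RE.sub(" ", text)).strip()


def tokenize(text):
    """Stand-in for Jieba: Latin/digit runs stay whole, each CJK character is a token."""
    return re.findall(r"[A-Za-z0-9]+|[\u4e00-\u9fff]", text)


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; tokens unseen during fitting are dropped."""
    counts = Counter(tok for tok in tokens if tok in idf)
    return {tok: count * idf[tok] for tok, count in counts.items()}


class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for vec in vectors for tok in vec}
        self.log_prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        # Sum the TF-IDF weight of every token per class.
        weights = {c: Counter() for c in self.classes}
        for vec, label in zip(vectors, labels):
            for tok, w in vec.items():
                weights[label][tok] += w
        self.log_like = {}
        for c in self.classes:
            total = sum(weights[c].values()) + self.alpha * len(self.vocab)
            self.log_like[c] = {
                tok: math.log((weights[c].get(tok, 0.0) + self.alpha) / total)
                for tok in self.vocab
            }
        return self

    def predict(self, vector):
        def score(c):
            return self.log_prior[c] + sum(
                w * self.log_like[c][tok] for tok, w in vector.items() if tok in self.vocab
            )
        return max(self.classes, key=score)


# Invented toy examples, two per class; not taken from the project's dataset.
TOY_TRAIN = [
    ("官方通报:今日气温下降 http://example.com/a", "real"),
    ("官方通报:明日气温回升", "real"),
    ("震惊!吃大蒜可以治愈一切疾病!!!", "fake"),
    ("震惊!喝醋可以治愈所有疾病", "fake"),
    ("今天午饭吃什么好呢", "no_judgment"),
    ("周末去哪里玩好呢", "no_judgment"),
]

train_tokens = [tokenize(clean(text)) for text, _ in TOY_TRAIN]
idf = fit_idf(train_tokens)
model = NaiveBayes().fit(
    [tfidf(toks, idf) for toks in train_tokens],
    [label for _, label in TOY_TRAIN],
)


def classify(text):
    """Run the full pipeline: clean, tokenize, weight, predict."""
    return model.predict(tfidf(tokenize(clean(text)), idf))


print(classify("震惊!喝茶可以治愈疾病"))
```

In practice, a word-level segmenter such as Jieba and a library implementation of TF-IDF and Naive Bayes would replace the hand-written pieces; the structure of the pipeline stays the same.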

Resource details

Archive contents (23 files, 67.38MB):

TEXTSPLIT-main/
    CS1906吴官骏实验报告.docx  (773.41KB)
    .gitignore  (203B)
    kaggle_bertversion.ipynb  (14.33KB)
    result/
        result_bayes.txt  (22.55KB)
        result_bert.txt  (22.76KB)
    README.md  (1.81KB)
    News_preprocessing_steps.ipynb  (2.59MB)
    data.ipynb  (30.67KB)
    split_dataset  (38.32MB)
    Bert_train.ipynb  (6.59KB)
    split_testset  (9.56MB)
    News_prediction.ipynb  (2.58MB)
    .ipynb_checkpoints/
        data-checkpoint.ipynb  (72B)
        News_prediction-checkpoint.ipynb  (1.33MB)
    dataload.py  (5.18KB)
    data/
        test.csv  (6.98MB)
        split_dataset  (38.32MB)
        test  (9.56MB)
        train.csv  (27.94MB)
        train  (38.32MB)
        split_testset  (9.56MB)
    Log.txt  (0B)
    NaiveBayes.ipynb  (14.87KB)
