Attention-Based-Siamese-Text-CNN-for-Stance-Detection: final project for the Natural Language Processing course (DATA130006) at Fudan University

Uploader: 42127020 | Uploaded: 2022-04-08 15:15:07 | File size: 14.59MB | File type: ZIP
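Fake News Challenge

This is the final project for the NLP course. Our work consists of the following parts: data preprocessing; conventional machine learning methods; a Seq2seq attention model; TextCNN and a Siamese network; and other material (for example, related work from the competition and future work).

1. Data preprocessing
We provide several data preprocessing methods: BoW (bag of words), TF-IDF, word2vec, and doc2vec. Each py file generates x_1 (the document representation), x_2 (the headline representation), and y (the label). These outputs can be exported as data files and used in the models.

2. Conventional machine learning
We provide a py file that classifies instances with conventional machine learning methods (e.g., SVM, random forest). The code is implemented with sklearn.
Requirements: sklearn, numpy

3. Seq2seq attention model
This code is largely based on an attention-based sequence-to-sequence model with a pretrained model ( ). To generate text summaries with the code, run: python3 run_summarization.py -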
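As an illustration of the data preprocessing step (section 1), here is a minimal sketch of how TF-IDF features could yield the x_1, x_2, and y outputs mentioned there. It is not the project's own code: the toy texts, the helper names (tokenize, fit_idf, tfidf_vector), and the label encoding in LABELS are all assumptions made for this example, and it uses only the Python standard library so that it runs without extra dependencies.

```python
import math
from collections import Counter

# Hypothetical toy data; the real project reads the Fake News Challenge files.
bodies = [
    "the company confirmed the report on monday",
    "officials denied the claim and called it a hoax",
]
headlines = [
    "company confirms report",
    "weather forecast for the weekend",
]
stances = ["agree", "unrelated"]

# Assumed label encoding (the four FNC-1 stance classes); the repo may differ.
LABELS = {"agree": 0, "disagree": 1, "discuss": 2, "unrelated": 3}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def fit_idf(texts):
    """Build a sorted vocabulary and smoothed IDF weights from a list of texts."""
    n_docs = len(texts)
    doc_freq = Counter()
    for text in texts:
        doc_freq.update(set(tokenize(text)))
    vocab = sorted(doc_freq)
    # Smoothed IDF: log((1 + N) / (1 + df)) + 1, so every weight is positive.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(text, vocab, idf):
    """Return an L2-normalised TF-IDF vector for one text over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


# Fit on bodies and headlines together so both share one feature space.
vocab, idf = fit_idf(bodies + headlines)
x_1 = [tfidf_vector(b, vocab, idf) for b in bodies]     # document representations
x_2 = [tfidf_vector(h, vocab, idf) for h in headlines]  # headline representations
y = [LABELS[s] for s in stances]                        # integer labels
```

For the conventional machine learning step (section 2), lists like these could be converted with numpy.array, concatenated per instance, and passed to an sklearn classifier such as sklearn.svm.SVC or sklearn.ensemble.RandomForestClassifier. How the project itself combines x_1 and x_2 is not stated in this description.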

Resource details

Attention-Based-Siamese-Text-CNN-for-Stance-Detection (50 files, 14.59MB)

Attention-Based-Siamese-Text-CNN-for-Stance-Detection-master/
    conventional_ML/
        naive_ml.py  10.59KB
    LICENSE  1.04KB
    TextCNN_Siamese/
        read_data.py  564B
        util.py  1.11KB
        main_v5_FineGrain.py  8.43KB
        TokenizeSentences.py  182B
        siameseTextCNN.py  5.07KB
        main_v5.py  8.04KB
        cnn_loaddata.py  2.75KB
        siameseTextCNN_v3.py  5.09KB
        main_v5_FullDoc.py  8.46KB
        cnn_loaddata_v2.py  3.32KB
        siameseTextCNN_v2.py  5.38KB
        main_v5_FullDoc_FineGrain.py  8.46KB
        fnc_data/
            train_1.txt  216.72KB
    UCL_Repeat/
        util.py  11.14KB
        model/
            model.checkpoint.index  533B
            model.checkpoint.data-00000-of-00001  11.45MB
            model.checkpoint.meta  52.43KB
            checkpoint  89B
        train_stances_origin.csv  4.06MB
        data_handle.py  1.87KB
        train_bodies_origin.csv  3.58MB
        eval.py  964B
        README.md  954B
        pred.py  4.05KB
    final_pj.pdf  810.17KB
    preprocessing/
        doc2vec.py  69B
        read_data.py  564B
        write_doc2vec.py  674B
        TokenizeSentences.py  182B
        doc2vec_readdata.py  1.78KB
        data_preprocessing.py  2.73KB
        write_txt.py  1.36KB
        untitled0.py  2.18KB
        data_pre_v2.py  2.56KB
        untitled1.py  2.75KB
    readme.md  2.17KB
    attention_based_seq2seq/
        util.py  1.77KB
        inspect_checkpoint.py  1.28KB
        LICENSE.txt  11.23KB
        batcher.py  17.72KB
        data.py  11.00KB
        decode.py  11.04KB
        model.py  24.51KB
        __init__.py  0B
        beam_search.py  7.57KB
        README.md  8.09KB
        run_summarization.py  16.04KB
        attention_decoder.py  11.87KB
