Implementing the TF-IDF Algorithm in Python

Uploader: 41065669 | Uploaded: 2023-02-09 15:16:59 | File size: 7.31MB | File type: RAR
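The archive includes functions that read every character of a txt file, count how often each one occurs, and compute its weight. The corpus consists of 66 papers totaling roughly 100,000 characters. tfidf.py contains the functions that turn an article into a vector and compute the angle between two vectors, which can be used for article classification and for plagiarism checking of papers. Because the corpus is very small, the results may not be very accurate.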
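The pipeline described above (character counts, weights, vectors, angle) can be sketched roughly as follows. This is not the code from the archive: the function names, the choice of `log(N / df)` as the IDF formula, and the handling of whitespace are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Relative frequency of each non-whitespace character in one text."""
    chars = [ch for ch in text if not ch.isspace()]
    total = len(chars)
    if total == 0:
        return {}
    return {ch: n / total for ch, n in Counter(chars).items()}


def inverse_document_frequencies(docs):
    """idf(t) = log(N / df(t)); df(t) = number of documents containing t.

    Assumed formula: other TF-IDF variants add smoothing terms.
    """
    n_docs = len(docs)
    df = Counter()
    for text in docs:
        df.update({ch for ch in text if not ch.isspace()})
    return {ch: math.log(n_docs / n) for ch, n in df.items()}


def tfidf_vector(text, idf, vocabulary):
    """TF-IDF weights of `text`, ordered by `vocabulary`."""
    tf = term_frequencies(text)
    return [tf.get(ch, 0.0) * idf.get(ch, 0.0) for ch in vocabulary]


def angle_between(u, v):
    """Angle in radians between two vectors, from the cosine formula."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return math.pi / 2  # assumption: treat an all-zero vector as orthogonal
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding error
    return math.acos(cosine)


# Tiny made-up corpus, standing in for the 66 papers.
docs = ["ab ab", "ab", "cd"]
idf = inverse_document_frequencies(docs)
vocabulary = sorted(idf)
vectors = [tfidf_vector(d, idf, vocabulary) for d in docs]
print(angle_between(vectors[0], vectors[1]))
print(angle_between(vectors[0], vectors[2]))
```

A smaller angle means two texts weight the same characters similarly, which is the basis for both the classification and the plagiarism-checking use. With this IDF formula, a character that appears in every document gets weight 0, so in a small corpus many common characters contribute nothing to the comparison.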


Resource details

Archive contents (1551 files, 7.31MB): Implementing the TF-IDF Algorithm in Python

59.txt (34.60KB), 22.txt (49.46KB), 19.txt (34.19KB), 60.txt (16.51KB), 9.txt (20.41KB), 55.txt (37.22KB), 21.txt (17.12KB), 50.txt (29.42KB), 3.txt (77.26KB), 32.txt (54.33KB), 66.txt (32.29KB), 45.txt (61.09KB), 44.txt (17.56KB), 37.txt (43.29KB), 18.txt (22.82KB), 49.txt (19.30KB), 28.txt (31.26KB), 33.txt (27.97KB), 54.txt (13.12KB), 30.txt (15.78KB), 14.txt (36.57KB), 63.txt (18.37KB), 36.txt (59.77KB), 57.txt (38.94KB), 12.txt (50.82KB), 67.txt (0B), 10.txt (24.62KB), 7.txt (49.55KB), 39.txt (50.09KB), 34.txt (9.08KB), 2.txt (30.14KB), 23.txt (48.69KB), 13.txt (33.07KB), 6.txt (56.20KB), 17.txt (21.40KB), 11.txt (36.70KB), 35.txt (7.38KB), 51.txt (53.88KB), 4.txt (31.17KB), 43.txt (31.08KB), 27.txt (27.05KB), 62.txt (77.87KB), 46.txt (55.60KB), 5.txt (29.85KB), 1.txt (17.39KB), 26.txt (30.89KB), 64.txt (12.48KB), 25.txt (14.64KB), 48.txt (20.92KB), 24.txt (52.13KB), 16.txt (52.22KB), 47.txt (106.97KB), 31.txt (48.17KB), 42.txt (28.60KB), 65.txt (30.44KB), 40.txt (38.45KB), 15.txt (53.69KB), 61.txt (32.27KB), 8.txt (33.41KB), 52.txt (42.06KB), 58.txt (54.51KB), 56.txt (43.06KB), 29.txt (52.43KB), 53.txt (49.15KB), 38.txt (70.34KB), 41.txt (11.42KB), 20.txt (3.73KB)

dictionary.txt (7.43KB), read_dictionary.cpython-38.pyc (591B), txt_turnto_vector.cpython-38.pyc (1.70KB), txt_turnto_vector.py (2.95KB)

INSTALLER (4B), entry_points.txt (2.68KB), top_level.txt (41B), LICENSE (1.03KB), METADATA (6.07KB), WHEEL (92B), RECORD (36.84KB), gui-64.exe (73.50KB), __init__.cpython-38.pyc (2.93KB), __init__.py (2.45KB), script.tmpl (138B), build_meta.py (18.63KB), _entry_points.py (1.93KB), expand.py (15.92KB), _apply_pyprojecttoml.py (13.08KB), setupcfg.cpython-38.pyc (21.38KB), pyprojecttoml.cpython-38.pyc (15.73KB), __init__.cpython-38.pyc (1.41KB), _apply_pyprojecttoml.cpython-38.pyc (13.26KB), expand.cpython-38.pyc (17.15KB), __init__.cpython-38.pyc (1.46KB), error_reporting.cpython-38.pyc (11.41KB), extra_validations.cpython-38.pyc (1.40KB), fastjsonschema_validations.cpython-38.pyc (68.34KB), formats.cpython-38.pyc (8.37KB), fastjsonschema_exceptions.cpython-38.pyc (2.41KB), formats.py (8.44KB), __init__.py (1.01KB), fastjsonschema_exceptions.py (1.57KB), ......

Too many files; not all are shown.
