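The Iris classification dataset for machine learning classification, in .csv format. The data is complete: 150 samples in total, with the feature columns Sepallength, SepalWidth, PetalLength, PetalWidth.
2021-10-10 14:22:46 894B machine-learning classification iris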
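The code has been uploaded to GitHub (https://github.com/danan0755/Bert_Classifier). The data comes from cnews and can be downloaded from Baidu Cloud. Link: https://pan.baidu.com/s/1LzTidW_LrdYMokN—Nyag Extraction code: zejw. The data format is as follows: Download address for the Chinese BERT pre-trained model: Link: https://pan.baidu.com/s/14JcQXIBSaWyY7bRWdJW7yg Extraction code: mvtl. Copy run_classifier.py and name the copy run_cnews_cls.py. Add a custom Processor: class MyProcessor(D
2021-09-29 12:54:30 185KB classification multi-class multi-class-task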
496,835 news articles from more than 2,000 news sources across the 4 largest categories of the AG news corpus; the dataset uses only the title and description fields. Each category has 30,000 training samples and 1,900 test samples. README: AG's News Topic Classification Dataset Version 3, Updated 09/09/2015 ORIGIN AG is a collection of more than 1 million news articles. News articles have been gathered from more than 2000 news sources by ComeToMyHead in more than 1 year of activity. ComeToMyHead is an academic news search engine which has been running since July, 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc), information retrieval (ranking, search, etc), xml, data compression, data streaming, and any other non-commercial activity. For more information, please refer to the link http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html . The AG's news topic classification dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the dataset above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015). DESCRIPTION The AG's news topic classification dataset is constructed by choosing 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples. The total number of training samples is 120,000 and testing 7,600. The file classes.txt contains a list of classes corresponding to each label. The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 4), title and description. The title and description are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed with an "n" character, that is "\n".
2021-07-29 11:51:08 11.24MB classification-task AGnews news-dataset
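The train.csv / test.csv layout described in the AG News README above (three columns, doubled double quotes, newlines stored as a literal backslash followed by "n") can probably be read with Python's standard csv module. The following is a minimal sketch, not part of the dataset package: the function name read_ag_news and the two sample rows are made up for illustration.

```python
import csv
import io


def read_ag_news(handle):
    """Yield (class_index, title, description) tuples from an AG News CSV stream.

    Assumes the layout described in the README: three columns, fields quoted
    with double quotes, internal quotes doubled, newlines stored as the two
    characters backslash + "n".
    """
    for row in csv.reader(handle, quotechar='"', doublequote=True):
        if not row:
            continue  # skip blank lines
        label, title, description = row
        # Turn the escaped two-character sequence back into a real newline.
        yield (
            int(label),
            title.replace("\\n", "\n"),
            description.replace("\\n", "\n"),
        )


# Made-up sample rows (not taken from the dataset) in the described format.
sample = io.StringIO(
    '"3","Stocks ""rally"" again","First line\\nSecond line"\n'
    '"4","Chip news","A plain description"\n'
)
rows = list(read_ag_news(sample))
```

For the real files, the same function could be given a handle such as open("train.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption, since the README does not state one.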
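Python text data analysis: news classification task. [Packages] jieba, pandas, wordcloud, matplotlib, sklearn. [Concepts] IDF: inverse document frequency. TF-IDF = term frequency (TF) × inverse document frequency (IDF). Term frequency (TF) = number of times the term appears in the article / number of occurrences of the most frequent term in the article. Inverse document frequency (IDF) = log(total number of documents in the corpus / (number of documents containing the term + 1)). [Steps] 1. Remove stop words 2. TF-IDF keyword extraction 3. LDA modeling 4. Bayes
2021-07-08 15:02:40 9.2MB news-classification-task Python data-analysis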
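The TF and IDF definitions in the entry above can be written out directly. This is a minimal standard-library sketch of those two formulas only, not the code in the package: the function name tf_idf and the toy documents are hypothetical, and the documents are assumed to be already tokenized (the package lists jieba for that step).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    TF  = count of the term / count of the most frequent term in the document
    IDF = log(number of documents / (documents containing the term + 1))
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        max_count = max(counts.values()) if counts else 1
        scores.append({
            term: (count / max_count) * math.log(n_docs / (doc_freq[term] + 1))
            for term, count in counts.items()
        })
    return scores


# Toy, pre-tokenized documents invented for illustration.
docs = [
    ["stock", "market", "stock", "rise"],
    ["team", "match", "win"],
    ["market", "team", "report"],
    ["weather", "rain"],
]
scores = tf_idf(docs)
top_keyword = max(scores[0], key=scores[0].get)
```

Because of the +1 in the denominator, a term that occurs in every document gets a slightly negative IDF under this definition; library implementations such as sklearn use a different smoothing, so their scores will not match these exactly.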
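Machine learning course final project - image classification task based on deep neural networks.7z
2021-06-07 14:00:41 10KB machine-learning-course-final-project-based-on-deep
For computer vision tasks such as image classification. Once downloaded, the dataset can be loaded locally, so it does not have to be fetched over the network at run time.
2021-04-12 16:01:51 323.83MB image-classification computer-vision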
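A simple hands-on Python reimplementation of the skip-gram method from word2vec, with a brief comparison of the resulting embedding vectors against TF-IDF features and the word2vec implementation provided by gensim. See the author's personal blog for details.
2021-04-06 16:51:40 35.69MB algorithm word2vec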
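The package's own code is not shown in this listing. As a generic illustration of the first step of skip-gram, the sketch below builds the (center word, context word) training pairs from a token sequence; the function name skipgram_pairs and the window parameter are hypothetical and are not taken from the author's implementation.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbors.

    Each token is paired with up to `window` tokens on either side of it.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["the", "quick", "brown", "fox"], window=1)
```

A skip-gram model is then trained to predict the context word from the center word in each pair; that training step is omitted here.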
R implementations of decision tree, random forest, naive Bayes, support vector machine, KNN, and BP neural network
2021-03-28 19:02:55 8KB R classification-algorithms
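A 3-layer neural network built in Python to classify the fashionMnist dataset, reaching 90% accuracy; the training set is included.
2021-02-02 02:04:10 84.07MB deep-learning neural-networks classification-algorithms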
A simple MLP-based classification task. Platform: VS2015. Language: C++. Fixed-size input.
2019-12-21 19:26:02 2.85MB MLP