MSCOCO dataset; both the 2014 and 2015 releases are included.
2019-12-21 20:21:29 502B MSCOCO dataset download link
1
The dataset used in the famous Netflix Prize, the million-dollar recommendation competition. Because the competition has closed, the data can no longer be downloaded from the official Netflix site.
Netflix provided a training data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies. Each training rating is a quadruplet of the form <user, movie, date of grade, grade>. The user and movie fields are integer IDs, while grades are from 1 to 5 (integral) stars.[3] The qualifying data set contains over 2,817,131 triplets of the form <user, movie, date of grade>, with grades known only to the jury.
A participating team's algorithm must predict grades on the entire qualifying set, but the team is only informed of the score for half of the data, the quiz set of 1,408,342 ratings. The other half is the test set of 1,408,789 ratings, and performance on this is used by the jury to determine potential prize winners. Only the judges know which ratings are in the quiz set and which are in the test set; this arrangement is intended to make it difficult to hill climb on the test set.
Submitted predictions are scored against the true grades in terms of root mean squared error (RMSE), and the goal is to reduce this error as much as possible. Note that while the actual grades are integers in the range 1 to 5, submitted predictions need not be. Netflix also identified a probe subset of 1,408,395 ratings within the training data set. The probe, quiz, and test data sets were chosen to have similar statistical properties.
In summary, the data used in the Netflix Prize looks as follows:
- Training set (99,072,112 ratings not including the probe set, 100,480,507 including the probe set)
  - Probe set (1,408,395 ratings)
- Qualifying set (2,817,131 ratings) consisting of:
  - Test set (1,408,789 ratings), used to determine winners
  - Quiz set (1,408,342 ratings), used to calculate leaderboard scores
For each movie, title and year of release are provided in a separate dataset. No information at all is provided about users. In order to protect the privacy of customers, "some of the rating data for some customers in the training and qualifyin
2019-12-21 20:17:35 27KB dataset Netflix
1
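The RMSE scoring described in the Netflix entry above can be sketched in a few lines. This is only an illustration of the metric, not the competition's actual scoring code; the function name `rmse` and the sample grades are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must not be empty")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: true grades are integers from 1 to 5,
# while predictions are allowed to be fractional.
true_grades = [4, 3, 5, 1]
predictions = [3.5, 3.0, 4.0, 2.0]

score = rmse(predictions, true_grades)
print(f"RMSE: {score:.4f}")
```

Because the error is squared before averaging, a single prediction that is off by two stars costs as much as four predictions that are each off by one star.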
A dataset that can be used for learning data mining and machine learning algorithms.
2019-12-21 20:15:52 127KB dataset
1
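The MNIST handwritten digit dataset; downloads from the original site are extremely slow. The dataset contains a labeled training set of 60,000 examples and a test set of 10,000 examples.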
2019-12-21 20:11:50 11.06MB MNIST dataset
1
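As a rough sketch of how such files might be read: the original MNIST distribution stores images and labels in the big-endian IDX format, and the example below assumes this archive does the same (that is an assumption, since the listing does not state the file format). The helper `parse_idx` is hypothetical, and the bytes it parses are synthetic, built in memory rather than taken from the dataset.

```python
import struct


def parse_idx(data):
    """Parse an unsigned-byte IDX byte string into (dims, flat list of values)."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte),
    # and the number of dimensions, followed by one big-endian
    # 32-bit size per dimension.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    expected = 1
    for size in dims:
        expected *= size
    payload = data[4 + 4 * ndim:]
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, list(payload)


# Synthetic label file: 1 dimension, 3 labels.
label_bytes = struct.pack(">HBBI", 0, 0x08, 1, 3) + bytes([5, 0, 4])
label_dims, labels = parse_idx(label_bytes)

# Synthetic image file: 1 image of 2x2 pixels (real MNIST images are 28x28).
image_bytes = struct.pack(">HBBIII", 0, 0x08, 3, 1, 2, 2) + bytes([0, 255, 128, 64])
image_dims, pixels = parse_idx(image_bytes)

print(label_dims, labels)
print(image_dims, pixels)
```

For a real training label file, the first dimension would be expected to read 60,000, matching the count given in the description; gzip-compressed files would need to be decompressed first.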
MSCOCO dataset download link
2019-12-21 20:09:00 502B MSCOCO dataset
1
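Commonly used complex network datasets, including karate, dolphins, football, and others. Some of the datasets also come with the corresponding papers. The datasets are sorted into undirected and directed graphs, and into weighted and unweighted graphs, for ease of use.
2019-12-21 20:04:37 4.94MB complex-networks dataset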
1
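The undirected/directed and weighted/unweighted distinction in the entry above can be illustrated with a minimal adjacency-map sketch. The helper `build_adjacency` and the toy edge list are hypothetical; they are not taken from the karate, dolphins, or football files, whose on-disk format the listing does not specify.

```python
from collections import defaultdict


def build_adjacency(edges, directed=False):
    """Build {node: {neighbor: weight}} from (u, v) or (u, v, weight) tuples."""
    adjacency = defaultdict(dict)
    for edge in edges:
        u, v = edge[0], edge[1]
        # Unweighted edges are given a default weight of 1.
        weight = edge[2] if len(edge) > 2 else 1
        adjacency[u][v] = weight
        if directed:
            # Make sure the target node exists even if it has no outgoing edges.
            adjacency.setdefault(v, {})
        else:
            # An undirected edge is stored in both directions.
            adjacency[v][u] = weight
    return dict(adjacency)


# Toy edge list (illustrative only).
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

undirected = build_adjacency(toy_edges)
directed = build_adjacency(toy_edges, directed=True)
weighted = build_adjacency([("a", "b", 2.5)], directed=True)

# Degree of each node in the undirected graph.
degrees = {node: len(neighbors) for node, neighbors in undirected.items()}
print(degrees)
```

Storing weights in a nested dictionary keeps one code path for all four graph variants: an unweighted graph is simply one where every weight is 1.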
A TensorFlow implementation of SegNet, together with a download of the CamVid dataset (including the training, val, and test splits).
2019-12-21 20:02:29 177.83MB tensorflow segnet image segmentation
1
Description: A pre-classified dataset containing 11,000 web pages from 11 different categories. Although this dataset was designed for unsupervised clustering experiments, it can be used for any type of web page machine-learning technique. For more information see BankSearch Dataset Page. Submitted by m.p.sinka@rdg.ac.uk. Keywords: pre-classified dataset, web page, category, clustering experiment. Data format: TEXT
2019-12-21 20:00:54 11.36MB bank-data search-data dataset
1
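The Iris dataset is a famous statistical dataset that is widely used by machine learning researchers. It contains 150 instances, 4 biological features, and the iris species for each instance (setosa, versicolor, virginica).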
2019-12-21 19:57:14 5KB dataset
1
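A small sketch of how a file with this shape (4 numeric features plus a species label per row) might be loaded and summarized. It assumes a comma-separated layout like the common UCI `iris.data` file, which the listing does not confirm; the helpers `load_iris` and `class_means` are hypothetical, and the six sample rows are illustrative stand-ins for the full 150.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows: sepal length, sepal width, petal length, petal width, species.
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""


def load_iris(text):
    """Return a list of (features, species) pairs from CSV text."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        *features, species = record
        rows.append(([float(value) for value in features], species))
    return rows


def class_means(rows):
    """Return the per-species mean of each feature."""
    sums = {}
    counts = defaultdict(int)
    for features, species in rows:
        if species not in sums:
            sums[species] = [0.0] * len(features)
        for index, value in enumerate(features):
            sums[species][index] += value
        counts[species] += 1
    return {s: [total / counts[s] for total in totals] for s, totals in sums.items()}


rows = load_iris(SAMPLE)
means = class_means(rows)
for species, values in means.items():
    print(species, [round(v, 2) for v in values])
```

Per-class feature means are a common first look at this dataset, since petal measurements separate the species far more clearly than sepal measurements do.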
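The MNIST dataset comes from the National Institute of Standards and Technology (NIST). The training set consists of digits handwritten by 250 different people, of whom 50% are high school students and 50% are employees of the Census Bureau. The test set contains handwritten digit data in the same proportions.
2019-12-21 19:46:52 13.19MB mnist neural-network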
1