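The AG News dataset is drawn from AG's corpus of more than 1 million news articles. It is built from 496,835 of those articles, gathered from more than 2,000 news sources, and uses only the title and description fields. Each class has 30,000 training samples and 1,900 test samples.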
The dataset was released by Cornell University in 2004. Related papers include "Ranking a stream of news" (in Proceedings of the 14th International World Wide Web Conference) and "The anatomy of a news search engine".
ag_news_csv: AG is a collection of more than 1 million news articles. The dataset contains 120,000 training samples and 7,600 test samples in total; each class contains 30,000 training samples and 1,900 test samples.
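The split sizes above imply four classes (120,000 / 30,000 = 4, and 4 × 1,900 = 7,600). The sketch below checks that arithmetic and shows one way to count samples per class. It assumes, without confirmation from this description, that each CSV row is headerless and laid out as class index, title, description; the helper name `count_per_class` and the in-memory sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_per_class(csv_file):
    """Count rows per class index.

    Assumes a headerless CSV whose first column is the class index,
    followed by title and description (an assumed layout).
    """
    counts = Counter()
    for row in csv.reader(csv_file):
        if not row:  # skip blank lines
            continue
        counts[row[0]] += 1
    return counts


# Tiny in-memory stand-in for a real file; these rows are invented.
sample = io.StringIO(
    '"3","Title A","Description A"\n'
    '"4","Title B","Description B"\n'
    '"3","Title C","Description, with a comma"\n'
)
sample_counts = count_per_class(sample)

# Consistency check on the figures quoted in the description.
per_class_train, per_class_test = 30_000, 1_900
total_train, total_test = 120_000, 7_600
num_classes = total_train // per_class_train
assert num_classes * per_class_test == total_test
```

To run the same count on a local copy, pass an open file handle instead, for example `open("ag_news_csv/train.csv", newline="", encoding="utf-8")`. That path is a guess at the layout. If the file matches the description, each class should come out at 30,000 rows.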