JDRecommendation: JD.com Product Recommendation System - Data Crawler - Source Code

Uploader: 42117224 | Uploaded: 2021-09-27 09:36:35 | File size: 52KB | File type: ZIP
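JD.com Product Recommendation System: Data Crawler Component

This project crawls product information, review information, and user data from the food section of the JD.com online store. The data is stored in a MySQL database.

The core crawler module is built on a crawler framework (not named in this description) and mainly implements the JDPageProcessor class, which inherits from PageProcessor.

Page content is extracted in two ways: XPath and CSS selectors. For example, the user link on a product page is extracted as follows:

String aHref = html.xpath("div[@class='item']/div[@class='user']/div[@class='u-icon']/a/@href").toString();

This uses XPath extraction. Starting from the div whose class is item, the expression descends into the div whose class is user, then into the div whose class is u-icon, and takes the href attribute of the hyperlink (the a element) inside it.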
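To make the XPath step easier to follow, here is a minimal, self-contained sketch that evaluates the same expression with the JDK's built-in `javax.xml.xpath` API. It is not the project's code: the project calls `html.xpath(...)` on its crawler framework's page object, whereas this sketch swaps in the standard library so it can run without dependencies. The class name `UserLinkXPathSketch`, the helper `extractUserLink`, and the sample markup and URL are all hypothetical. The JDK parser also requires well-formed markup, which real product pages may not be.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class UserLinkXPathSketch {
    // The expression quoted in the project description, unchanged.
    static final String USER_LINK_XPATH =
        "div[@class='item']/div[@class='user']/div[@class='u-icon']/a/@href";

    // Hypothetical helper: parses a well-formed fragment and returns the href
    // of the first matching link, or an empty string when nothing matches.
    static String extractUserLink(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The path is relative, so it is evaluated from the document node:
        // the outermost element must be the div with class "item".
        return xpath.evaluate(USER_LINK_XPATH, doc);
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment mirroring the item > user > u-icon > a nesting.
        String sample =
            "<div class='item'><div class='user'><div class='u-icon'>"
            + "<a href='https://example.invalid/user/1'>avatar</a>"
            + "</div></div></div>";
        System.out.println(extractUserLink(sample));
    }
}
```

Each `[@class='...']` predicate matches the class attribute exactly, so an element carrying several classes (for example `class="item first"`) would not match. This is one reason a CSS selector can be the more convenient of the two extraction modes for such elements.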


Resource details

JDRecommendation: JD.com Product Recommendation System - Data Crawler - Source Code (32 files, 52KB)
  JDRecommendation-master/
    .project (367B)
    .gitattributes (483B)
    src/
      com/
        jdpage/
          spider/
            processor/
              JDPageProcessor.java (11.85KB)
            taskreloader/
              JDTaskReloader.java (7.26KB)
            pagesaver/
              JDPageSaver.java (3.51KB)
            dbhelper/
              MysqlHelper.java (8.92KB)
      log4j.properties (2.17KB)
    .idea/
      misc.xml (5.16KB)
      compiler.xml (711B)
      uiDesigner.xml (8.59KB)
      workspace.xml (72.01KB)
      dataSources.xml (778B)
      .name (16B)
      encodings.xml (166B)
      dataSources.ids (649B)
      modules.xml (274B)
      scopes/
        scope_settings.xml (139B)
      copyright/
        profiles_settings.xml (74B)
      vcs.xml (182B)
    README.md (704B)
    JDRecommendation.iml (9.48KB)
    .classpath (2.04KB)
    .gitignore (485B)
    bin/
      com/
        jdpage/
          spider/
            processor/
              JDPageProcessor.class (10.78KB)
              JDPageProcessor$State.class (1.21KB)
            JDTaskReloader.class (6.59KB)
            JDTaskReloader$3.class (1.04KB)
            pagesaver/
              JDPageSaver.class (3.84KB)
            JDTaskReloader$2.class (851B)
            JDTaskReloader$1.class (843B)
            dbhelper/
              MysqlHelper.class (6.65KB)
      log4j.properties (2.17KB)
