Sina Weibo crawler: scraping Sina Weibo data with Python

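Uploader: 51986063 | Upload time: 2023-03-04 11:00:59 | File size: 111KB | File type: ZIP

This program can crawl the data of one or more Sina Weibo users in succession (for example Hu Ge (胡歌), Dilireba (迪丽热巴), and Guo Biting (郭碧婷)) and write the results to files or a database. The output covers nearly all of a user's Weibo data, in two broad categories: user information and post information. There are too many fields to list here; see the retrieved fields for details. If only user information is needed, a setting restricts the crawl to user information. The program requires a cookie to gain access to Weibo; instructions for obtaining the cookie are given later. If you prefer not to set a cookie, a cookie-free version offers similar functionality.

Crawl results can be written to files and databases. The supported output types are:
  - txt file (default)
  - csv file (default)
  - json file (optional)
  - MySQL database (optional)
  - MongoDB database (optional)
  - SQLite database (optional)

Images and videos in posts can also be downloaded. The downloadable files are:
  - original images in original posts (optional)
  - original images in reposts (optional)
  - videos in original posts (optional)
  - videos in reposts (optional)
  - videos in Live Photos in original posts (cookie-free version only)
  - videos in Live Photos in reposts (cookie-free version only)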
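The idea of routing the same crawl results to several output types can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual writer code: the record fields in `FIELDS`, the table name `weibo`, the file name `weibo.*`, and the `save` helper are all assumptions made for the example.

```python
import csv
import json
import sqlite3
from pathlib import Path

# Hypothetical record layout; the real tool's field names may differ.
FIELDS = ["id", "user_id", "content", "publish_time"]


def write_csv(records, path):
    """Write records to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)


def write_json(records, path):
    """Write records to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def write_sqlite(records, path):
    """Insert records into a SQLite table, replacing rows with the same id."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS weibo ("
                "id TEXT PRIMARY KEY, user_id TEXT, "
                "content TEXT, publish_time TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO weibo "
                "VALUES (:id, :user_id, :content, :publish_time)",
                records,
            )
    finally:
        conn.close()


# Map each output type to its writer function.
WRITERS = {"csv": write_csv, "json": write_json, "sqlite": write_sqlite}


def save(records, out_dir, modes):
    """Write records once per requested mode and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for mode in modes:
        ext = "db" if mode == "sqlite" else mode
        path = out_dir / f"weibo.{ext}"
        WRITERS[mode](records, str(path))
        paths.append(path)
    return paths
```

Calling `save(records, "output", ["csv", "json"])` would write the same records to both file types; adding another output type only requires registering one more function in `WRITERS`.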


Resource details

Sina Weibo crawler: scraping Sina Weibo data with Python (77 files, 111KB)

weiboSpider-master/
  setup.py  (821B)
  .github/
    ISSUE_TEMPLATE/
      failed.md  (863B)
      feature-request.md  (282B)
      bug-report.md  (1.26KB)
      other.md  (97B)
    workflows/
      python-app.yml  (1.13KB)
    stale.yml  (776B)
  tests/
    __init__.py  (0B)
    test_parser/
      __init__.py  (0B)
      util.py  (399B)
      test_mblog_picAll_parser.py  (610B)
      test_index_parser.py  (559B)
      test_comment_parser.py  (1.99KB)
      test_info_parser.py  (415B)
      test_album_parser.py  (754B)
      test_photo_parser.py  (436B)
      test_page_parser.py  (1.50KB)
    testdata/
      a4437630f3bdfa2757bae1595186ac063fe5ec25cf2f98116ece83cb.html  (19.92KB)
      ca5f2a555e8d62f728c66fa90afb2d54d19f8c898e164204a61bdf03.html  (5.82KB)
      63a98849ec82b2c87ec55bca03cbf5988f7eac233a23d86b4fdf5ffd.html  (9.05KB)
      e4d541ecb02253c14abc1d52605fc00d91279df9ac4c1465c85b91b3.html  (5.59KB)
      url_map.json  (1.36KB)
      2f62165fa3ca1e85e0d398d385c377a068b76eb95765f7020ffffd3e.html  (19.88KB)
      b541fd1751117498b6d6f40d3321686ddf871651237c4ac854a5c3eb.html  (5.71KB)
      d486235d4a17dd0accb0f2cc77b3648abfa03580b9e0cdb61f1e618f.html  (23.72KB)
      4d5ed0a3ebd0303cb45edd544dbc0ab5e86d43e103405f0c60515884.html  (14.15KB)
      4957814af5a123b82e974b5537dea736dfb34e48d8835203a45d2e67.html  (19.79KB)
      e97222acd5bc7d8d1bfbd3f352f8cad3e36fdd19e40b69e1c33fb3c3.html  (3.61KB)
      76233b3f90394581aac6f19cfa5d674a610e8b442b1f83de7673ab49.html  (3.63KB)
  CONTRIBUTING.md  (3.39KB)
  docs/
    userid.md  (1.85KB)
    contributors.md  (2.21KB)
    cookie.md  (797B)
    automation.md  (4.25KB)
    settings.md  (10.78KB)
    example.md  (6.68KB)
    academic.md  (837B)
    FAQ.md  (4.07KB)
  weibo_spider/
    __init__.py  (0B)
    logging.conf  (941B)
    downloader/
      __init__.py  (352B)
      avatar_picture_downloader.py  (724B)
      retweet_picture_downloader.py  (290B)
      video_downloader.py  (599B)
      downloader.py  (2.27KB)
      origin_picture_downloader.py  (290B)
      img_downloader.py  (1.34KB)
    datetime_util.py  (259B)
    user_id_list.txt  (118B)
    user.py  (757B)
    weibo.py  (989B)
    config_util.py  (6.89KB)
    __main__.py  (158B)
    writer/
      __init__.py  (357B)
      mongo_writer.py  (2.37KB)
      kafka_writer.py  (1.28KB)
      mysql_writer.py  (5.41KB)
      json_writer.py  (1.89KB)
      csv_writer.py  (1.90KB)
      sqlite_writer.py  (3.80KB)
      writer.py  (453B)
      txt_writer.py  (2.12KB)
    parser/
      comment_parser.py  (2.18KB)
      __init__.py  (213B)
      info_parser.py  (2.21KB)
      util.py  (3.73KB)
      page_parser.py  (15.79KB)
      mblog_picAll_parser.py  (389B)
      album_parser.py  (621B)
      photo_parser.py  (955B)
      parser.py  (126B)
      index_parser.py  (2.07KB)
    config_sample.json  (912B)
    spider.py  (16.57KB)
  requirements.txt  (57B)
  .gitignore  (96B)
  README.md  (16.99KB)
