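Python 3 crawler: batch-downloading images and saving them locally
While reading the news I stumbled on an image site, so naturally I had to crawl it.
Site: https://www.0xu.cn/
It is easy to see that the path qcmn corresponds to the "清纯美女" category.
Right-click an image and choose Inspect to see its address.
Opening that address in the browser returns the image.
Getting started
Step 1: request the page, analyse the response, and extract the image URLs.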
Inspecting the page shows that the first gallery under qcmn has the path 3807.
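In the developer tools, open the Network tab and search for the corresponding request.
The response turns out to be a block of JSON.
1. First, write a function that fetches the URLs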
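import requests
import json
import re

page = 1
path = 'qcmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    # Request the list endpoint and print the raw response body
    html = requests.get(url)
    html = html.text
    print(html)

get_html(url)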
Response:
{"response":{"status":1,"message":"请求成功","data":{"page":1,"page_size":10,"totalPage":56,"list":[{"id":3807,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。","summary":"情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。","state":0,"browse":17,"created":"2020-12-22T03:03:08+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9171,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg"},{"id":9172,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg"}],"tags":null},{"id":3806,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:在某个时刻(存照) ","summary":"人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。","state":0,"browse":9,"created":"2020-12-22T03:02:23+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9170,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg"}],"tags":null},{"id":3805,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:他知道你舍不得离开才肆无忌惮的伤害。","summary":"这一生,这一世,因为不再有你,所以爱情轰然老去。","state":0,"browse":5,"created":"2020-12-22T03:01:50+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_tim
e":1516856296,"flag":0,"description":""},"pictures":[{"id":9169,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg"}],"tags":null},{"id":3804,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了","summary":"一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。","state":0,"browse":6,"created":"2020-12-22T03:01:22+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9168,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg"}],"tags":null},{"id":3803,"uid":"385c1458d707bd373cc47048f7f5be22","title":"林若:怀念半年前傻傻憨憨 快快乐乐的自己 ","summary":"有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。","state":0,"browse":7,"created":"2020-12-22T03:00:07+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9167,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg"}],"tags":null},{"id":3800,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了","summary":"很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。","state":0,"browse":7,"created":"2020-12-22T02:57:09+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_tim
e":1516856296,"flag":0,"description":""},"pictures":[{"id":9164,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg"}],"tags":null},{"id":3799,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。","summary":"你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。","state":0,"browse":9,"created":"2020-12-22T02:56:24+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9162,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg"},{"id":9163,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg"}],"tags":null},{"id":3793,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:更可笑的是我瞒着所有人继续爱了你好久。","summary":"当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。","state":0,"browse":2,"created":"2020-12-22T02:51:21+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9147,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg"},{"id":9148,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg"},{"id":9149,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg"}],"tags":null},{"id":3792,"uid":"385c1458d707bd373cc47048f7f5be22","title":"杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。","summary":"可不可以不要靠近我,了解我,心
疼我,然后再离开我。","state":0,"browse":3,"created":"2020-12-22T02:50:06+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9145,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg"},{"id":9146,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg"}],"tags":null},{"id":3791,"uid":"385c1458d707bd373cc47048f7f5be22","title":"就是LYN:时光让我们相聚,时光却也让我们分离。","summary":"我走不进你的心,写不出你的梦,不管我付出多少都是个外人。","state":0,"browse":5,"created":"2020-12-22T02:49:14+08:00","classify":{"id":2,"name":"清纯美女","slug":"qcmn"},"author":{"uid":"385c1458d707bd373cc47048f7f5be22","author_name":"kstest","author_avatar":"https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg","login_name":"kstest","phone":"17711067850","email":"","create_time":1516856296,"flag":0,"description":""},"pictures":[{"id":9143,"img_url":"https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg"}],"tags":null}]}}}
This is a block of JSON. To make it easier to read, paste it into any online JSON viewer.
Everything we need is inside data.
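As an alternative to an online viewer, the response can be re-indented locally with the standard json module. This is only a sketch: pretty is a helper name made up here, and sample is a shortened, hand-written string in the same shape as the response above, not real output.

```python
import json

def pretty(raw_text: str) -> str:
    """Re-indent raw JSON text for reading; ensure_ascii=False keeps Chinese text readable."""
    return json.dumps(json.loads(raw_text), indent=2, ensure_ascii=False)

# Shortened, hand-written sample in the same shape as the real response (assumption).
sample = '{"response":{"status":1,"message":"请求成功","data":{"page":1,"list":[]}}}'
print(pretty(sample))
```

With the nesting visible, it is easy to confirm that the picture list sits under response, then data.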
Next, trim the head and tail of the response so that only the part we need remains, then convert the JSON into a Python dict.
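import requests
import json
import re

page = 1
path = 'qcmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Trim the head and tail so that only the value of "data" remains
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dict
    html = json.loads(html)
    print(type(html))
    print(html)

get_html(url)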
Output:
{'page': 1, 'page_size': 10, 'totalPage': 56, 'list': [{'id': 3807, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。', 'summary': '情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。', 'state': 0, 'browse': 17, 'created': '2020-12-22T03:03:08+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9171, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg'}, {'id': 9172, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg'}], 'tags': None}, {'id': 3806, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:在某个时刻(存照) \u200b\u200b\u200b\u200b', 'summary': '人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。', 'state': 0, 'browse': 9, 'created': '2020-12-22T03:02:23+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9170, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg'}], 'tags': None}, {'id': 3805, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:他知道你舍不得离开才肆无忌惮的伤害。', 'summary': '这一生,这一世,因为不再有你,所以爱情轰然老去。', 'state': 0, 'browse': 5, 'created': '2020-12-22T03:01:50+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 
'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9169, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg'}], 'tags': None}, {'id': 3804, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了', 'summary': '一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。', 'state': 0, 'browse': 6, 'created': '2020-12-22T03:01:22+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9168, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg'}], 'tags': None}, {'id': 3803, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:怀念半年前傻傻憨憨 快快乐乐的自己 \u200b\u200b\u200b\u200b', 'summary': '有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。', 'state': 0, 'browse': 7, 'created': '2020-12-22T03:00:07+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9167, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg'}], 'tags': None}, {'id': 3800, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了', 'summary': '很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。', 'state': 0, 'browse': 7, 'created': 
'2020-12-22T02:57:09+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9164, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg'}], 'tags': None}, {'id': 3799, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。', 'summary': '你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。', 'state': 0, 'browse': 9, 'created': '2020-12-22T02:56:24+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9162, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg'}, {'id': 9163, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg'}], 'tags': None}, {'id': 3793, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:更可笑的是我瞒着所有人继续爱了你好久。', 'summary': '当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。', 'state': 0, 'browse': 2, 'created': '2020-12-22T02:51:21+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9147, 'img_url': 
'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg'}, {'id': 9148, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg'}, {'id': 9149, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg'}], 'tags': None}, {'id': 3792, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。', 'summary': '可不可以不要靠近我,了解我,心疼我,然后再离开我。', 'state': 0, 'browse': 3, 'created': '2020-12-22T02:50:06+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9145, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg'}, {'id': 9146, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg'}], 'tags': None}, {'id': 3791, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '就是LYN:时光让我们相聚,时光却也让我们分离。', 'summary': '我走不进你的心,写不出你的梦,不管我付出多少都是个外人。', 'state': 0, 'browse': 5, 'created': '2020-12-22T02:49:14+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9143, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg'}], 'tags': None}]}
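The string trimming used above only works while the response starts with exactly that prefix, so a changed message or status would break it. A sketch of a less fragile variant parses the whole response and walks down to data. The name extract_data is made up for this sketch, sample is a shortened hand-written string, and treating any status other than 1 as a failure is an assumption based on the single response shown here.

```python
import json

def extract_data(raw_text: str) -> dict:
    """Parse the full response text and return the dict stored under response -> data."""
    payload = json.loads(raw_text)
    response = payload["response"]
    # Assumption: status 1 means success, as in the response shown above.
    if response.get("status") != 1:
        raise ValueError("unexpected status: {!r}".format(response.get("status")))
    return response["data"]

# Shortened, hand-written sample in the same shape as the real response.
sample = ('{"response":{"status":1,"message":"请求成功",'
          '"data":{"page":1,"page_size":10,"totalPage":56,"list":[]}}}')
data = extract_data(sample)
print(data["totalPage"])
```

Inside get_html, the same function could be given the text returned by requests.get(url); the rest of the walkthrough would then work on the returned dict unchanged.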
The content we want is in list.
Get the value stored under list:
import requests
import json
import re

page = 1
path = 'qcmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Trim the head and tail so that only the value of "data" remains
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dict
    html = json.loads(html)
    # print(type(html))
    # print(html)
    list1 = html['list']
    print(len(list1))
    print(list1)
    # for dict1 in list1:
    #     print(type(dict1))
    #     list2 = dict1['pictures']
    #     print(list2)
    #     print(type(list2))

get_html(url)
Result:
[{'id': 3807, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:又回到这座灯火阑珊的旧城,只是再也没有你陪我的黄昏。', 'summary': '情绪低落时,现实的你,网络上的你,一个假装快乐,一个真心难过。', 'state': 0, 'browse': 17, 'created': '2020-12-22T03:03:08+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9171, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/1a2ebb29df8634ea18fab0438aef067c.jpg'}, {'id': 9172, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/48260abe57dabd163a824eb0de8f540d.jpg'}], 'tags': None}, {'id': 3806, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:在某个时刻(存照) \u200b\u200b\u200b\u200b', 'summary': '人总是这样,终于到了懂得珍惜的年纪,却偏偏什么都走散了。', 'state': 0, 'browse': 9, 'created': '2020-12-22T03:02:23+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9170, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/182ea3276e49c000284f914dfe5ee4de.jpg'}], 'tags': None}, {'id': 3805, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:他知道你舍不得离开才肆无忌惮的伤害。', 'summary': '这一生,这一世,因为不再有你,所以爱情轰然老去。', 'state': 0, 'browse': 5, 'created': '2020-12-22T03:01:50+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 
'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9169, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/09b79f16080a773eb25b207c76c2eca6.jpg'}], 'tags': None}, {'id': 3804, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若Abby:年头觉得自己还蛮好看的 年尾怎么就长残了', 'summary': '一个人躲在角落里,悄悄的落泪,因为没有人值得我倾诉。', 'state': 0, 'browse': 6, 'created': '2020-12-22T03:01:22+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9168, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/8e190cb74b4b0f0ba526935b17aee9d8.jpg'}], 'tags': None}, {'id': 3803, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '林若:怀念半年前傻傻憨憨 快快乐乐的自己 \u200b\u200b\u200b\u200b', 'summary': '有裂痕的愛怎麽重蓋、悲傷要怎麽平靜純白。', 'state': 0, 'browse': 7, 'created': '2020-12-22T03:00:07+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9167, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/91b162607c9d204916f3d0d803ffcb75.jpg'}], 'tags': None}, {'id': 3800, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:情书也撕了,酒杯也碎了,别担心,你走吧,我不爱你了', 'summary': '很多事情没有来日方长,很多人都只会乍然离场。一组伤感的个性签名送给你们,希望伤心难过时能带给你一丝慰藉。', 'state': 0, 'browse': 7, 'created': '2020-12-22T02:57:09+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': 
{'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9164, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/b7ef615e0c03c602095d0092d370ba5c.jpg'}], 'tags': None}, {'id': 3799, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:微笑就像创可贴,掩饰了伤口,痛还在。', 'summary': '你的名字,写下来不过几厘米那么短,却贯穿了我那么长的时光。', 'state': 0, 'browse': 9, 'created': '2020-12-22T02:56:24+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9162, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a04b64835cdb0e8a666caec25bb64d3.jpg'}, {'id': 9163, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/74af547d45e9e329497cd10272608686.jpg'}], 'tags': None}, {'id': 3793, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:更可笑的是我瞒着所有人继续爱了你好久。', 'summary': '当身边结婚的朋友越来越多,当朋友圈晒娃的越来越多,有时候想想,大概是等不到你了。', 'state': 0, 'browse': 2, 'created': '2020-12-22T02:51:21+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9147, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7000fb6c54306bc2cd46bc593582fe70.jpg'}, {'id': 9148, 'img_url': 
'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7e59bcd7d53ab6fe7dfc11cc58d7cb51.jpg'}, {'id': 9149, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/ffa75d59348399f5495fbb98eae8d12d.jpg'}], 'tags': None}, {'id': 3792, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '杏子大人:不敢再炫耀身边有谁,害怕你突然间离开让我尴尬。', 'summary': '可不可以不要靠近我,了解我,心疼我,然后再离开我。', 'state': 0, 'browse': 3, 'created': '2020-12-22T02:50:06+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9145, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/7a0c2d7c44d85a7beb9d0e553c051d2d.jpg'}, {'id': 9146, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/6fd1dc7a272476d1dafba1d266e65c50.jpg'}], 'tags': None}, {'id': 3791, 'uid': '385c1458d707bd373cc47048f7f5be22', 'title': '就是LYN:时光让我们相聚,时光却也让我们分离。', 'summary': '我走不进你的心,写不出你的梦,不管我付出多少都是个外人。', 'state': 0, 'browse': 5, 'created': '2020-12-22T02:49:14+08:00', 'classify': {'id': 2, 'name': '清纯美女', 'slug': 'qcmn'}, 'author': {'uid': '385c1458d707bd373cc47048f7f5be22', 'author_name': 'kstest', 'author_avatar': 'https://imgs.aideep.com/img/0xu/2020/7/30/53bda5c29ec82f150c297ae2193c8191.jpg', 'login_name': 'kstest', 'phone': '17711067850', 'email': '', 'create_time': 1516856296, 'flag': 0, 'description': ''}, 'pictures': [{'id': 9143, 'img_url': 'https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/f863993340a222f2911d8c039ae1f58a.jpg'}], 'tags': None}]
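The result is a list.
Next, iterate over that list. As the output shows, each element of the list is a dict.
Use a for loop to go through every dict and take the value stored under pictures.
The code: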
import requests
import json
import re

page = 1
path = 'qcmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Trim the head and tail so that only the value of "data" remains
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dict
    html = json.loads(html)
    # print(type(html))
    # print(html)
    # Get the value stored under list (a list, list1)
    list1 = html['list']
    # print(type(list1))
    # print(len(list1))
    # print(list1)
    # Loop over every element of list1 (each one is a dict)
    for dict1 in list1:
        # print(type(dict1))
        # Take the value stored under pictures (img_url lives inside pictures)
        list2 = dict1['pictures']
        # print(list2)
        print(type(list2))

get_html(url)
Output:
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
<class 'list'>
list2 is a list as well.
Now iterate over every entry in list2 and take out its img_url:
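import requests
import json
import re

page = 2
path = 'qcmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    html = requests.get(url)
    html = html.text
    # Trim the head and tail so that only the value of "data" remains
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dict
    html = json.loads(html)
    # print(type(html))
    # print(html)
    # Get the value stored under list (a list, list1)
    list1 = html['list']
    # print(type(list1))
    # print(len(list1))
    # print(list1)
    # Loop over every element of list1 (each one is a dict)
    for dict1 in list1:
        # print(type(dict1))
        # Take the value stored under pictures (img_url lives inside pictures)
        list2 = dict1['pictures']
        print()
        # print(list2)
        # print(type(list2))
        # Loop over every entry of list2 (each one is a dict) and print its img_url
        for img_urls in list2:
            img_url = img_urls['img_url']
            print(img_url)

get_html(url)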
Output: the image URLs, printed one per line.
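The two nested loops can also be collapsed into one function that returns the URLs instead of printing them, which makes the next step (downloading) easier to reuse. This is a sketch: collect_image_urls is a made-up name, the example.com URLs are placeholders, and tolerating a missing or null pictures value is a precaution rather than something seen in the response above.

```python
def collect_image_urls(data: dict) -> list:
    """Return every img_url found under data['list'][*]['pictures'][*], in order."""
    return [
        pic["img_url"]
        for item in (data.get("list") or [])
        for pic in (item.get("pictures") or [])  # tolerate a missing or null pictures value
    ]

# Hand-written sample in the same shape as the parsed data dict (placeholder URLs).
sample_data = {
    "list": [
        {"id": 1, "pictures": [
            {"id": 10, "img_url": "https://example.com/a.jpg"},
            {"id": 11, "img_url": "https://example.com/b.jpg"},
        ]},
        {"id": 2, "pictures": None},
    ]
}
for img_url in collect_image_urls(sample_data):
    print(img_url)
```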
2. Download the image links and save them locally
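# This loop sits inside `for dict1 in list1:`; i is set to 0 at the top of get_html.
# Loop over every entry of list2 (each one is a dict) and take its img_url
for img_urls in list2:
    img_url = img_urls['img_url']
    i = i + 1
    # print(i)
    # print(img_url)
    try:
        pic = requests.get(img_url, timeout=10)
        # Treat HTTP error statuses as failures instead of saving an error page as .jpg
        pic.raise_for_status()
        # The ./images/ directory must already exist at this point
        with open('./images/{0}.jpg'.format(str(i)), "wb") as f:
            print("Downloading image {0}:".format(str(i)))
            f.write(pic.content)
    except requests.exceptions.RequestException:
        # Covers connection errors, timeouts and HTTP errors
        print('This image could not be downloaded')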
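Output: a progress line is printed for each image, and the files are written to ./images/ as 1.jpg, 2.jpg, and so on.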
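Because the files are numbered with a counter that starts again on every run, a second page or category would overwrite 1.jpg, 2.jpg, and so on. One possible way around this is to reuse the file name that is already part of each image URL. The helper name filename_from_url and the fallback name unnamed.jpg are made up for this sketch.

```python
import posixpath
from urllib.parse import urlparse

def filename_from_url(img_url: str) -> str:
    """Return the last path segment of the URL, ignoring any query string."""
    name = posixpath.basename(urlparse(img_url).path)
    # Fall back to a fixed placeholder name if the URL ends with a slash.
    return name or "unnamed.jpg"

# One of the image URLs from the response shown earlier.
print(filename_from_url(
    "https://imgs.knowsafe.com:8087/img/0xuoldgallery/2020-12-22/"
    "1a2ebb29df8634ea18fab0438aef067c.jpg"
))
```

In the download loop, './images/' + filename_from_url(img_url) could then replace the counter-based path.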
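Changing the path to the "长腿美女" category (ctmn) works the same way.
After finishing, I realized, a little awkwardly, that all of this was more work than necessary.
https://www.0xu.cn/gallery/qcmn/1
You can simply enumerate the trailing number in this URL (roughly 1 to 3500 for each category).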
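Enumerating that number only requires generating the URLs. The sketch below does just that; gallery_urls is a made-up name, the default upper bound 3500 is the rough estimate given above, and what each of these pages returns is not shown in this article, so fetching and parsing them is left out.

```python
def gallery_urls(category: str, first: int = 1, last: int = 3500):
    """Yield gallery page URLs for the given category, numbered first..last inclusive."""
    for number in range(first, last + 1):
        yield "https://www.0xu.cn/gallery/{0}/{1}".format(category, number)

# Show the first three URLs for the qcmn category.
for gallery_url in gallery_urls("qcmn", 1, 3):
    print(gallery_url)
```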
import requests
import json
import re
import os

page = 1
path = 'ctmn'
# Category page as seen in the browser (kept for reference; overwritten on the next line)
url = 'https://www.0xu.cn/gallery/' + path
# JSON endpoint that actually returns the picture list
url = 'https://www.0xu.cn/gallery/list?page=' + str(page) + '&category=' + path

def get_html(url):
    i = 0  # running image counter, used as the file name
    html = requests.get(url)
    html = html.text
    # Trim the head and tail so that only the value of "data" remains
    html = html.replace('{"response":{"status":1,"message":"请求成功","data":', '').replace('}}', '')
    # Convert the JSON text into a Python dict
    html = json.loads(html)
    # Get the value stored under list (a list, list1)
    list1 = html['list']
    # Loop over every element of list1 (each one is a dict)
    for dict1 in list1:
        # Take the value stored under pictures (img_url lives inside pictures)
        list2 = dict1['pictures']
        print()
        # Loop over every entry of list2 (each one is a dict) and take its img_url
        for img_urls in list2:
            img_url = img_urls['img_url']
            i = i + 1
            try:
                pic = requests.get(img_url, timeout=10)
                # Treat HTTP error statuses as failures instead of saving an error page as .jpg
                pic.raise_for_status()
                lujin = './images/'  # output directory
                if not os.path.isdir(lujin):
                    os.makedirs(lujin)
                with open(lujin + '{0}.jpg'.format(str(i)), "wb") as f:
                    print("Downloading image {0}:".format(str(i)))
                    f.write(pic.content)
            except requests.exceptions.RequestException:
                # Covers connection errors, timeouts and HTTP errors
                print('This image could not be downloaded')

get_html(url)
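The complete code is above. I plan to post a revised version on my WeChat official account, probably this weekend.
If you are into network security and Python, feel free to follow the account.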
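The script above fetches a single page of one category. Since the response reports totalPage (56 in the sample shown earlier), one way to cover a whole category is to read that field from page 1 and then request the remaining pages. The sketch below keeps the network call outside the function: list_url and iter_pages are made-up names, and fetch_data stands for any function that takes a URL and returns the parsed data dict.

```python
from urllib.parse import urlencode

API = "https://www.0xu.cn/gallery/list"

def list_url(category: str, page: int) -> str:
    """Build the list endpoint URL for one page of one category."""
    return API + "?" + urlencode({"page": page, "category": category})

def iter_pages(category: str, fetch_data):
    """Yield the data dict of every page; fetch_data(url) must return that dict."""
    first = fetch_data(list_url(category, 1))
    yield first
    # totalPage comes from the first response; default to 1 if it is absent.
    for page in range(2, int(first.get("totalPage", 1)) + 1):
        yield fetch_data(list_url(category, page))

print(list_url("qcmn", 1))
```

In practice fetch_data would wrap requests.get and the JSON parsing from the earlier steps, and a short pause between requests keeps the load on the site low.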