The fourth day of Crawler learning
爬取58同城
from bs4 import BeautifulSoup
import requests
url = "https://qd.58.com/diannao/35200617992782x.shtml"
web_data = requests.get(url)
soup = BeautifulSoup(web_data.text, 'lxml')
title = soup.title.text
cost = soup.select("div#basicinfo span.infocard__container__item__main__text--price")
time = soup.select(".detail-title__info__text:nth-child(1)")
visitor = soup.select("span#totalcount")
area = soup.select("div.infocard__container__item:nth-child(3)>div.infocard__container__item__main")
who = soup.select("div.infocard__container__item:nth-child(4)>div.infocard__container__item__main")
data = {
"title": title,
"cost": cost[0].get_text().strip(),
"time": time[0].get_text().strip(),
"area": list(area[0].stripped_strings),
"who": who[0].get_text().strip(),
"visitor": visitor[0].get_text().strip()
}
print(data)
{'title': '现货400多台液晶电脑,低价出售,保修一年,可送货,李村附近,需要请联系! - 青岛58同城', 'cost': '350 元', 'time': '2018-08-23 发布', 'area': ['李沧', '-', '李村'], 'who': '李先生', 'visitor': '0'}
The fourth day of Crawler learning的更多相关文章
- The sixth day of Crawler learning
爬取我爱竞赛网的大量数据 首先获取每一种比赛信息的分类链接 def get_type_url(url): web_data = requests.get(web_url) soup = B ...
- The fifth day of Crawler learning
使用mongoDB 下载地址:https://www.mongodb.com/dr/fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl ...
- The third day of Crawler learning
连续爬取多页数据 分析每一页url的关联找出联系 例如虎扑 第一页:https://voice.hupu.com/nba/1 第二页:https://voice.hupu.com/nba/2 第三页: ...
- The second day of Crawler learning
用BeatuifulSoup和Requests爬取猫途鹰网 服务器与本地的交换机制 我们每次浏览网页都是再向网页所在的服务器发送一个Request,然后服务器接受到Request后返回Response ...
- The first day of Crawler learning
使用BeautifulSoup解析网页 Soup = BeautifulSoup(urlopen(html),'lxml') Soup为汤,html为食材,lxml为菜谱 from bs4 impor ...
- Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learning Project
Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learn ...
- Machine and Deep Learning with Python
Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...
- 深度学习论文笔记-Deep Learning Face Representation from Predicting 10,000 Classes
来自:CVPR 2014 作者:Yi Sun ,Xiaogang Wang,Xiaoao Tang 题目:Deep Learning Face Representation from Predic ...
- Machine Learning for Developers
Machine Learning for Developers Most developers these days have heard of machine learning, but when ...
随机推荐
- sql select时增加常量列
阅读更多 string sql="select a,b,'常量' as c from table" 注:单引号' ' 很重要,否则编译时会把其看成查询参数,从而提示参数未指定错误. ...
- 云原生生态周报 Vol. 2
摘要: Cloud Native Weekly China Vol. 2 业界要闻 Kubernetes External Secrets 近日,世界上最大的域名托管公司 Godaddy公司,正式宣布 ...
- 使用DECLARE定义条件和处理程序
定义条件和处理程序是事先定义程序执行过程中可能遇到的问题,并且可以在处理程序中定义解决这些问题的办法,可以简单理解 为异常处理,这种方式可以提前预测可能出现的问题,并提出解决办法,从而增强程序健壮性. ...
- Lists and keys
function NumberList(props) { const numbers = props.numbers; const listItems = numbers.map((number) = ...
- 在ThinkPHP中,if标签和比较标签对于变量的比较。
在TP模板语言中.if和eq都可以用于变量的比较. <比较标签 name="变量" value="值">内容</比较标签> 比如: &l ...
- webstorm破解教程
1.下载地址 官网:https://www.jetbrains.com/webstorm/ 下载好之后按照提示安装即可,这里就不再多说了.下面直接说说如何使用补丁破解. 2.使用补丁破解 (http: ...
- adam调参
微调 #阿尔法 "learning_rate": 3e-5, #学习率衰减 "weight_decay": 0.1,// "weight_decay& ...
- 从浏览器的url中获取查询字符串的参数
正则表达式: function getQuery(name){ var reg = new RegExp("(^|&)" + name + "=([^&] ...
- HTML--CSS样式表--基本概念(超链接的状态)
样式表的基本概念 一.样式表的分类 1.内联样式表 和HTML联合显示,控制精确,但是可重用性差,冗余较多. 例:<p style="font-size:14px;"> ...
- UVa11400 - Lighting System Design——[动态规划]
题干略. 题意分析: 很容易理解一类灯泡要么全部换要么全不换,其实费用节省的主要原因是由于替换灯泡类型而排除了低压电压源,于是我们就可以推断出灯泡类型替换的原则: 对于两类灯泡a1和a2,a1可以被a ...