爬取58同城

from bs4 import BeautifulSoup
import requests
url = "https://qd.58.com/diannao/35200617992782x.shtml"
web_data = requests.get(url)
soup = BeautifulSoup(web_data.text, 'lxml')

title = soup.title.text
cost = soup.select("div#basicinfo span.infocard__container__item__main__text--price")
time = soup.select(".detail-title__info__text:nth-child(1)")
visitor = soup.select("span#totalcount")
area = soup.select("div.infocard__container__item:nth-child(3)>div.infocard__container__item__main")
who = soup.select("div.infocard__container__item:nth-child(4)>div.infocard__container__item__main")
data = {
   "title": title,
   "cost": cost[0].get_text().strip(),
   "time": time[0].get_text().strip(),
   "area": list(area[0].stripped_strings),
   "who": who[0].get_text().strip(),
   "visitor": visitor[0].get_text().strip()
}
print(data)
{'title': '现货400多台液晶电脑,低价出售,保修一年,可送货,李村附近,需要请联系! - 青岛58同城', 'cost': '350 元', 'time': '2018-08-23 发布', 'area': ['李沧', '-', '李村'], 'who': '李先生', 'visitor': '0'}

The fourth day of Crawler learning的更多相关文章

  1. The sixth day of Crawler learning

    爬取我爱竞赛网的大量数据 首先获取每一种比赛信息的分类链接 def get_type_url(url):    web_data = requests.get(web_url)    soup = B ...

  2. The fifth day of Crawler learning

    使用mongoDB 下载地址:https://www.mongodb.com/dr/fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl ...

  3. The third day of Crawler learning

    连续爬取多页数据 分析每一页url的关联找出联系 例如虎扑 第一页:https://voice.hupu.com/nba/1 第二页:https://voice.hupu.com/nba/2 第三页: ...

  4. The second day of Crawler learning

    用BeatuifulSoup和Requests爬取猫途鹰网 服务器与本地的交换机制 我们每次浏览网页都是再向网页所在的服务器发送一个Request,然后服务器接受到Request后返回Response ...

  5. The first day of Crawler learning

    使用BeautifulSoup解析网页 Soup = BeautifulSoup(urlopen(html),'lxml') Soup为汤,html为食材,lxml为菜谱 from bs4 impor ...

  6. Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learning Project

    Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learn ...

  7. Machine and Deep Learning with Python

    Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...

  8. 深度学习论文笔记-Deep Learning Face Representation from Predicting 10,000 Classes

    来自:CVPR 2014   作者:Yi Sun ,Xiaogang Wang,Xiaoao Tang 题目:Deep Learning Face Representation from Predic ...

  9. Machine Learning for Developers

    Machine Learning for Developers Most developers these days have heard of machine learning, but when ...

随机推荐

  1. sql select时增加常量列

    阅读更多 string sql="select a,b,'常量' as c from table" 注:单引号' ' 很重要,否则编译时会把其看成查询参数,从而提示参数未指定错误. ...

  2. 云原生生态周报 Vol. 2

    摘要: Cloud Native Weekly China Vol. 2 业界要闻 Kubernetes External Secrets 近日,世界上最大的域名托管公司 Godaddy公司,正式宣布 ...

  3. 使用DECLARE定义条件和处理程序

    定义条件和处理程序是事先定义程序执行过程中可能遇到的问题,并且可以在处理程序中定义解决这些问题的办法,可以简单理解 为异常处理,这种方式可以提前预测可能出现的问题,并提出解决办法,从而增强程序健壮性. ...

  4. Lists and keys

    function NumberList(props) { const numbers = props.numbers; const listItems = numbers.map((number) = ...

  5. 在ThinkPHP中,if标签和比较标签对于变量的比较。

    在TP模板语言中.if和eq都可以用于变量的比较. <比较标签 name="变量" value="值">内容</比较标签> 比如: &l ...

  6. webstorm破解教程

    1.下载地址 官网:https://www.jetbrains.com/webstorm/ 下载好之后按照提示安装即可,这里就不再多说了.下面直接说说如何使用补丁破解. 2.使用补丁破解 (http: ...

  7. adam调参

    微调 #阿尔法 "learning_rate": 3e-5, #学习率衰减 "weight_decay": 0.1,// "weight_decay& ...

  8. 从浏览器的url中获取查询字符串的参数

    正则表达式: function getQuery(name){ var reg = new RegExp("(^|&)" + name + "=([^&] ...

  9. HTML--CSS样式表--基本概念(超链接的状态)

    样式表的基本概念 一.样式表的分类 1.内联样式表 和HTML联合显示,控制精确,但是可重用性差,冗余较多. 例:<p style="font-size:14px;"> ...

  10. UVa11400 - Lighting System Design——[动态规划]

    题干略. 题意分析: 很容易理解一类灯泡要么全部换要么全不换,其实费用节省的主要原因是由于替换灯泡类型而排除了低压电压源,于是我们就可以推断出灯泡类型替换的原则: 对于两类灯泡a1和a2,a1可以被a ...