Scraping 58同城 (58.com)

# Fetch a single 58同城 computer listing and parse it with BeautifulSoup
from bs4 import BeautifulSoup
import requests

url = "https://qd.58.com/diannao/35200617992782x.shtml"
web_data = requests.get(url)
soup = BeautifulSoup(web_data.text, 'lxml')

# Pick out the listing fields with CSS selectors
title = soup.title.text
cost = soup.select("div#basicinfo span.infocard__container__item__main__text--price")   # asking price
time = soup.select(".detail-title__info__text:nth-child(1)")                            # publish date
visitor = soup.select("span#totalcount")                                                # view counter
area = soup.select("div.infocard__container__item:nth-child(3)>div.infocard__container__item__main")   # district / neighbourhood
who = soup.select("div.infocard__container__item:nth-child(4)>div.infocard__container__item__main")    # seller name

# Take the first match of each selector and collect everything into a dictionary
data = {
    "title": title,
    "cost": cost[0].get_text().strip(),
    "time": time[0].get_text().strip(),
    "area": list(area[0].stripped_strings),   # district and neighbourhood as separate strings
    "who": who[0].get_text().strip(),
    "visitor": visitor[0].get_text().strip()
}
print(data)
Output:

{'title': '现货400多台液晶电脑,低价出售,保修一年,可送货,李村附近,需要请联系! - 青岛58同城', 'cost': '350 元', 'time': '2018-08-23 发布', 'area': ['李沧', '-', '李村'], 'who': '李先生', 'visitor': '0'}
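In practice 58同城 sometimes serves a verification page or rejects the default requests User-Agent, and any selector that matches nothing makes the bare cost[0]-style indexing raise an IndexError. Below is a minimal, more defensive sketch of the same extraction; the get_item and first_text helpers, the User-Agent string, and the timeout value are my own assumptions rather than part of the original post (the area field, which uses stripped_strings, is left out for brevity).

from bs4 import BeautifulSoup
import requests

# Assumed header: pretend to be a normal browser, since the default requests UA may be blocked
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

def get_item(url):
    """Fetch one listing and return a dict; missing fields come back as None."""
    web_data = requests.get(url, headers=HEADERS, timeout=10)
    web_data.raise_for_status()                      # raise on 4xx/5xx responses
    soup = BeautifulSoup(web_data.text, 'lxml')

    def first_text(selector):
        # Stripped text of the first match, or None if the selector matched nothing
        nodes = soup.select(selector)
        return nodes[0].get_text().strip() if nodes else None

    return {
        "title": soup.title.text if soup.title else None,
        "cost": first_text("div#basicinfo span.infocard__container__item__main__text--price"),
        "time": first_text(".detail-title__info__text:nth-child(1)"),
        "visitor": first_text("span#totalcount"),
        "who": first_text("div.infocard__container__item:nth-child(4)>div.infocard__container__item__main"),
    }

print(get_item("https://qd.58.com/diannao/35200617992782x.shtml"))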

The fourth day of Crawler learning的更多相关文章

  1. The sixth day of Crawler learning

    爬取我爱竞赛网的大量数据 首先获取每一种比赛信息的分类链接 def get_type_url(url):    web_data = requests.get(web_url)    soup = B ...

  2. The fifth day of Crawler learning

    使用mongoDB 下载地址:https://www.mongodb.com/dr/fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl ...

  3. The third day of Crawler learning

    连续爬取多页数据 分析每一页url的关联找出联系 例如虎扑 第一页:https://voice.hupu.com/nba/1 第二页:https://voice.hupu.com/nba/2 第三页: ...

  4. The second day of Crawler learning

    用BeatuifulSoup和Requests爬取猫途鹰网 服务器与本地的交换机制 我们每次浏览网页都是再向网页所在的服务器发送一个Request,然后服务器接受到Request后返回Response ...

  5. The first day of Crawler learning

    使用BeautifulSoup解析网页 Soup = BeautifulSoup(urlopen(html),'lxml') Soup为汤,html为食材,lxml为菜谱 from bs4 impor ...

  6. Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learning Project

    Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learn ...

  7. Machine and Deep Learning with Python

    Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...

  8. 深度学习论文笔记-Deep Learning Face Representation from Predicting 10,000 Classes

    来自:CVPR 2014   作者:Yi Sun ,Xiaogang Wang,Xiaoao Tang 题目:Deep Learning Face Representation from Predic ...

  9. Machine Learning for Developers

    Machine Learning for Developers Most developers these days have heard of machine learning, but when ...

随机推荐

  1. android 数据存储----android短信发送器之文件的读写(手机+SD卡)

    本文实践知识点有有三: 1.布局文件,android布局有相对布局,线性布局,绝对布局,表格布局,标签布局等.各个布局能够嵌套的.本文的布局文件就是线性布局的嵌套 <LinearLayout x ...

  2. Myeclipse jdk的安装

  3. Redis源码解析:04字典的遍历dictScan

    dict.c中的dictScan函数,用来遍历字典,迭代其中的每个元素.该函数使用的算法非常精妙!!!所以必须记录一下. 遍历一个稳定的字典,当然不是什么难事,但Redis中的字典因为有rehash的 ...

  4. 使用属性position:fixed的时候如何才能让div居中

    css: .aa{ position: fixed; top: 200px; left: 0px; right: 0px; width: 200px; height: 200px; margin-le ...

  5. jq制作tab栏

    <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...

  6. Python--day63--添加书籍

    添加书籍的代码:

  7. 在Element节点上进行Xpath

    XPathFactory xPathFactory = XPathFactory.newInstance(); XPath xpath = xPathFactory.newXPath(); try { ...

  8. [转]爬虫 selenium + phantomjs / chrome

    目录 selenium 模块 安装 phantomjs 浏览器 安装 chromedriver 接口 安装 对比两个接口 整合使用 基本实例 常用属性方法 定位节点 节点操作 其他操作 实例解析 - ...

  9. 2019-8-2-WPF-从文件加载字体

    title author date CreateTime categories WPF 从文件加载字体 lindexi 2019-08-02 17:10:33 +0800 2018-2-13 17:2 ...

  10. 【js】Vue 2.5.1 源码学习 (八)响应式入口observe

    大体思路(七) 本节内容: deps 依赖收集的数组对象 => Dep 构造函数 /** ==> observe() * var ob * ==> if --isObject * = ...