The fourth day of Crawler learning
爬取58同城
from bs4 import BeautifulSoup
import requests
url = "https://qd.58.com/diannao/35200617992782x.shtml"
web_data = requests.get(url)
soup = BeautifulSoup(web_data.text, 'lxml')
title = soup.title.text
cost = soup.select("div#basicinfo span.infocard__container__item__main__text--price")
time = soup.select(".detail-title__info__text:nth-child(1)")
visitor = soup.select("span#totalcount")
area = soup.select("div.infocard__container__item:nth-child(3)>div.infocard__container__item__main")
who = soup.select("div.infocard__container__item:nth-child(4)>div.infocard__container__item__main")
data = {
"title": title,
"cost": cost[0].get_text().strip(),
"time": time[0].get_text().strip(),
"area": list(area[0].stripped_strings),
"who": who[0].get_text().strip(),
"visitor": visitor[0].get_text().strip()
}
print(data)
{'title': '现货400多台液晶电脑,低价出售,保修一年,可送货,李村附近,需要请联系! - 青岛58同城', 'cost': '350 元', 'time': '2018-08-23 发布', 'area': ['李沧', '-', '李村'], 'who': '李先生', 'visitor': '0'}
The fourth day of Crawler learning的更多相关文章
- The sixth day of Crawler learning
爬取我爱竞赛网的大量数据 首先获取每一种比赛信息的分类链接 def get_type_url(url): web_data = requests.get(web_url) soup = B ...
- The fifth day of Crawler learning
使用mongoDB 下载地址:https://www.mongodb.com/dr/fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl ...
- The third day of Crawler learning
连续爬取多页数据 分析每一页url的关联找出联系 例如虎扑 第一页:https://voice.hupu.com/nba/1 第二页:https://voice.hupu.com/nba/2 第三页: ...
- The second day of Crawler learning
用BeatuifulSoup和Requests爬取猫途鹰网 服务器与本地的交换机制 我们每次浏览网页都是再向网页所在的服务器发送一个Request,然后服务器接受到Request后返回Response ...
- The first day of Crawler learning
使用BeautifulSoup解析网页 Soup = BeautifulSoup(urlopen(html),'lxml') Soup为汤,html为食材,lxml为菜谱 from bs4 impor ...
- Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learning Project
Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learn ...
- Machine and Deep Learning with Python
Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...
- 深度学习论文笔记-Deep Learning Face Representation from Predicting 10,000 Classes
来自:CVPR 2014 作者:Yi Sun ,Xiaogang Wang,Xiaoao Tang 题目:Deep Learning Face Representation from Predic ...
- Machine Learning for Developers
Machine Learning for Developers Most developers these days have heard of machine learning, but when ...
随机推荐
- android 数据存储----android短信发送器之文件的读写(手机+SD卡)
本文实践知识点有有三: 1.布局文件,android布局有相对布局,线性布局,绝对布局,表格布局,标签布局等.各个布局能够嵌套的.本文的布局文件就是线性布局的嵌套 <LinearLayout x ...
- Myeclipse jdk的安装
- Redis源码解析:04字典的遍历dictScan
dict.c中的dictScan函数,用来遍历字典,迭代其中的每个元素.该函数使用的算法非常精妙!!!所以必须记录一下. 遍历一个稳定的字典,当然不是什么难事,但Redis中的字典因为有rehash的 ...
- 使用属性position:fixed的时候如何才能让div居中
css: .aa{ position: fixed; top: 200px; left: 0px; right: 0px; width: 200px; height: 200px; margin-le ...
- jq制作tab栏
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8&quo ...
- Python--day63--添加书籍
添加书籍的代码:
- 在Element节点上进行Xpath
XPathFactory xPathFactory = XPathFactory.newInstance(); XPath xpath = xPathFactory.newXPath(); try { ...
- [转]爬虫 selenium + phantomjs / chrome
目录 selenium 模块 安装 phantomjs 浏览器 安装 chromedriver 接口 安装 对比两个接口 整合使用 基本实例 常用属性方法 定位节点 节点操作 其他操作 实例解析 - ...
- 2019-8-2-WPF-从文件加载字体
title author date CreateTime categories WPF 从文件加载字体 lindexi 2019-08-02 17:10:33 +0800 2018-2-13 17:2 ...
- 【js】Vue 2.5.1 源码学习 (八)响应式入口observe
大体思路(七) 本节内容: deps 依赖收集的数组对象 => Dep 构造函数 /** ==> observe() * var ob * ==> if --isObject * = ...