Web Scraping using Python Scrapy_BS4 - using BeautifulSoup and Python
Use BeautifulSoup and Python to scrape a website
Libraries:
- urllib, for fetching the page
- BeautifulSoup (bs4), for parsing HTML data
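As a minimal sketch of the fetching step with urllib, the snippet below wraps urlopen in a with-block so the connection is closed automatically. The helper name fetch_html and the data: URL are assumptions made for illustration; the data: URL only stands in for a real page so the example runs without network access.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_html(url, encoding="utf-8"):
    """Download a page and return its body as text (hypothetical helper)."""
    # The with-block closes the connection even if read() raises.
    with urlopen(url) as response:
        return response.read().decode(encoding)


# Assumption: a data: URL stands in for a real page so the sketch runs offline.
sample_url = "data:text/html;charset=utf-8," + quote("<p class='aquote'>Hello</p>")
print(fetch_html(sample_url))
```

Passing a real address such as the quotes page used in the script would work the same way, provided the page is encoded as UTF-8.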
Web scraping script
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"

# Download the page and close the connection once the HTML is read
uClient = uReq(quotes_page)
page_html = uClient.read()
uClient.close()

# Parse the HTML and collect every quote container
page_soup = soup(page_html, "html.parser")
quotes = page_soup.find_all("div", {"class": "quotes"})

for quote in quotes:
    # Each container holds one quote paragraph and one author paragraph
    fav_quote = quote.find_all("p", {"class": "aquote"})
    aquote = fav_quote[0].text.strip()
    fav_authors = quote.find_all("p", {"class": "author"})
    author = fav_authors[0].text.strip()
    print(aquote)
    print(author)
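The script above indexes fav_quote[0] and fav_authors[0] directly, which raises an IndexError if a container is missing either paragraph. One possible variation, sketched here under assumptions, wraps the parsing in a function and skips or fills in missing elements. The function name extract_quotes, the "Unknown" placeholder, and the sample markup are hypothetical; the markup is only assumed to mirror the class names the script looks for.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, assumed to mirror the class names used by the script.
SAMPLE_HTML = """
<div class="quotes">
  <p class="aquote">The journey of a thousand miles begins with one step.</p>
  <p class="author">Lao Tzu</p>
</div>
<div class="quotes">
  <p class="aquote">Imagination is more important than knowledge.</p>
</div>
"""


def extract_quotes(html):
    """Return a list of (quote, author) tuples found in the HTML."""
    page_soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in page_soup.find_all("div", class_="quotes"):
        quote_tag = block.find("p", class_="aquote")
        author_tag = block.find("p", class_="author")
        if quote_tag is None:
            # Skip containers that hold no quote paragraph at all.
            continue
        # Fall back to a placeholder when the author paragraph is missing.
        author = author_tag.text.strip() if author_tag else "Unknown"
        results.append((quote_tag.text.strip(), author))
    return results


for quote_text, author in extract_quotes(SAMPLE_HTML):
    print(f"{quote_text} ({author})")
```

Returning a list of tuples instead of printing inside the loop keeps the parsing separate from the output, so the same function could later feed a file or a database.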
The script runs successfully and prints each quote followed by its author.

The following is the complete output of the scrape:
I hear and i forget. I see and i remember. I do and i understand.
Confucious
Feeling gratitude and not expressing it is like wrapping a present and not giving it.
William Arthur Ward
Our greatest glory is not in never falling but in rising every time we fall.
Confucious
The secret of getting aheadis getting started.
Mark Twain
Believe you can and you're halfway there.
Theodore Roosevelt
Resentment is like drinking Poison and waiting for your enemies to die.
Nelson Mandela
Silence is a true friend who never betrays.
Confucius
The best way to find yourself is to lose yourself in the service of others.
Mahatma Gandhi
Never succumb to the temptation of bitterness.
Martin Luther King Jnr
The journey of a thousand miles begins with one step.
Lao Tzu
It is health that is real wealth and not pieces of gold and silver.
Mahatma Gandhi
Yesterday is not ours to recover but tomorrow is ours to win or lose.
Lyndon B Johnson
It's not what happens to you but how you react to it that matters .
Epictetus
Beware of what you become in pursuit of what you want.
Jim Rohn
The best revenge is massive success.
Frank Sinatra
Do not take life too seriously You will never get out of it alive.
Elbert Hubbard
Don't judge each day by the harvest you reap but by the seeds that yiu plant.
Robert Loius Stevenson
Your attitude and not your aptitude will determine your altitude
Zig Ziglar
Imagination is more important than knowledge.
Albert Einstein