Use BeautifulSoup and Python to scrap a website

Lib:

  • urllib
  • Parsing HTML Data

Web scraping script

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"
uClient = uReq(quotes_page)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
quotes = page_soup.findAll("div", {"class":"quotes"}) for quote in quotes:
fav_quote = quote.findAll("p", {"class":"aquote"})
aquote = fav_quote[0].text.strip() fav_authors = quote.findAll("p",{"class":"author"})
author = fav_authors[0].text.strip() print(aquote)
print(author)

Run this script successfully

Following is the whole result of this scraping.

I hear and i forget. I see and i remember. I do and i understand.
Confucious
Feeling gratitude and not expressing it is like wrapping a present and not giving it.
William Arthur Ward
Our greatest glory is not in never falling but in rising every time we fall.
Confucious
The secret of getting aheadis getting started.
Mark Twain
Believe you can and you're halfway there.
Theodore Roosevelt
Resentment is like drinking Poison and waiting for your enemies to die.
Nelson Mandela
Silence is a true friend who never betrays.
Confucius
The best way to find yourself is to lose yourself in the service of others.
Mahatma Gandhi
Never succumb to the temptation of bitterness.
Martin Luther King Jnr
The journey of a thousand miles begins with one step.
Lao Tzu
It is health that is real wealth and not pieces of gold and silver.
Mahatma Gandhi
Yesterday is not ours to recover but tomorrow is ours to win or lose.
Lyndon B Johnson
It's not what happens to you but how you react to it that matters .
Epictetus
Beware of what you become in pursuit of what you want.
Jim Rohn
The best revenge is massive success.
Frank Sinatra
Do not take life too seriously You will never get out of it alive.
Elbert Hubbard
Don't judge each day by the harvest you reap but by the seeds that yiu plant.
Robert Loius Stevenson
Your attitude and not your aptitude will determine your altitude
Zig Ziglar
Imagination is more important than knowledge.
Albert Einstein

.

Web Scraping using Python Scrapy_BS4 - using BeautifulSoup and Python的更多相关文章

  1. Web Scraping using Python Scrapy_BS4 - using Scrapy and Python(2)

    Scrapy Architecture Creating a Spider. Spiders are classes that you define that Scrapy uses to scrap ...

  2. Web Scraping using Python Scrapy_BS4 - using Scrapy and Python(1)

    Create a new Scrapy project first. scrapy startproject projectName . Open this project in Visual Stu ...

  3. Web Scraping using Python Scrapy_BS4 - Software

    Install the following software before web scraping. Visual Studio Code Python and Pip pip install vi ...

  4. Web Scraping using Python Scrapy_BS4 - Introduction

    What is Web Scraping This is also referred to as web harvesting and web data extraction. This is the ...

  5. Web Scraping with Python读书笔记及思考

    Web Scraping with Python读书笔记 标签(空格分隔): web scraping ,python 做数据抓取一定一定要明确:抓取\解析数据不是目的,目的是对数据的利用 一般的数据 ...

  6. <Web Scraping with Python>:Chapter 1 & 2

    <Web Scraping with Python> Chapter 1 & 2: Your First Web Scraper & Advanced HTML Parsi ...

  7. 《Web Scraping With Python》Chapter 2的学习笔记

    You Don't Always Need a Hammer When Michelangelo was asked how he could sculpt a work of art as mast ...

  8. 阅读OReilly.Web.Scraping.with.Python.2015.6笔记---Crawl

    阅读OReilly.Web.Scraping.with.Python.2015.6笔记---Crawl 1.函数调用它自身,这样就形成了一个循环,一环套一环: from urllib.request ...

  9. 阅读OReilly.Web.Scraping.with.Python.2015.6笔记---找出网页中所有的href

    阅读OReilly.Web.Scraping.with.Python.2015.6笔记---找出网页中所有的href 1.查找以<a>开头的所有文本,然后判断href是否在<a> ...

随机推荐

  1. 四分位数与pandas中的quantile函数

    四分位数与pandas中的quantile函数 1.分位数概念 统计学上的有分位数这个概念,一般用p来表示.原则上p是可以取0到1之间的任意值的.但是有一个四分位数是p分位数中较为有名的. 所谓四分位 ...

  2. 这篇文章,我们来谈一谈Spring中的属性注入

    本系列文章: 读源码,我们可以从第一行读起 你知道Spring是怎么解析配置类的吗? 配置类为什么要添加@Configuration注解? 谈谈Spring中的对象跟Bean,你知道Spring怎么创 ...

  3. Java垃圾回收机制(GC)

    Java内存分配机制 这里所说的内存分配,主要指的是在堆上的分配,一般的,对象的内存分配都是在堆上进行,但现代技术也支持将对象拆成标量类型(标量类型即原子类型,表示单个值,可以是基本类型或String ...

  4. SSH网上商城二

    1.实现的功能如下 当用户登陆成功之后,在首页显示所有的一级分类 显示热门商品 显示最新商品 当用户点击某个一级分类的菜单选项的时候,显示当前一级分类菜单项下所有的二级分类,并且按照分页的形式显示该二 ...

  5. 【Spring】@Transactional 闲聊

    菜瓜:上次的AOP理论知识看完收获挺多的,虽然有一个自定义注解的demo,但还是觉得差点东西 水稻:我也觉得没有跟一遍源码还是差点意思,这次结合@Transactional注解深入源码看一下 菜瓜:事 ...

  6. JNI调用Cython生成库‘undefined symbol: PyInit_’问题

    最近项目需要提升所有 Python 算法的执行时间,并给 Java 框架调用,根据 Python一键转Jar包,Java调用Python新姿势!的思路可以用 Cython 将 Python 代码转换为 ...

  7. java语言基础(四)_面向对象_类_对象_封装_构造

    面向对象 Java语言是一种面向对象的程序设计语言,而面向对象思想是一种程序设计思想,我们在面向对象思想的指引下,使用Java语言去设计.开发计算机程序. 这里的对象泛指现实中一切事物,每种事物都具备 ...

  8. java实现在一个字符串中查找某个子字符串出现的次数

    public static void main(String[] args) { String a = "我爱我的祖国!!!"; String b = "爱"; ...

  9. PHP实现邮箱验证码验证功能

    *文章来源:https://blog.egsec.cn/archives/623  (我的主站) *本文将主要说明:PHP实现邮箱验证码验证功能,通过注册或登录向用户发送身份确认验证码,并通过判断输入 ...

  10. css自动省略号...,通过css实现单行、多行文本溢出显示省略号

    网页开发过程中经常会遇到需要把多行文字溢出显示省略号,这篇文章将总结通过多种方法实现文本末尾省略号显示. 一.单行文本溢出显示省略号(…) 省略号在ie中可以使用text-overflow:ellip ...