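Getting to know Selenium

Selenium started out as an automated testing tool. Crawlers use it mainly to work around the fact that requests cannot execute JavaScript code directly.

At its core, Selenium drives a real browser and fully simulates browser operations such as navigating, typing, clicking, and scrolling down, so that you get the page as it looks after rendering. It supports multiple browsers.

Declaring a browser object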

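from selenium import webdriver

# Each line creates a driver for a different browser; keep only the one you need.
browser = webdriver.Chrome()
browser = webdriver.Firefox()
browser = webdriver.Edge()
browser = webdriver.PhantomJS()  # headless browser; Selenium 4 no longer ships a PhantomJS driver
browser = webdriver.Safari()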

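Installation

# Install: selenium + chromedriver
pip3 install selenium
Download chromedriver.exe and place it in the Scripts directory of your Python installation.
Mirror site in China: http://npm.taobao.org/mirrors/chromedriver/2.29/
For the latest version, see the official site: https://sites.google.com/a/chromium.org/chromedriver/downloads

# Install: selenium + phantomjs
pip3 install selenium
Download PhantomJS, unzip it, and add the bin directory that contains phantomjs.exe to the PATH environment variable.
Download link: http://phantomjs.org/download.html

# Note:
Selenium 3 supports Firefox as its default WebDriver, and Firefox requires geckodriver to be installed.
Download link: https://github.com/mozilla/geckodriver/releases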
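
Before launching a browser, it can help to confirm that the driver executables are actually discoverable on PATH. The sketch below uses only the standard library. The function name locate_drivers and the default name list are illustrative assumptions, not part of Selenium.

```python
import shutil

# Executable names mentioned in the installation notes above.
DEFAULT_DRIVERS = ("chromedriver", "geckodriver", "phantomjs")


def locate_drivers(names=DEFAULT_DRIVERS):
    """Return {name: absolute path or None} for each driver executable.

    shutil.which searches the directories listed in the PATH environment
    variable (on Windows it also tries extensions such as .exe).
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_drivers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A value of None means the executable still has to be downloaded or its directory added to PATH.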

Basic usage

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

browser = webdriver.Chrome()
try:
    browser.get('https://www.baidu.com')
    # Type a query into the search box and press Enter
    input = browser.find_element_by_id('kw')
    input.send_keys('Python')
    input.send_keys(Keys.ENTER)
    # Wait up to 10 seconds for the result list to appear
    wait = WebDriverWait(browser, 10)
    wait.until(EC.presence_of_element_located((By.ID, 'content_left')))
    print(browser.current_url)
    print(browser.get_cookies())
    print(browser.page_source)
finally:
    browser.close()
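
The try/finally pattern above can be packaged as a context manager so that every script shuts the browser down the same way. This is a minimal sketch: managed_browser is a hypothetical helper name, and it calls quit() rather than close() on the assumption that the whole session should end.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a driver with factory() and always shut it down afterwards.

    factory is any zero-argument callable that returns a WebDriver,
    for example webdriver.Chrome.
    """
    browser = factory()
    try:
        yield browser
    finally:
        # quit() ends the whole session and closes every window,
        # whereas close() only closes the current window.
        browser.quit()
```

With Selenium installed, usage would look like: with managed_browser(webdriver.Chrome) as browser: browser.get('https://www.baidu.com').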

Visiting a page

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
print(browser.page_source)
browser.close()

Finding elements

Single element

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element_by_id('q')
input_second = browser.find_element_by_css_selector('#q')
input_third = browser.find_element_by_xpath('//*[@id="q"]')
print(input_first, input_second, input_third)
browser.close()

Output:
<selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> <selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> <selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")>

All three lookups return the same element. Besides find_element_by_id, Selenium 3 provides these single-element lookup methods (Selenium 4.3 removed them in favor of find_element):

  • find_element_by_name
  • find_element_by_xpath
  • find_element_by_link_text
  • find_element_by_partial_link_text
  • find_element_by_tag_name
  • find_element_by_class_name
  • find_element_by_css_selector

The generic find_element() method takes the lookup strategy as its first argument:

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element(By.ID, 'q')
print(input_first)
browser.close()

Output:
<selenium.webdriver.remote.webelement.WebElement (session="1f209c0d11551c40d9d20ad964fef244", element="0.07914603542731591-1")>

Multiple elements

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
# find_elements_* returns a list of every matching element
lis = browser.find_elements_by_css_selector('.service-bd li')
print(lis)
browser.close()

Output:
[<selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-1")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-2")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-3")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-4")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-5")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-6")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-7")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-8")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-9")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-10")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-11")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-12")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-13")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-14")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-15")>, <selenium.webdriver.remote.webelement.WebElement 
(session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-16")>]

The same lookup using the generic find_elements() method:

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
lis = browser.find_elements(By.CSS_SELECTOR, '.service-bd li')
print(lis)
browser.close()

Output:
[<selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-1")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-2")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-3")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-4")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-5")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-6")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-7")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-8")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-9")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-10")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-11")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-12")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-13")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-14")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-15")>, <selenium.webdriver.remote.webelement.WebElement 
(session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-16")>]

The multi-element lookup methods mirror the single-element ones:

  • find_elements_by_name
  • find_elements_by_xpath
  • find_elements_by_link_text
  • find_elements_by_partial_link_text
  • find_elements_by_tag_name
  • find_elements_by_class_name
  • find_elements_by_css_selector
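
Each find_element_by_* and find_elements_by_* method listed above corresponds to one lookup strategy of the generic find_element() and find_elements() methods. The sketch below makes that mapping explicit. The helper name find and the STRATEGIES table are illustrative assumptions; the strings are the values of the constants in selenium.webdriver.common.by.By, written out so that the example runs without a browser.

```python
# String values of the By constants (By.ID == "id", By.CSS_SELECTOR == "css selector", ...).
STRATEGIES = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def find(driver, strategy, value, many=False):
    """Dispatch to driver.find_element or driver.find_elements.

    find(driver, "css_selector", "#q") corresponds to the legacy call
    driver.find_element_by_css_selector("#q").
    """
    by = STRATEGIES[strategy]  # raises KeyError for an unknown strategy name
    if many:
        return driver.find_elements(by, value)
    return driver.find_element(by, value)
```

Because it only relies on find_element() and find_elements(), a helper like this works with both Selenium 3 and Selenium 4 drivers.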

Interacting with elements

Call interaction methods on the elements you have located.

from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input = browser.find_element_by_id('q')
input.send_keys('iPhone')
time.sleep(1)
# Clear the box before typing the new query
input.clear()
input.send_keys('iPad')
button = browser.find_element_by_class_name('btn-search')
button.click()

More operations: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.remote.webelement

Action chains

Actions are appended to an action chain and executed in order when perform() is called.

from selenium import webdriver
from selenium.webdriver import ActionChains

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
# The demo lives inside an iframe, so switch into it first
browser.switch_to.frame('iframeResult')
source = browser.find_element_by_css_selector('#draggable')
target = browser.find_element_by_css_selector('#droppable')
actions = ActionChains(browser)
actions.drag_and_drop(source, target)
actions.perform()

More operations: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.action_chains

Executing JavaScript

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
# Scroll to the bottom of the page, then show an alert
browser.execute_script('window.scrollTo(0, document.body.scrollHeight)')
browser.execute_script('alert("To Bottom")')
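
A single scrollTo call reaches the bottom only once, while pages that load content lazily keep growing. The sketch below repeats the scroll until document.body.scrollHeight stops changing. The helper name scroll_to_bottom and the pause and max_rounds parameters are assumptions made for illustration; it relies only on execute_script, which returns the value of a script that starts with return.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing.

    Returns the number of scrolls performed. pause gives lazily loaded
    content time to arrive; max_rounds guards against endless feeds.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        rounds += 1
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is the real bottom
        last_height = new_height
    return rounds
```

A fixed pause is the simplest choice here; on slow connections it may need to be longer, since a height that has not changed yet is treated as the end of the page.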

Getting element information

Getting attributes

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
logo = browser.find_element_by_id('zh-top-link-logo')
print(logo)
# get_attribute() returns the value of the named HTML attribute
print(logo.get_attribute('class'))

Output:
<selenium.webdriver.remote.webelement.WebElement (session="5a9e00352fbb1bb3df8b81a9f666cba9", element="0.7456057855126614-1")>
zu-top-link-logo

Getting text

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.text)

Output (the button label, meaning "Ask a question"):
提问

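Getting the ID, location, tag name, and size

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.id)        # Selenium's internal element reference
print(input.location)  # position of the element on the page
print(input.tag_name)
print(input.size)      # rendered height and width

Output: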
0.6822924344980397-
{'y': , 'x': }
button
{'height': , 'width': }
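
The properties shown in this section can be gathered in one call when many elements need to be inspected. This is a minimal sketch: describe is a hypothetical helper, and the default attribute names are arbitrary examples. Note that element.id is Selenium's internal reference, so the HTML id attribute is read through get_attribute('id') instead.

```python
def describe(element, attributes=("id", "class", "href")):
    """Collect commonly inspected properties of a WebElement into a dict."""
    return {
        "tag_name": element.tag_name,
        "text": element.text,
        "location": element.location,  # dict with 'x' and 'y'
        "size": element.size,          # dict with 'height' and 'width'
        # get_attribute returns None for attributes the element does not have
        "attributes": {name: element.get_attribute(name) for name in attributes},
    }
```

The result is a plain dictionary, so it can be printed, compared, or serialized without keeping the browser session alive.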

Frame

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
# Enter the iframe that holds the demo
browser.switch_to.frame('iframeResult')
source = browser.find_element_by_css_selector('#draggable')
print(source)
try:
    # The logo belongs to the parent page, so it cannot be found from inside the iframe
    logo = browser.find_element_by_class_name('logo')
except NoSuchElementException:
    print('NO LOGO')
# Switch back to the parent frame, where the logo is reachable
browser.switch_to.parent_frame()
logo = browser.find_element_by_class_name('logo')
print(logo)
print(logo.text)

Output:
<selenium.webdriver.remote.webelement.WebElement (session="4bb8ac03ced4ecbdefef03ffdc0e4ccd", element="0.44746093888932004-1")>
NO LOGO
<selenium.webdriver.remote.webelement.WebElement (session="4bb8ac03ced4ecbdefef03ffdc0e4ccd", element="0.13792611320464965-2")>
RUNOOB.COM
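
Forgetting to switch back out of a frame is an easy mistake, because every later lookup then silently runs inside the iframe. One way to guard against it is a context manager that pairs switch_to.frame() with switch_to.parent_frame(). This is a sketch under that assumption, and inside_frame is a hypothetical helper name.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block.

    frame_reference may be a frame name or id, an index, or a WebElement,
    as accepted by driver.switch_to.frame().
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Return to the enclosing document even if the block raised.
        driver.switch_to.parent_frame()
```

With the example above, the lookup of #draggable would sit inside with inside_frame(browser, 'iframeResult'): and the logo lookup would follow the block.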

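Waiting

Implicit wait

When an implicit wait is set and WebDriver does not find an element in the DOM, it keeps waiting and raises a no-such-element exception once the configured time is exceeded. In other words, when the element being looked up is not immediately available, an implicit wait keeps polling the DOM for a while before giving up. The default timeout is 0.

from selenium import webdriver

browser = webdriver.Chrome()
# Every element lookup on this driver now waits up to 10 seconds
browser.implicitly_wait(10)
browser.get('https://www.zhihu.com/explore')
input = browser.find_element_by_class_name('zu-top-add-question')
print(input)

Output:
<selenium.webdriver.remote.webelement.WebElement (session="b29214772d59e912f1ac52e96ed29abe", element="0.12886805191194894-1")>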

Explicit wait

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('https://www.taobao.com/')
# Wait up to 10 seconds for each condition to become true
wait = WebDriverWait(browser, 10)
input = wait.until(EC.presence_of_element_located((By.ID, 'q')))
button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.btn-search')))
print(input, button)

Output:
<selenium.webdriver.remote.webelement.WebElement (session="07dd2fbc2d5b1ce40e82b9754aba8fa8", element="0.5642646294074107-1")> <selenium.webdriver.remote.webelement.WebElement (session="07dd2fbc2d5b1ce40e82b9754aba8fa8", element="0.5642646294074107-2")>

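Commonly used expected conditions:

  • title_is: the page title equals the given text
  • title_contains: the page title contains the given text
  • presence_of_element_located: the element is present in the DOM; pass a locator tuple such as (By.ID, 'p')
  • visibility_of_element_located: the element is visible; pass a locator tuple
  • visibility_of: the element is visible; pass the element object
  • presence_of_all_elements_located: at least one matching element is present; returns the list of matches
  • text_to_be_present_in_element: the element's text contains the given text
  • text_to_be_present_in_element_value: the element's value contains the given text
  • frame_to_be_available_and_switch_to_it: the frame is available, and the driver switches to it
  • invisibility_of_element_located: the element is invisible or absent
  • element_to_be_clickable: the element is clickable
  • staleness_of: the element is no longer attached to the DOM; useful for detecting that the page has refreshed
  • element_to_be_selected: the element is selected; pass the element object
  • element_located_to_be_selected: the element is selected; pass a locator tuple
  • element_selection_state_to_be: pass the element object and a state; returns True if they match, otherwise False
  • element_located_selection_state_to_be: pass a locator tuple and a state; returns True if they match, otherwise False
  • alert_is_present: an alert is present

Details: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.support.expected_conditions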
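
Conceptually, an explicit wait is a polling loop: the condition is evaluated again and again until it returns something truthy or the timeout expires. The sketch below is a simplified stand-alone model of that idea, not Selenium's implementation. The name wait_until is hypothetical, and it raises the built-in TimeoutError where Selenium raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or time runs out.

    The truthy value is returned to the caller, which is why
    wait.until(...) in the example above can hand back the element itself.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

The same idea explains how to write custom conditions for the real API: WebDriverWait.until accepts any callable that takes the driver, for example wait.until(lambda d: len(d.find_elements(By.CSS_SELECTOR, '.service-bd li')) > 0).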

Back and forward

import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com/')
browser.get('https://www.taobao.com/')
browser.get('https://www.python.org/')
# Step back one page in the history, then forward again
browser.back()
time.sleep(1)
browser.forward()
browser.close()

Cookies

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
# Cookies set by the site
print(browser.get_cookies())
# Add a cookie of our own, then list the cookies again
browser.add_cookie({'name': 'name', 'domain': 'www.zhihu.com', 'value': 'germey'})
print(browser.get_cookies())
# Remove every cookie; the final list is empty
browser.delete_all_cookies()
print(browser.get_cookies())

Output:
[{'secure': False, 'value': '"NGM0ZTM5NDAwMWEyNDQwNDk5ODlkZWY3OTkxY2I0NDY=|1491604091|236e34290a6f407bfbb517888849ea509ac366d0"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_cap_id', 'expiry': 1494196091.403418}, {'secure': False, 'value': '"YWEyOGY4MmI1MzQ2NGY5MmFiMjgzZGUzZWJjYTgwYjY=|1491604091|ff946847ddb5881245bdb7a5e6401b70dc61013f"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'cap_id', 'expiry': 1494196091.402855}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_n_c'}, {'secure': False, 'value': '"MjcxMDE3YzU1YjI4NDljZjljNTQ4ZDIyOWJjZTBhNmY=|1491604091|8da4722b56a1545c2020dba97394a220c0eca8d9"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'r_cap_id', 'expiry': 1494196091.402525}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmc'}, {'secure': False, 'value': '"AADCo7e1kguPTqvEOMieRUzwkA7ZUBhV-VY=|1491604091"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'd_c0', 'expiry': 1586212091.344773}, {'secure': False, 'value': '51854390.1491604091.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmz', 'expiry': }, {'secure': False, 'value': '3cc99fc5-8706-43fc-90ac-3ad991bd1a25', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '_zap', 'expiry': }, {'secure': False, 'value': '97cb00128ccb46659728f7c69cc191b0|1491604091000|1491604091000', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'q_c1', 'expiry': 1586212091.401644}, {'secure': False, 'value': '51854390.2.10.1491604091', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmb', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'n_c'}, {'secure': False, 'value': '51854390.000--|3=entry_date=20170408=1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 
'name': '__utmv', 'expiry': }, {'secure': False, 'value': '51854390.669300758.1491604091.1491604091.1491604091.1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utma', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmt', 'expiry': }, {'secure': False, 'value': 'AQAAALCZuwh+dgIAeu3PPHA+csDPnXvT', 'domain': 'www.zhihu.com', 'path': '/', 'httpOnly': True, 'name': 'aliyungf_tc'}]
[{'secure': False, 'value': 'germey', 'domain': '.www.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'name'}, {'secure': False, 'value': '"NGM0ZTM5NDAwMWEyNDQwNDk5ODlkZWY3OTkxY2I0NDY=|1491604091|236e34290a6f407bfbb517888849ea509ac366d0"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_cap_id', 'expiry': 1494196091.403418}, {'secure': False, 'value': '"YWEyOGY4MmI1MzQ2NGY5MmFiMjgzZGUzZWJjYTgwYjY=|1491604091|ff946847ddb5881245bdb7a5e6401b70dc61013f"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'cap_id', 'expiry': 1494196091.402855}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_n_c'}, {'secure': False, 'value': '"MjcxMDE3YzU1YjI4NDljZjljNTQ4ZDIyOWJjZTBhNmY=|1491604091|8da4722b56a1545c2020dba97394a220c0eca8d9"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'r_cap_id', 'expiry': 1494196091.402525}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmc'}, {'secure': False, 'value': '"AADCo7e1kguPTqvEOMieRUzwkA7ZUBhV-VY=|1491604091"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'd_c0', 'expiry': 1586212091.344773}, {'secure': False, 'value': '51854390.1491604091.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmz', 'expiry': }, {'secure': False, 'value': '3cc99fc5-8706-43fc-90ac-3ad991bd1a25', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '_zap', 'expiry': }, {'secure': False, 'value': '97cb00128ccb46659728f7c69cc191b0|1491604091000|1491604091000', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'q_c1', 'expiry': 1586212091.401644}, {'secure': False, 'value': '51854390.2.10.1491604091', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmb', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'n_c'}, {'secure': 
False, 'value': '51854390.000--|3=entry_date=20170408=1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmv', 'expiry': }, {'secure': False, 'value': '51854390.669300758.1491604091.1491604091.1491604091.1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utma', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmt', 'expiry': }, {'secure': False, 'value': 'AQAAALCZuwh+dgIAeu3PPHA+csDPnXvT', 'domain': 'www.zhihu.com', 'path': '/', 'httpOnly': True, 'name': 'aliyungf_tc'}]
[]
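
get_cookies() returns a list of dictionaries, as the output above shows. To reuse a browser session in plain HTTP code, that list is usually flattened first. The two helpers below are a sketch with hypothetical names, operating only on the 'name' and 'value' keys that appear in the output.

```python
def cookies_to_dict(cookies):
    """Flatten the list returned by browser.get_cookies() into {name: value}."""
    return {cookie["name"]: cookie["value"] for cookie in cookies}


def cookie_header(cookies):
    """Build the value of an HTTP Cookie header from the same list."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)
```

The dictionary form can be passed as the cookies argument of requests.get, which lets a crawler log in through the browser once and then continue with lighter requests calls.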

Tab management

import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
# Open a second, empty tab
browser.execute_script('window.open()')
print(browser.window_handles)
# switch_to.window() replaces the deprecated switch_to_window()
browser.switch_to.window(browser.window_handles[1])
browser.get('https://www.taobao.com')
time.sleep(1)
browser.switch_to.window(browser.window_handles[0])
browser.get('https://python.org')

Output:
['CDwindow-4f58e3a7-7167-4587-bedf-9cd8c867f435', 'CDwindow-6e05f076-6d77-453a-a36c-32baacc447df']
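
Indexing window_handles by position works for two tabs, but it assumes the new tab is always at a known index. A slightly more defensive sketch compares the handles before and after window.open() to find the new one. The helper name open_in_new_tab is an assumption made for illustration.

```python
def open_in_new_tab(driver, url):
    """Open url in a new tab, leave the driver focused on it, return its handle."""
    before = set(driver.window_handles)
    driver.execute_script("window.open()")
    # The new tab is whichever handle did not exist before the call
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("window.open() did not create a new tab")
    driver.switch_to.window(new_handles[0])
    driver.get(url)
    return new_handles[0]
```

Keeping the returned handle around makes it possible to switch back to that tab later with driver.switch_to.window(handle).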

Exception handling

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.find_element_by_id('hello')

Looking up an element that does not exist raises an exception:

NoSuchElementException                    Traceback (most recent call last)
<ipython-input--978945848a1b> in <module>()
      browser = webdriver.Chrome()
      browser.get('https://www.baidu.com')
----> browser.find_element_by_id('hello')

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in find_element_by_id(self, id_)
            driver.find_element_by_id('foo')
        """
-->     return self.find_element(by=By.ID, value=id_)

    def find_elements_by_id(self, id_):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in find_element(self, by, value)
        return self.execute(Command.FIND_ELEMENT, {
            'using': by,
-->         'value': value})['value']

    def find_elements(self, by=By.ID, value=None):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
        response = self.command_executor.execute(driver_command, params)
        if response:
-->         self.error_handler.check_response(response)
            response['value'] = self._unwrap_value(
                response.get('value', None))

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
        elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
            raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
-->     raise exception_class(message, screen, stacktrace)

    def _value_or_default(self, obj, key, default):

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"id","selector":"hello"}
  (Session info: chrome=57.0.2987.133)
  (Driver info: chromedriver=2.27. (e97a722caafc2d3a8b807ee115bfb307f7d2cfd9),platform=Mac OS X 10.12. x86_64)

Catching the exceptions keeps the script running:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException, NoSuchElementException

browser = webdriver.Chrome()
try:
    browser.get('https://www.baidu.com')
except TimeoutException:
    print('Time Out')
try:
    browser.find_element_by_id('hello')
except NoSuchElementException:
    print('No Element')
finally:
    browser.close()

Output:
No Element

Detailed documentation: http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions
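
When a missing element is an expected case rather than an error, the try/except can be avoided altogether. find_elements returns an empty list instead of raising NoSuchElementException, so a small helper can turn "not found" into None. This is a sketch, and find_or_none is a hypothetical name.

```python
def find_or_none(driver, by, value):
    """Return the first matching element, or None when nothing matches.

    find_elements returns an empty list when there is no match,
    so no exception handling is needed here.
    """
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None
```

With the example above, find_or_none(browser, By.ID, 'hello') would return None, and the caller can branch on that result with an ordinary if statement.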
