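Getting to know Selenium

Selenium started out as an automated testing tool. Crawlers use it mainly to work around the fact that requests cannot execute JavaScript.

Selenium drives a real browser and fully simulates browser operations such as navigating, typing, clicking, and scrolling, so you get the page as it looks after rendering. It supports multiple browsers.

Declaring a browser object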

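from selenium import webdriver

# Pick one. Each call launches its own browser and needs the matching driver installed.
browser = webdriver.Chrome()
browser = webdriver.Firefox()
browser = webdriver.Edge()
browser = webdriver.PhantomJS()  # PhantomJS support was deprecated in later Selenium 3 releases and dropped in Selenium 4
browser = webdriver.Safari()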

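Installation

# Install: selenium + chromedriver
pip3 install selenium
Download chromedriver.exe and put it in the Scripts directory of your Python installation.
Mirror site in China: http://npm.taobao.org/mirrors/chromedriver/2.29/
For the latest version, see the official site: https://sites.google.com/a/chromium.org/chromedriver/downloads

# Install: selenium + phantomjs
pip3 install selenium
Download phantomjs, unpack it, and add the bin directory that contains phantomjs.exe to the PATH environment variable.
Download link: http://phantomjs.org/download.html

# Note:
The WebDriver that selenium 3 supports by default is Firefox, and Firefox requires geckodriver.
Download link: https://github.com/mozilla/geckodriver/releases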
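Whether the driver executables ended up somewhere the system can find them can be checked from Python before a browser is launched. The following is a minimal sketch, assuming the drivers are expected on PATH; locate_driver is a hypothetical helper name, not part of selenium.

```python
import shutil


def locate_driver(name="chromedriver"):
    """Return the full path of a driver executable found on PATH, or None."""
    # shutil.which follows the same lookup rules as the shell, including
    # PATHEXT on Windows, so "chromedriver" also matches chromedriver.exe.
    return shutil.which(name)


# Report which of the drivers mentioned above are reachable.
for driver_name in ("chromedriver", "geckodriver", "phantomjs"):
    path = locate_driver(driver_name)
    print(driver_name, "->", path if path else "not found on PATH")
```

If a driver is reported as not found, the usual cause is that its directory is missing from PATH.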

Basic usage

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

browser = webdriver.Chrome()
try:
    browser.get('https://www.baidu.com')
    # Type a query into the search box and submit it with Enter.
    input = browser.find_element_by_id('kw')
    input.send_keys('Python')
    input.send_keys(Keys.ENTER)
    # Wait up to 10 seconds for the results container to appear.
    wait = WebDriverWait(browser, 10)
    wait.until(EC.presence_of_element_located((By.ID, 'content_left')))
    print(browser.current_url)
    print(browser.get_cookies())
    print(browser.page_source)
finally:
    browser.close()

Visiting a page

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
print(browser.page_source)
browser.close()

Finding elements

Single element

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element_by_id('q')
input_second = browser.find_element_by_css_selector('#q')
input_third = browser.find_element_by_xpath('//*[@id="q"]')
print(input_first, input_second, input_third)
browser.close()

Output:

<selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> <selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")> <selenium.webdriver.remote.webelement.WebElement (session="5e53d9e1c8646e44c14c1c2880d424af", element="0.5649563096161541-1")>

All three lookups return the same element. Other single-element lookup methods:

  • find_element_by_name
  • find_element_by_xpath
  • find_element_by_link_text
  • find_element_by_partial_link_text
  • find_element_by_tag_name
  • find_element_by_class_name
  • find_element_by_css_selector
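
The find_element_by_* helpers are shortcuts for the generic find_element(by, value) call shown next. They were deprecated in Selenium 4 and removed in later 4.x releases, so the generic form is the safer choice in new code.

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input_first = browser.find_element(By.ID, 'q')
print(input_first)
browser.close()

Output:

<selenium.webdriver.remote.webelement.WebElement (session="1f209c0d11551c40d9d20ad964fef244", element="0.07914603542731591-1")>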

Multiple elements

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
lis = browser.find_elements_by_css_selector('.service-bd li')
print(lis)
browser.close()
[<selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-1")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-2")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-3")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-4")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-5")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-6")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-7")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-8")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-9")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-10")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-11")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-12")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-13")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-14")>, <selenium.webdriver.remote.webelement.WebElement (session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-15")>, <selenium.webdriver.remote.webelement.WebElement 
(session="c26290835d4457ebf7d96bfab3740d19", element="0.09221044033125603-16")>]

The output above is a list of 16 WebElement objects, one for each matched li.

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
lis = browser.find_elements(By.CSS_SELECTOR, '.service-bd li')
print(lis)
browser.close()
[<selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-1")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-2")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-3")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-4")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-5")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-6")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-7")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-8")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-9")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-10")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-11")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-12")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-13")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-14")>, <selenium.webdriver.remote.webelement.WebElement (session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-15")>, <selenium.webdriver.remote.webelement.WebElement 
(session="bca1503cd36be550e8dba984b55c5d0e", element="0.7914623408963901-16")>]

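The output above is again a list of 16 WebElement objects, this time found through the generic find_elements call.

Other multi-element lookup methods: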

  • find_elements_by_name
  • find_elements_by_xpath
  • find_elements_by_link_text
  • find_elements_by_partial_link_text
  • find_elements_by_tag_name
  • find_elements_by_class_name
  • find_elements_by_css_selector
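
Each find_element_by_* and find_elements_by_* helper corresponds to one lookup strategy of the generic find_element / find_elements calls. The sketch below spells out that correspondence in plain Python so that legacy-style calls can be routed through the generic API. LEGACY_SUFFIX_TO_BY, translate_legacy_call, and find are hypothetical names introduced for illustration; the strategy strings are assumed to match the values of the By constants (for example, By.CSS_SELECTOR being 'css selector').

```python
# Suffix of the legacy helper -> strategy string expected by find_element(s).
# Assumption: these strings equal the constants on selenium's By class.
LEGACY_SUFFIX_TO_BY = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def translate_legacy_call(method_name, value):
    """Turn ('find_elements_by_css_selector', 'li') into
    ('find_elements', 'css selector', 'li')."""
    for prefix in ("find_elements_by_", "find_element_by_"):
        if method_name.startswith(prefix):
            suffix = method_name[len(prefix):]
            generic = prefix[:-len("_by_")]  # strip the trailing "_by_"
            return generic, LEGACY_SUFFIX_TO_BY[suffix], value
    raise ValueError("not a find_element(s)_by_* method: " + method_name)


def find(driver, method_name, value):
    """Run a legacy-style lookup through the generic find_element(s) API."""
    generic, strategy, value = translate_legacy_call(method_name, value)
    return getattr(driver, generic)(strategy, value)
```

With a live browser, find(browser, 'find_elements_by_css_selector', '.service-bd li') would then behave like browser.find_elements(By.CSS_SELECTOR, '.service-bd li').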

Element interaction

Call interaction methods on the elements you have located.

from selenium import webdriver
import time

browser = webdriver.Chrome()
browser.get('https://www.taobao.com')
input = browser.find_element_by_id('q')
input.send_keys('iPhone')   # type into the search box
time.sleep(1)
input.clear()               # empty the box again
input.send_keys('iPad')
button = browser.find_element_by_class_name('btn-search')
button.click()              # submit the search

More operations: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.remote.webelement
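
The clear-then-type sequence in the example tends to recur wherever a form field is filled in, so it can be wrapped once. This is a minimal sketch; replace_text is a hypothetical helper name, and it only assumes that the element offers the clear() and send_keys() methods used in the example.

```python
def replace_text(element, text):
    """Empty an input element, type new text into it, and return the element."""
    element.clear()          # remove whatever the field currently holds
    element.send_keys(text)  # then type the replacement
    return element


# With a live browser this would be used as follows (not executed here):
#   box = browser.find_element_by_id('q')
#   replace_text(box, 'iPad')
```

Returning the element keeps the helper chainable with further calls on the same field.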

Action chains

Actions are appended to an action chain and then executed one after another.

from selenium import webdriver
from selenium.webdriver import ActionChains

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
browser.switch_to.frame('iframeResult')   # the demo lives inside an iframe
source = browser.find_element_by_css_selector('#draggable')
target = browser.find_element_by_css_selector('#droppable')
actions = ActionChains(browser)
actions.drag_and_drop(source, target)     # queue the action
actions.perform()                         # run everything queued so far

More operations: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.action_chains

Executing JavaScript

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
browser.execute_script('window.scrollTo(0, document.body.scrollHeight)')
browser.execute_script('alert("To Bottom")')
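
execute_script also hands back whatever the script returns, which makes it possible to keep scrolling until a lazily loaded page stops growing. The following is a minimal sketch under that assumption; scroll_to_bottom, pause, and max_rounds are hypothetical names, and the function only relies on the driver having the execute_script method shown above.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing.

    Returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_no in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_no  # nothing new was loaded, so this is the bottom
        last_height = new_height
    return max_rounds
```

The max_rounds cap is there so that a page with endless content cannot keep the loop running forever.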

Getting element information

Getting attributes

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
logo = browser.find_element_by_id('zh-top-link-logo')
print(logo)
print(logo.get_attribute('class'))

Output:

<selenium.webdriver.remote.webelement.WebElement (session="5a9e00352fbb1bb3df8b81a9f666cba9", element="0.7456057855126614-1")>
zu-top-link-logo

Getting text

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.text)

Output (the button label, which means "Ask a question"):

提问

Getting ID, location, tag name, and size

from selenium import webdriver

browser = webdriver.Chrome()
url = 'https://www.zhihu.com/explore'
browser.get(url)
input = browser.find_element_by_class_name('zu-top-add-question')
print(input.id)
print(input.location)
print(input.tag_name)
print(input.size)

Output:

0.6822924344980397-
{'y': , 'x': }
button
{'height': , 'width': }
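
location and size are plain dicts keyed 'x'/'y' and 'width'/'height', so they combine with ordinary arithmetic. A minimal sketch follows; element_center is a hypothetical helper, and the numbers in the example call are made up for illustration because they are not taken from the page above.

```python
def element_center(location, size):
    """Return the (x, y) centre point of an element's bounding box."""
    return (location["x"] + size["width"] / 2,
            location["y"] + size["height"] / 2)


# Illustrative values only; with a live element you would pass
# input.location and input.size instead.
print(element_center({"x": 100, "y": 40}, {"width": 60, "height": 20}))
# prints (130.0, 50.0)
```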

Frame

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

browser = webdriver.Chrome()
url = 'http://www.runoob.com/try/try.php?filename=jqueryui-api-droppable'
browser.get(url)
browser.switch_to.frame('iframeResult')
source = browser.find_element_by_css_selector('#draggable')
print(source)
try:
    # The logo belongs to the parent page, so it cannot be found from inside the frame.
    logo = browser.find_element_by_class_name('logo')
except NoSuchElementException:
    print('NO LOGO')
browser.switch_to.parent_frame()
logo = browser.find_element_by_class_name('logo')
print(logo)
print(logo.text)

Output:

<selenium.webdriver.remote.webelement.WebElement (session="4bb8ac03ced4ecbdefef03ffdc0e4ccd", element="0.44746093888932004-1")>
NO LOGO
<selenium.webdriver.remote.webelement.WebElement (session="4bb8ac03ced4ecbdefef03ffdc0e4ccd", element="0.13792611320464965-2")>
RUNOOB.COM
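
Because lookups only see the document the driver is currently switched to, forgetting to switch back is an easy mistake. One way to guard against it is a context manager that always returns to the parent frame. This is a sketch only; in_frame is a hypothetical helper built on the switch_to.frame and switch_to.parent_frame calls used in the example.

```python
from contextlib import contextmanager


@contextmanager
def in_frame(driver, frame_reference):
    """Switch into a frame for the duration of a with-block."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so later lookups see the parent document.
        driver.switch_to.parent_frame()


# With a live browser (not executed here):
#   with in_frame(browser, 'iframeResult'):
#       source = browser.find_element_by_css_selector('#draggable')
#   logo = browser.find_element_by_class_name('logo')
```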

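Waits

Implicit wait

With an implicit wait in effect, if WebDriver does not find an element in the DOM right away, it keeps looking until the configured time has passed and only then raises the element-not-found exception. In other words, when an element is not immediately present, the lookup waits for a while instead of failing at once. The default wait time is 0.

from selenium import webdriver

browser = webdriver.Chrome()
browser.implicitly_wait(10)   # applies to every later lookup on this browser
browser.get('https://www.zhihu.com/explore')
input = browser.find_element_by_class_name('zu-top-add-question')
print(input)

Output:

<selenium.webdriver.remote.webelement.WebElement (session="b29214772d59e912f1ac52e96ed29abe", element="0.12886805191194894-1")>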
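
The behaviour described for the implicit wait can be modelled in a few lines of plain Python. This is only an illustration of the retry-until-timeout idea: the real waiting is done by the driver, not by Python code like this, and NotFound, find_with_implicit_wait, and poll_interval are hypothetical names.

```python
import time


class NotFound(Exception):
    """Stand-in for selenium's NoSuchElementException in this sketch."""


def find_with_implicit_wait(lookup, timeout=0.0, poll_interval=0.05):
    """Call lookup() until it returns something other than None,
    or raise NotFound once `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        result = lookup()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise NotFound("element did not appear within %.2f s" % timeout)
        time.sleep(poll_interval)
```

With the default timeout of 0 the lookup is tried exactly once, which mirrors the default wait time of 0.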

Explicit wait

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome()
browser.get('https://www.taobao.com/')
# Each until() call blocks until its condition holds or 10 seconds have passed.
wait = WebDriverWait(browser, 10)
input = wait.until(EC.presence_of_element_located((By.ID, 'q')))
button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, '.btn-search')))
print(input, button)

Output:

<selenium.webdriver.remote.webelement.WebElement (session="07dd2fbc2d5b1ce40e82b9754aba8fa8", element="0.5642646294074107-1")> <selenium.webdriver.remote.webelement.WebElement (session="07dd2fbc2d5b1ce40e82b9754aba8fa8", element="0.5642646294074107-2")>

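Available wait conditions:

  • title_is: the title equals the given text
  • title_contains: the title contains the given text
  • presence_of_element_located: the element is present in the DOM; pass a locator tuple such as (By.ID, 'p')
  • visibility_of_element_located: the element is visible; pass a locator tuple
  • visibility_of: the element is visible; pass an element object
  • presence_of_all_elements_located: at least one matching element is present; returns all matches
  • text_to_be_present_in_element: the element's text contains the given text
  • text_to_be_present_in_element_value: the element's value contains the given text
  • frame_to_be_available_and_switch_to_it: the frame is available, and the driver switches to it
  • invisibility_of_element_located: the element is not visible
  • element_to_be_clickable: the element is clickable
  • staleness_of: the element is no longer attached to the DOM; useful for detecting that the page has refreshed
  • element_to_be_selected: the element is selected; pass an element object
  • element_located_to_be_selected: the element is selected; pass a locator tuple
  • element_selection_state_to_be: pass an element object and a state; returns True if they match, otherwise False
  • element_located_selection_state_to_be: pass a locator tuple and a state; returns True if they match, otherwise False
  • alert_is_present: an alert is present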

Details: http://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.support.expected_conditions
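
WebDriverWait.until accepts any callable that takes the driver and returns a truthy value on success, so a condition that is not in the list can be written by hand. The sketch below waits for a minimum number of matches; at_least_n_elements is a hypothetical name, and the locator is the same (strategy, value) tuple used by the built-in conditions.

```python
def at_least_n_elements(locator, n):
    """Custom wait condition: truthy once n or more elements match locator."""
    def condition(driver):
        elements = driver.find_elements(*locator)
        # Return the matches on success so until() hands them back to the caller.
        return elements if len(elements) >= n else False
    return condition


# With a live browser (not executed here):
#   wait = WebDriverWait(browser, 10)
#   lis = wait.until(at_least_n_elements((By.CSS_SELECTOR, '.service-bd li'), 16))
```

Returning the element list rather than True follows the same pattern as the built-in conditions that hand back the located element.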

Back and forward

import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com/')
browser.get('https://www.taobao.com/')
browser.get('https://www.python.org/')
browser.back()      # back to taobao.com
time.sleep(1)
browser.forward()   # forward to python.org again
browser.close()

Cookies

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.zhihu.com/explore')
print(browser.get_cookies())
browser.add_cookie({'name': 'name', 'domain': 'www.zhihu.com', 'value': 'germey'})
print(browser.get_cookies())
browser.delete_all_cookies()
print(browser.get_cookies())
[{'secure': False, 'value': '"NGM0ZTM5NDAwMWEyNDQwNDk5ODlkZWY3OTkxY2I0NDY=|1491604091|236e34290a6f407bfbb517888849ea509ac366d0"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_cap_id', 'expiry': 1494196091.403418}, {'secure': False, 'value': '"YWEyOGY4MmI1MzQ2NGY5MmFiMjgzZGUzZWJjYTgwYjY=|1491604091|ff946847ddb5881245bdb7a5e6401b70dc61013f"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'cap_id', 'expiry': 1494196091.402855}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_n_c'}, {'secure': False, 'value': '"MjcxMDE3YzU1YjI4NDljZjljNTQ4ZDIyOWJjZTBhNmY=|1491604091|8da4722b56a1545c2020dba97394a220c0eca8d9"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'r_cap_id', 'expiry': 1494196091.402525}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmc'}, {'secure': False, 'value': '"AADCo7e1kguPTqvEOMieRUzwkA7ZUBhV-VY=|1491604091"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'd_c0', 'expiry': 1586212091.344773}, {'secure': False, 'value': '51854390.1491604091.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmz', 'expiry': }, {'secure': False, 'value': '3cc99fc5-8706-43fc-90ac-3ad991bd1a25', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '_zap', 'expiry': }, {'secure': False, 'value': '97cb00128ccb46659728f7c69cc191b0|1491604091000|1491604091000', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'q_c1', 'expiry': 1586212091.401644}, {'secure': False, 'value': '51854390.2.10.1491604091', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmb', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'n_c'}, {'secure': False, 'value': '51854390.000--|3=entry_date=20170408=1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 
'name': '__utmv', 'expiry': }, {'secure': False, 'value': '51854390.669300758.1491604091.1491604091.1491604091.1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utma', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmt', 'expiry': }, {'secure': False, 'value': 'AQAAALCZuwh+dgIAeu3PPHA+csDPnXvT', 'domain': 'www.zhihu.com', 'path': '/', 'httpOnly': True, 'name': 'aliyungf_tc'}]
[{'secure': False, 'value': 'germey', 'domain': '.www.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'name'}, {'secure': False, 'value': '"NGM0ZTM5NDAwMWEyNDQwNDk5ODlkZWY3OTkxY2I0NDY=|1491604091|236e34290a6f407bfbb517888849ea509ac366d0"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_cap_id', 'expiry': 1494196091.403418}, {'secure': False, 'value': '"YWEyOGY4MmI1MzQ2NGY5MmFiMjgzZGUzZWJjYTgwYjY=|1491604091|ff946847ddb5881245bdb7a5e6401b70dc61013f"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'cap_id', 'expiry': 1494196091.402855}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'l_n_c'}, {'secure': False, 'value': '"MjcxMDE3YzU1YjI4NDljZjljNTQ4ZDIyOWJjZTBhNmY=|1491604091|8da4722b56a1545c2020dba97394a220c0eca8d9"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'r_cap_id', 'expiry': 1494196091.402525}, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmc'}, {'secure': False, 'value': '"AADCo7e1kguPTqvEOMieRUzwkA7ZUBhV-VY=|1491604091"', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'd_c0', 'expiry': 1586212091.344773}, {'secure': False, 'value': '51854390.1491604091.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmz', 'expiry': }, {'secure': False, 'value': '3cc99fc5-8706-43fc-90ac-3ad991bd1a25', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '_zap', 'expiry': }, {'secure': False, 'value': '97cb00128ccb46659728f7c69cc191b0|1491604091000|1491604091000', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'q_c1', 'expiry': 1586212091.401644}, {'secure': False, 'value': '51854390.2.10.1491604091', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmb', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': 'n_c'}, {'secure': 
False, 'value': '51854390.000--|3=entry_date=20170408=1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmv', 'expiry': }, {'secure': False, 'value': '51854390.669300758.1491604091.1491604091.1491604091.1', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utma', 'expiry': }, {'secure': False, 'value': '', 'domain': '.zhihu.com', 'path': '/', 'httpOnly': False, 'name': '__utmt', 'expiry': }, {'secure': False, 'value': 'AQAAALCZuwh+dgIAeu3PPHA+csDPnXvT', 'domain': 'www.zhihu.com', 'path': '/', 'httpOnly': True, 'name': 'aliyungf_tc'}]
[]

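The output above shows the three print calls in order: the cookies set by the site, the same list with the added 'name' cookie at the front, and an empty list after delete_all_cookies().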
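
get_cookies() returns a list of dicts, each with at least 'name' and 'value' keys as seen in the output. For handing a session over to an HTTP client, a flat name-to-value mapping is often more convenient. This is a minimal sketch; cookies_to_dict is a hypothetical helper, and passing its result to requests through the cookies parameter is an assumption about how it would be used.

```python
def cookies_to_dict(cookies):
    """Collapse selenium-style cookie dicts into a plain {name: value} mapping."""
    return {cookie["name"]: cookie["value"] for cookie in cookies}


# Two entries shaped like the ones printed by get_cookies().
sample = [
    {"secure": False, "value": "germey", "domain": ".www.zhihu.com",
     "path": "/", "httpOnly": False, "name": "name"},
    {"secure": False, "value": "", "domain": ".zhihu.com",
     "path": "/", "httpOnly": False, "name": "n_c"},
]
print(cookies_to_dict(sample))
# prints {'name': 'germey', 'n_c': ''}
```

Domain, path, and expiry are dropped in this form, so it only suits requests that go to the same site.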

Tab management

import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.execute_script('window.open()')   # open a second tab
print(browser.window_handles)
# switch_to.window is the current spelling of the deprecated switch_to_window.
browser.switch_to.window(browser.window_handles[1])
browser.get('https://www.taobao.com')
time.sleep(1)
browser.switch_to.window(browser.window_handles[0])
browser.get('https://python.org')

Output:

['CDwindow-4f58e3a7-7167-4587-bedf-9cd8c867f435', 'CDwindow-6e05f076-6d77-453a-a36c-32baacc447df']
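
Indexing window_handles by position works in the example, but it depends on the order in which the handles are listed. Comparing the handles before and after opening a tab avoids that dependence. This is a sketch only; new_window_handles is a hypothetical helper that works on the plain lists of handle strings.

```python
def new_window_handles(before, after):
    """Return the handles present in `after` but not in `before`, in order."""
    known = set(before)
    return [handle for handle in after if handle not in known]


# With a live browser (not executed here):
#   before = browser.window_handles
#   browser.execute_script('window.open()')
#   new_tab = new_window_handles(before, browser.window_handles)[0]
#   browser.switch_to.window(new_tab)
```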

Exception handling

from selenium import webdriver

browser = webdriver.Chrome()
browser.get('https://www.baidu.com')
browser.find_element_by_id('hello')

Output (the lookup raises an exception):

NoSuchElementException                    Traceback (most recent call last)
<ipython-input--978945848a1b> in <module>()
      browser = webdriver.Chrome()
      browser.get('https://www.baidu.com')
----> browser.find_element_by_id('hello')

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in find_element_by_id(self, id_)
          driver.find_element_by_id('foo')
      """
-->   return self.find_element(by=By.ID, value=id_)

      def find_elements_by_id(self, id_):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in find_element(self, by, value)
      return self.execute(Command.FIND_ELEMENT, {
          'using': by,
-->       'value': value})['value']

      def find_elements(self, by=By.ID, value=None):

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
      response = self.command_executor.execute(driver_command, params)
      if response:
-->       self.error_handler.check_response(response)
          response['value'] = self._unwrap_value(
              response.get('value', None))

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3./site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
      elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
          raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
-->   raise exception_class(message, screen, stacktrace)

      def _value_or_default(self, obj, key, default):

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"id","selector":"hello"}
(Session info: chrome=57.0.2987.133)
(Driver info: chromedriver=2.27. (e97a722caafc2d3a8b807ee115bfb307f7d2cfd9),platform=Mac OS X 10.12. x86_64)

Catching the exceptions lets the script handle the failure and still close the browser:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException, NoSuchElementException

browser = webdriver.Chrome()
try:
    browser.get('https://www.baidu.com')
except TimeoutException:
    print('Time Out')
try:
    browser.find_element_by_id('hello')
except NoSuchElementException:
    print('No Element')
finally:
    browser.close()

Output:

No Element

Full documentation: http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions
