Python Library Notes: Selenium (Part 2) - Common Problems and Solutions in Selenium

Common problems and solutions:

1. Selenium shows the warning 'Your connection is not private'

In Chrome, add an argument that tells the browser to ignore certificate errors:

from selenium import webdriver

options = webdriver.ChromeOptions()

options.add_argument('--ignore-certificate-errors')

# Selenium 4 takes the keyword "options"; the older "chrome_options" keyword is deprecated

driver = webdriver.Chrome(options=options)

driver.get('https://cacert.org/')

driver.close()

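In Firefox, tell the browser to accept untrusted certificates. Older Selenium versions did this by setting accept_untrusted_certs to True on a FirefoxProfile; in Selenium 4 the same effect comes from setting accept_insecure_certs to True on the options object:

from selenium import webdriver

options = webdriver.FirefoxOptions()

options.accept_insecure_certs = True

driver = webdriver.Firefox(options=options)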

driver.get('https://cacert.org/')

driver.close()

Original answer: https://stackoverflow.com/questions/24507078/how-to-deal-with-certificates-using-selenium

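2. Mouse hover and selecting from drop-down lists

Drop-down lists come in two forms. The first is built on a select tag, and can be handled with the Select helper imported via from selenium.webdriver.support.ui import Select.

The Selenium code is:

'''

Case 1: use from selenium.webdriver.support.ui import Select

'''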

from selenium import webdriver

from selenium.webdriver.common.by import By

from selenium.webdriver.support.ui import Select

driver = webdriver.Chrome()

driver.get('https://www.17sucai.com/pins/demo-show?id=5926')

# Switch into the iframe that contains the demo

driver.switch_to.frame(driver.find_element(By.ID, 'iframe'))

# Locate the drop-down and wrap it in Select

selectTag = Select(driver.find_element(By.NAME, 'country-wrap'))  # the select tag

# Choose an option

# 1. Select by value

selectTag.select_by_value('CA')

# 2. Select by index

# selectTag.select_by_index(3)

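Some drop-down lists have no select tag, though, and then we have to locate and click the elements ourselves. In the case pictured, the list is built from a tags rather than a select tag, so the Select helper cannot be used.

'''

Case 2: click manually

'''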

from selenium import webdriver

from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

driver.get('https://www.17sucai.com/pins/demo-show?id=5926')

# Switch into the iframe that contains the demo

driver.switch_to.frame(driver.find_element(By.ID, 'iframe'))

# Click the drop-down to open it

driver.find_element(By.XPATH, '//*[@id="dk_container_country-nofake"]').click()

# Click the first option in the opened list

driver.find_element(By.XPATH, '//*[@id="dk_container_country-nofake"]/div/ul/li[1]/a').click()

Source: https://blog.csdn.net/Claire_chen_jia/article/details/106523131
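The heading of this item also mentions mouse hover, which the code above never exercises. Here is a minimal sketch of hovering with ActionChains, assuming Selenium 4. The function name hover_then_click and its locator arguments are hypothetical and do not come from the original article.

```python
def hover_then_click(driver, menu_locator, item_locator, timeout=10):
    """Hover over a menu element, wait for an item to become clickable, then click it."""
    # Imported inside the function so the helper can be defined without a browser session.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Locators are (By, value) tuples, e.g. (By.ID, 'menu').
    menu = driver.find_element(*menu_locator)

    # Move the mouse pointer over the menu so that hover-only content appears.
    ActionChains(driver).move_to_element(menu).perform()

    # Wait until the revealed item is visible and enabled before clicking it.
    item = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(item_locator)
    )
    item.click()
    return item
```

A call might look like hover_then_click(driver, (By.ID, 'menu'), (By.LINK_TEXT, 'Settings')), where both locators are placeholders for whatever the target page uses.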

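3. Garbled Chinese file names in downloads / setting the browser language to Chinese / changing the encoding

If a downloaded file with a Chinese name ends up with a garbled file name, set the browser language through the browser options:

options.add_argument('--lang=zh_CN.UTF-8')

A detailed guide to configuring Chrome with Selenium and Python: https://blog.csdn.net/zwq912318834/article/details/78933910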

4. Running the browser without showing a UI

Chrome can be used without opening its UI window (headless mode). Usage:

option = webdriver.ChromeOptions()

option.add_argument('--headless')

driver = webdriver.Chrome(options=option)

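5. Logging in directly with cookies

First obtain the page's cookies by hand, then serialize them and store them locally.
One tool for this is the Chrome extension EditThisCookie: http://www.editthiscookie.com/
It has an export feature: after logging in, click Export to get a string in list format, which with minor edits can be used as a Python list for importing the cookies.

# Import the cookies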

for item in cookies:

    driver.add_cookie(item)

https://www.jianshu.com/p/773c58406bdb
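The "minor edits" to the exported string can also be done in code. The sketch below assumes the export is a JSON list whose entries carry fields such as name, value, domain, path, secure, httpOnly and expirationDate. That field set is an assumption about the extension's output, and to_selenium_cookies is a hypothetical helper name.

```python
import json

# Keys passed through unchanged to driver.add_cookie(); every other
# exported field (hostOnly, session, storeId, sameSite, id, ...) is dropped.
ALLOWED_KEYS = {"name", "value", "path", "domain", "secure", "httpOnly"}


def to_selenium_cookies(exported_json):
    """Convert an exported JSON cookie list into dicts for driver.add_cookie()."""
    cookies = []
    for item in json.loads(exported_json):
        cookie = {k: v for k, v in item.items() if k in ALLOWED_KEYS}
        # WebDriver expects an integer 'expiry' in seconds since the epoch;
        # the export is assumed to store it as a float under 'expirationDate'.
        if "expirationDate" in item:
            cookie["expiry"] = int(item["expirationDate"])
        cookies.append(cookie)
    return cookies
```

Before calling driver.add_cookie(item) for each converted cookie, the driver has to be on a page of the cookie's domain (for example via driver.get), and a driver.refresh() afterwards lets the site pick up the logged-in session.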

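6. Downloading files to a specified folder with Selenium

Scraping sometimes involves downloading files. When a download is triggered in Chrome, the file goes to the default folder, usually the Downloads folder under My Computer. To save it somewhere else, specify a default download path when starting the driver: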

from selenium import webdriver

from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()

out_path = r'D:\Projects\Spiders'  # the folder you want to download into

prefs = {'profile.default_content_settings.popups': 0, 'download.default_directory': out_path}

options.add_experimental_option('prefs', prefs)

# Selenium 4 takes the driver path through a Service object instead of executable_path

browser = webdriver.Chrome(service=Service(r'D:\Repo 3\chromedriver.exe'), options=options)
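The prefs dictionary can be built by a small helper that also makes sure the target folder exists. This is only a sketch: build_download_prefs is a hypothetical name, and the extra download.prompt_for_download entry is an addition that the original snippet does not use.

```python
import os


def build_download_prefs(out_path):
    """Return a Chrome 'prefs' dict that saves downloads into out_path."""
    # Use an absolute path and create the folder if it is missing.
    out_path = os.path.abspath(out_path)
    os.makedirs(out_path, exist_ok=True)
    return {
        'profile.default_content_settings.popups': 0,
        'download.default_directory': out_path,
        # Assumption: turn off the "Save as" dialog so downloads start directly.
        'download.prompt_for_download': False,
    }
```

The result is passed to Chrome the same way as before: options.add_experimental_option('prefs', build_download_prefs(out_path)).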

7. Checking whether a file download has finished

https://stackoverflow.com/questions/34338897/python-selenium-find-out-when-a-download-has-completed
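One way to approach this is to poll the download folder until no partial files remain. The sketch below is not taken from the linked answer. It assumes Chrome, which gives unfinished downloads a .crdownload suffix, and it assumes the folder is empty before the download starts. The name wait_for_downloads and its parameters are hypothetical.

```python
import os
import time


def wait_for_downloads(directory, timeout=60, poll=0.5):
    """Return True once directory holds files and none is still downloading.

    Returns False if that has not happened within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        # Chrome keeps the .crdownload suffix until the file is complete.
        in_progress = [n for n in names if n.endswith('.crdownload')]
        if names and not in_progress:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

After clicking the download link, a call such as wait_for_downloads(out_path, timeout=120) blocks until the file is complete or the timeout expires. Other browsers use different temporary suffixes (Firefox uses .part), so the check would need adjusting there.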
