1. References

2. Concepts

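What problem does headless mode solve? Automation tools such as Selenium that drive a headed browser for testing suffer in both efficiency and stability, which is why headless browsers appeared. Three years ago the headless browser PhantomJS was already in full swing, and NightmareJS soon became a star as well. Headless browsers bring great convenience for page crawling, automated testing, web automation, and more. Anyone who has used PhantomJS knows, however, that its scripts run inside a closed sandbox that cannot communicate with the outside at all: APIs, variables, and global method calls are all cut off.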

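So Chrome 59 introduced headless mode, and every feature supported by Chrome 59 is available in it:
ES2017
Service Workers (test PWAs as much as you like)
No sandboxed environment
Painless communication and API calls
Unmatched speed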

https://developers.google.com/web/updates/2017/04/headless-chrome Getting Started with Headless Chrome

https://jiayi.space/post/zai-ubuntufu-wu-qi-shang-shi-yong-chrome-headless Using Chrome Headless on an Ubuntu server

The User-Agent differs in headless mode: User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/63.0.3239.84 Safari/537.36

Workaround: chrome_options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36')
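As a small sketch of this workaround, the normal-looking UA can be derived from the headless one by swapping the HeadlessChrome product token for Chrome. The helper names normalize_user_agent and user_agent_argument are hypothetical and not part of Selenium; the resulting string would then be handed to chrome_options.add_argument.

```python
# Hypothetical helpers (not part of Selenium): build the UA override switch.

HEADLESS_UA = (
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/63.0.3239.84 Safari/537.36"
)


def normalize_user_agent(user_agent):
    """Replace the HeadlessChrome product token with plain Chrome."""
    return user_agent.replace("HeadlessChrome/", "Chrome/")


def user_agent_argument(user_agent):
    """Format a UA string as Chrome's --user-agent command-line switch."""
    return "--user-agent=" + normalize_user_agent(user_agent)


if __name__ == "__main__":
    # The printed value is what would be passed to add_argument().
    print(user_agent_argument(HEADLESS_UA))
```

Deriving the string from the browser's real UA, rather than hard-coding it, keeps the version number in step with the installed Chrome.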

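https://0x0d.im/archives/headless-browser-detection.html 无头浏览器异闻录 (Strange Tales of Headless Browsers)

How do we tell these headless browsers apart from normal ones? Analyzing user behavior on the server side is the once-and-for-all approach, but it is costly and difficult. However, some characteristics of headless browsers also let us spot differences on the client side.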
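One of the simplest such differences is the User-Agent token shown earlier. The following is a minimal sketch of a check for it, assuming the User-Agent value is available as a string; looks_headless is a hypothetical name. Because the header is trivially overridden, this is only a first-pass signal, not a reliable detector.

```python
def looks_headless(user_agent):
    """Return True if the UA string carries the HeadlessChrome token."""
    # Treat a missing header (None) as an empty string.
    return "HeadlessChrome/" in (user_agent or "")


if __name__ == "__main__":
    headless = ("Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) HeadlessChrome/63.0.3239.84 Safari/537.36")
    print(looks_headless(headless))
    print(looks_headless(headless.replace("HeadlessChrome/", "Chrome/")))
```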

3. Code

# coding: utf-8
from selenium import webdriver

url = "http://demo.testfire.net"

# Run Chrome without a visible window
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')

driver = webdriver.Chrome(chrome_options=chrome_options, executable_path='/Users/xxxx/driver/chromedriver')
driver.get(url)

# Open the login page, fill in the credentials, and submit the form
driver.find_element_by_xpath('//*[@id="_ctl0__ctl0_LoginLink"]').click()
driver.find_element_by_xpath('//*[@id="uid"]').clear()
driver.find_element_by_xpath('//*[@id="uid"]').send_keys('admin')
driver.find_element_by_xpath('//*[@id="passw"]').send_keys('admin')
driver.find_element_by_xpath('//*[@id="login"]/table/tbody/tr[3]/td[2]/input').click()

print(driver.current_url)
driver.quit()

Finally, this prints the current URL after a successful login: http://demo.testfire.net/bank/main.aspx
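The script uses the Selenium API as it was in 2017. Newer Selenium releases (4.x) locate elements with find_element(By.XPATH, ...) and take the options object via options=, so the same flow might look roughly like the sketch below. It assumes Selenium 4 and a chromedriver that Selenium can locate on its own; headless_args and login_and_get_url are hypothetical names, and the locators are taken unchanged from the original script.

```python
def headless_args():
    """Chrome switches used by the original script."""
    return ["--headless", "--disable-gpu"]


def login_and_get_url(base_url="http://demo.testfire.net"):
    """Log in to the demo site in headless Chrome and return the final URL."""
    # Imported lazily so headless_args() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    for arg in headless_args():
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(base_url)
        driver.find_element(By.XPATH, '//*[@id="_ctl0__ctl0_LoginLink"]').click()
        user_field = driver.find_element(By.XPATH, '//*[@id="uid"]')
        user_field.clear()
        user_field.send_keys("admin")
        driver.find_element(By.XPATH, '//*[@id="passw"]').send_keys("admin")
        driver.find_element(
            By.XPATH, '//*[@id="login"]/table/tbody/tr[3]/td[2]/input'
        ).click()
        return driver.current_url
    finally:
        driver.quit()  # always release the browser process


if __name__ == "__main__":
    print(login_and_get_url())
```

Wrapping the flow in try/finally means the headless Chrome process is closed even if one of the locators fails.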

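Update 2017-12-27: setting only the '--headless' argument produces errors (the GPU process messages in the log below; the session request itself still finishes), whereas setting only '--disable-gpu' appears to add '--headless' automatically. Note, however, that In [58] of the session below passes chrome_options, which already holds both arguments, rather than the newly created chrome_ops, so the log does not actually demonstrate the automatic addition.

Chrome version: 63.0.3239.84 (Official Build) (64-bit)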

In [43]: from selenium import webdriver

In [44]: chrome_options = webdriver.ChromeOptions()

In [45]: chrome_options.add_argument('--headless')

In [46]: driver = webdriver.Chrome(chrome_options=chrome_options)
DEBUG:selenium.webdriver.remote.remote_connection:POST http://127.0.0.1:63614/session {"capabilities": {"alwaysMatch": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless"], "extensions": []}}, "firstMatch": []}, "desiredCapabilities": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless"], "extensions": []}}}
DevTools listening on ws://127.0.0.1:12481/devtools/browser/aa60bb34-e8a5-4740-909e-aa3e6f315376
[1227/173400.592:ERROR:gpu_main.cc(164)] Exiting GPU process due to errors during initialization
[1227/173400.606:ERROR:browser_gpu_channel_host_factory.cc(107)] Failed to launch GPU process.
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request

In [47]: chrome_options.add_argument('--disable-gpu')

In [48]: driver = webdriver.Chrome(chrome_options=chrome_options)
DEBUG:selenium.webdriver.remote.remote_connection:POST http://127.0.0.1:63672/session {"capabilities": {"alwaysMatch": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless", "--disable-gpu"], "extensions": []}}, "firstMatch": []}, "desiredCapabilities": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless", "--disable-gpu"], "extensions": []}}}
DevTools listening on ws://127.0.0.1:12452/devtools/browser/c175a776-71be-40c0-aa73-a64c253f1cb0
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request

In [49]: driver.get('http://httpbin.org')
DEBUG:selenium.webdriver.remote.remote_connection:POST http://127.0.0.1:63672/session/026f8a02c2cd9bf7c7688c6b2934cd66/url {"url": "http://httpbin.org", "sessionId": "026f8a02c2cd9bf7c7688c6b2934cd66"}
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request

In [56]: chrome_ops = webdriver.ChromeOptions()

In [57]: chrome_ops.add_argument('--disable-gpu')

In [58]: dri = webdriver.Chrome(chrome_options=chrome_options)
DEBUG:selenium.webdriver.remote.remote_connection:POST http://127.0.0.1:64092/session {"capabilities": {"alwaysMatch": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless", "--disable-gpu"], "extensions": []}}, "firstMatch": []}, "desiredCapabilities": {"platform": "ANY", "browserName": "chrome", "version": "", "chromeOptions": {"args": ["--headless", "--disable-gpu"], "extensions": []}}}
DevTools listening on ws://127.0.0.1:12713/devtools/browser/b873f662-1ca6-4cd0-a6cd-f9bd13d7e236
DEBUG:selenium.webdriver.remote.remote_connection:Finished Request
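To check which arguments were really sent in each launch, the JSON payload of a POST .../session log line can be parsed directly. The sketch below assumes the payload layout seen in the log above (capabilities, alwaysMatch, chromeOptions, args); sent_chrome_args is a hypothetical helper, and the sample line is a shortened stand-in for a real log line.

```python
import json


def sent_chrome_args(log_line):
    """Extract the Chrome args from a 'POST .../session {json}' debug line."""
    # The JSON payload starts at the first opening brace on the line.
    payload = json.loads(log_line[log_line.index("{"):])
    return payload["capabilities"]["alwaysMatch"]["chromeOptions"]["args"]


# Shortened sample in the same shape as the debug log (firstMatch and
# desiredCapabilities are omitted for brevity).
SAMPLE = (
    'DEBUG:selenium.webdriver.remote.remote_connection:POST '
    'http://127.0.0.1:64092/session {"capabilities": {"alwaysMatch": '
    '{"browserName": "chrome", "chromeOptions": '
    '{"args": ["--headless", "--disable-gpu"], "extensions": []}}}}'
)

if __name__ == "__main__":
    print(sent_chrome_args(SAMPLE))
```

Comparing this list with the options object named in the webdriver.Chrome(...) call makes it easier to see whether a switch was added by the browser or was already present in the object that was passed.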

[Repost] Driving headless Chrome with Selenium's WebDriver
