爬取https页面遇到“SSLError: hostname 'xxx' doesn't match either of”的解决方法

　　使用python requests 框架包访问https://itunes.apple.com 页面是遇到 SSLError: hostname 'itunes.apple.com' doesn't match either of 错误，

但本地开发时没报错，而是在线上报错，python版本是2.7.3，后来gg到是因为版本问题，安装好pyOpenSSL 、ndg-httpsclient、pyasn1 即可，发现确实可行。

These errors occur when SSL certificate verification fails to match the certificate the server responds with to the hostname Requests thinks it's contacting. If you're certain the server's SSL setup is correct (for example, because you can visit the site with your browser) and you're using Python 2.6 or 2.7, a possible explanation is that you need Server-Name-Indication.

Server-Name-Indication, or SNI, is an official extension to SSL where the client tells the server what hostname it is contacting. This is important when servers are using Virtual Hosting. When such servers are hosting more than one SSL site they need to be able to return the appropriate certificate based on the hostname the client is connecting to.

Python3 and Python 2.7.9+ include native support for SNI in their SSL modules. For information on using SNI with Requests on Python < 2.7.9 refer to this Stack Overflow answer.

The current version of Requests should be just fine with SNI. Further down the GitHub issue you can see the requirements:

pyOpenSSL

ndg-httpsclient

pyasn1

Try installing those packages and then give it another shot.

参考资料：

1、http://docs.python-requests.org/en/master/community/faq/#what-are-hostname-doesn-t-match-errors

2、 https://stackoverflow.com/questions/18578439/using-requests-with-tls-doesnt-give-sni-support/18579484#18579484

爬取https页面遇到“SSLError: hostname 'xxx' doesn't match either of”的解决方法的更多相关文章

node后台fetch请求数据-Hostname/IP doesn't match certificate's altnames解决方法
一.问题背景基于express框架,node后台fetch请求数据,报错Hostname/IP doesn't match certificate's altnames..... require(' ...
一个爬取https和http通用的工具类(JDK自带的URL的用法)
今天在java爬取天猫的时候因为ssl报错,所以从网上找了一个可以爬取https和http通用的工具类.但是有的时候此工具类爬到的数据不全,此处不得不说python爬虫很厉害. package cn. ...
scrapy模拟浏览器爬取验证码页面
使用selenium模块爬取验证码页面,selenium模块需要另外安装这里不讲环境的配置,我有一篇博客有专门讲ubuntn下安装和配置模拟浏览器的开发 spider的代码 # -*- coding: ...
scrapy(四): 爬取二级页面的内容
scrapy爬取二级页面的内容 1.定义数据结构item.py文件 # -*- coding: utf-8 -*- ''' field: item.py ''' # Define here the m ...
Scrapy爬取静态页面
Scrapy爬取静态页面安装Scrapy框架: Scrapy是python下一个非常有用的一个爬虫框架 Pycharm下: 搜索Scrapy库添加进项目即可终端下: #python2 sudo p ...
爬取百度页面代码写入到文件+web请求过程解析
一.爬取百度页面代码写入到文件代码示例: from urllib.request import urlopen #导入urlopen包 url="http://www.baidu.com& ...
史上最全的CSS hack方式一览 jQuery 图片轮播的代码分离 JQuery中的动画 C#中Trim()、TrimStart()、TrimEnd()的用法 marquee 标签的使用详情 js鼠标事件 js添加遮罩层页面上通过地址栏传值时出现乱码的两种解决方法 ref和out的区别在c#中总结
史上最全的CSS hack方式一览 2013年09月28日 15:57:08 阅读数:175473 做前端多年,虽然不是经常需要hack,但是我们经常会遇到各浏览器表现不一致的情况.基于此,某些情况我 ...
node调用phantomjs-node爬取复杂页面
什么是phantomjs phantomjs官网是这么说的,'整站测试,屏幕捕获,自动翻页,网络监控',目前比较流行用来爬取复杂的,难以通过api或正则匹配的页面,比如页面是通过异步加载.phanto ...
scrapy爬取相似页面及回调爬取问题（以慕课网为例）
以爬取慕课网数据为例慕课网的数据很简单,就是通过get方式获取的连接地址为https://www.imooc.com/course/list?page=2 根据page参数来分页

随机推荐

java工具类
1.HttpUtilsHttp网络工具类,主要包括httpGet.httpPost以及http参数相关方法,以httpGet为例:static HttpResponse httpGet(HttpReq ...
git免密操作
windows下找到用户目录,新建 _netrc 文件 machine git.notech.cc login user password xxxxxx Linux下同样可行,需要在~目录下新建 .n ...
c# (ENUM)枚举组合类型的谷歌序列化Protobuf
c# (ENUM)枚举组合类型的谷歌序列化Protobuf,必须在序列化/反序列化时加上下面: RuntimeTypeModel.Default[typeof(Alarm)].EnumPassthru ...
NHibernate之映射文件配置说明
NHibernate之映射文件配置说明 1. hibernate-mapping 这个元素包括以下可选的属性.schema属性,指明了这个映射所引用的表所在的schema名称.假若指定了这个属性, 表 ...
IT培训行业揭秘（一）
最近一个多月来,身边有很多朋友问我,我家孩子明年就要大学毕业了,现在工作还没有着落,最近孩子回家经常和我说,他们学校最近来了很多IT培训班,让同学们参加培训,然后各个培训班动辄拿出往届他们的培训学生赚 ...
【数据结构】简单谈一谈二分法和二叉排序树BST查找的比较
二分法查找: 『在有序数组的基础上通过折半方法不断缩小查找范围,直至命中或者查询失败.』二分法的存储要求:要求顺序存储,以便于根据下标随机访问二分法的时间效率:O(Log(n)) 二分 ...
jquery 监听常用监听方法
最近在做网站开发,需要用到不少js的知识.之前学过现在重新来看,发现还真忘了不少~~ 在使用基于bootstrap,或者基于 jquery 的插件时,如过没有出现预期效果请最先检查下是否优先载入的 ...
国内git项目托管平台
以前一直使用github托管项目,最近换了阿里云的vps,连接github出奇的慢,找了一下国内的代码托管平台. 有几个都不错,我刚好有csdn的账号,就试了一下csdn的托管平台,创建一个项目,发现 ...
C#-WebForm-Repeater-重复器
Repeater-重复器 - 类似WinForm中的ListView,用列表来展示数据格式: <body> <form id="form1" runat=&qu ...
php curl获取的数据不直接输出
curl获取页面内容,不直接输出到页面必需设置curl的CURLOPT_RETURNTRANSFER选项为1或true curl_setopt($ch, CURLOPT_RETURNTRANSFER ...

爬取https页面遇到“SSLError: hostname 'xxx' doesn't match either of”的解决方法

爬取https页面遇到“SSLError: hostname 'xxx' doesn't match either of”的解决方法的更多相关文章

随机推荐

热门专题