requests库核心API源码分析
requests库是python爬虫使用频率最高的库,在网络请求中发挥着重要的作用,这边文章浅析requests的API源码。
该库文件结构如图:
提供的核心接口在__init__文件中,如下:
from . import utils
from . import packages
from .models import Request, Response, PreparedRequest
from .api import request, get, head, post, patch, put, delete, options
from .sessions import session, Session
from .status_codes import codes
from .exceptions import ( RequestException, Timeout, URLRequired, TooManyRedirects, HTTPError, ConnectionError, FileModeWarning, ConnectTimeout, ReadTimeout )
requests常用方法在api.py文件中,源码如下:
# -*- coding: utf-8 -*- """ requests.api ~~~~~~~~~~~~ This module implements the Requests API. :copyright: (c) 2012 by Kenneth Reitz. :license: Apache2, see LICENSE for more details. """ from . import sessions def request(method, url, **kwargs): """Constructs and sends a :class:`Request <Request>`. :param method: method for the new :class:`Request` object. :param url: URL for the new :class:`Request` object. :param params: (optional) Dictionary, list of tuples or bytes to send in the body of the :class:`Request`. :param data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:`Request`. :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`. :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`. :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`. :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload. ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')`` or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers to add for the file. :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth. :param timeout: (optional) How many seconds to wait for the server to send data before giving up, as a float, or a :ref:`(connect timeout, read timeout) <timeouts>` tuple. :type timeout: float or tuple :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``. :type allow_redirects: bool :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy. :param verify: (optional) Either a boolean, in which case it controls whether we verify the server's TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to ``True``. :param stream: (optional) if ``False``, the response content will be immediately downloaded. :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair. :return: :class:`Response <Response>` object :rtype: requests.Response Usage:: >>> import requests >>> req = requests.request('GET', 'https://httpbin.org/get') <Response [200]> """ # By using the 'with' statement we are sure the session is closed, thus we # avoid leaving sockets open which can trigger a ResourceWarning in some # cases, and look like a memory leak in others. with sessions.Session() as session: return session.request(method=method, url=url, **kwargs) def get(url, params=None, **kwargs): r"""Sends a GET request. :param url: URL for the new :class:`Request` object. :param params: (optional) Dictionary, list of tuples or bytes to send in the body of the :class:`Request`. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ kwargs.setdefault('allow_redirects', True) return request('get', url, params=params, **kwargs) def options(url, **kwargs): r"""Sends an OPTIONS request. :param url: URL for the new :class:`Request` object. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ kwargs.setdefault('allow_redirects', True) return request('options', url, **kwargs) def head(url, **kwargs): r"""Sends a HEAD request. :param url: URL for the new :class:`Request` object. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ kwargs.setdefault('allow_redirects', False) return request('head', url, **kwargs) def post(url, data=None, json=None, **kwargs): r"""Sends a POST request. :param url: URL for the new :class:`Request` object. :param data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:`Request`. :param json: (optional) json data to send in the body of the :class:`Request`. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ return request('post', url, data=data, json=json, **kwargs) def put(url, data=None, **kwargs): r"""Sends a PUT request. :param url: URL for the new :class:`Request` object. :param data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:`Request`. :param json: (optional) json data to send in the body of the :class:`Request`. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ return request('put', url, data=data, **kwargs) def patch(url, data=None, **kwargs): r"""Sends a PATCH request. :param url: URL for the new :class:`Request` object. :param data: (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the :class:`Request`. :param json: (optional) json data to send in the body of the :class:`Request`. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ return request('patch', url, data=data, **kwargs) def delete(url, **kwargs): r"""Sends a DELETE request. :param url: URL for the new :class:`Request` object. :param \*\*kwargs: Optional arguments that ``request`` takes. :return: :class:`Response <Response>` object :rtype: requests.Response """ return request('delete', url, **kwargs)
常用的get、post、put、optins、delete方法都在该文件中实现,这些方法都是使用内部封装的一个模块:request,而request是对session.request内部模块的封装,提供一个上下文管理。
继续看最为核心的session.request模块源码:
def request(self, method, url, ······· # Create the Request. req = Request( method=method.upper(), url=url, headers=headers, files=files, data=data or {}, json=json, params=params or {}, auth=auth, cookies=cookies, hooks=hooks, ) prep = self.prepare_request(req) proxies = proxies or {} settings = self.merge_environment_settings( prep.url, proxies, stream, verify, cert ) # Send the request. send_kwargs = { 'timeout': timeout, 'allow_redirects': allow_redirects, } send_kwargs.update(settings) resp = self.send(prep, **send_kwargs) return resp
在这里提交过来的请求信息将组装成Request请求对象,并对其中的配置参数进行合并,然后将Request请求和配置参数发送给self.send,来请求下载,继续看self.send
def send(self, request, **kwargs): """Send a given PreparedRequest. :rtype: requests.Response """ # Set defaults that the hooks can utilize to ensure they always have # the correct parameters to reproduce the previous request. kwargs.setdefault('stream', self.stream) kwargs.setdefault('verify', self.verify) kwargs.setdefault('cert', self.cert) kwargs.setdefault('proxies', self.proxies) # It's possible that users might accidentally send a Request object. # Guard against that specific failure case. if isinstance(request, Request): raise ValueError('You can only send PreparedRequests.') # Set up variables needed for resolve_redirects and dispatching of hooks allow_redirects = kwargs.pop('allow_redirects', True) stream = kwargs.get('stream') hooks = request.hooks # Get the appropriate adapter to use adapter = self.get_adapter(url=request.url) # Start time (approximately) of the request start = preferred_clock() # Send the request r = adapter.send(request, **kwargs) # Total elapsed time of the request (approximately) elapsed = preferred_clock() - start r.elapsed = timedelta(seconds=elapsed) # Response manipulation hooks r = dispatch_hook('response', hooks, r, **kwargs) # Persist cookies if r.history: # If the hooks create history then we want those cookies too for resp in r.history: extract_cookies_to_jar(self.cookies, resp.request, resp.raw) extract_cookies_to_jar(self.cookies, request, r.raw) # Redirect resolving generator. gen = self.resolve_redirects(r, request, **kwargs) # Resolve redirects if allowed. history = [resp for resp in gen] if allow_redirects else [] # Shuffle things around if there's history. if history: # Insert the first (original) request at the start history.insert(0, r) # Get the last request made r = history.pop() r.history = history # If redirects aren't being followed, store the response on the Request for Response.next(). if not allow_redirects: try: r._next = next(self.resolve_redirects(r, request, yield_requests=True, **kwargs)) except StopIteration: pass if not stream: r.content return r
当然在self.send中核心的是下面几行行代码:
# Start time (approximately) of the request start = preferred_clock() # Send the request r = adapter.send(request, **kwargs) # Total elapsed time of the request (approximately) elapsed = preferred_clock() - start r.elapsed = timedelta(seconds=elapsed) # Response manipulation hooks r = dispatch_hook('response', hooks, r, **kwargs)
如果还有问题未能得到解决,搜索887934385交流群,进入后下载资料工具安装包等。最后,感谢观看!
分别进行请求,并将请求响应内容构造成响应对象r,其中又引入本地模块adapter,该模块主要负责请求处理及其响应内容。
requests库实现很巧妙,对cookie保持、代理问题、SSL验证问题都做了处理,功能很全,其中细节不仔细去研读很难理解,这里只是对其实现过程做一个浅析,如果有感兴趣的同学,可以仔细研读每个模块和功能,其中有奥妙。
requests库核心API源码分析的更多相关文章
- Spring Security(四) —— 核心过滤器源码分析
摘要: 原创出处 https://www.cnkirito.moe/spring-security-4/ 「老徐」欢迎转载,保留摘要,谢谢! 4 过滤器详解 前面的部分,我们关注了Spring Sec ...
- MapReduce新版客户端API源码分析
使用MapReduce新版客户端API提交MapReduce Job需要使用 org.apache.hadoop.mapreduce.Job 类.JavaDoc给出以下使用范例. // Create ...
- Android核心类源码分析
Handler流程1.首先Looper.prepare()在本线程中保存一个Looper实例,然后该实例中保存一个MessageQueue对象:因为Looper.prepare()在一个线程中只能调用 ...
- ABP源码分析二十六:核心框架中的一些其他功能
本文是ABP核心项目源码分析的最后一篇,介绍一些前面遗漏的功能 AbpSession AbpSession: 目前这个和CLR的Session没有什么直接的联系.当然可以自定义的去实现IAbpSess ...
- Java源码分析之LinkedList
LinkedList与ArrayList正好相对,同样是List的实现类,都有增删改查等方法,但是实现方法跟后者有很大的区别. 先归纳一下LinkedList包含的API 1.构造函数: ①Linke ...
- Android base-adapter-helper 源码分析与扩展
转载请标明出处:http://blog.csdn.net/lmj623565791/article/details/44014941,本文出自:[张鸿洋的博客] 本篇博客是我加入Android 开源项 ...
- Java Collections 源码分析
Java Collections API源码分析 侯捷老师剖析了不少Framework,如MFC,STL等.侯老师有句名言: 源码面前,了无秘密 这句话还在知乎引起广泛讨论. 我对教授程序设计的一点想 ...
- WebBench源码分析
源码分析共享地址:https://github.com/fivezh/WebBench 下载源码后编译源程序后即可执行: sudo make clean sudo make & make in ...
- redis源码分析之事务Transaction(下)
接着上一篇,这篇文章分析一下redis事务操作中multi,exec,discard三个核心命令. 原文地址:http://www.jianshu.com/p/e22615586595 看本篇文章前需 ...
随机推荐
- Vs使用EF来操作MySql(经验 )
1.安装Vs2015,至少是2012以上版本才行 2. 安装 这个是用于连接Mysql数据库的Vs插件 2.1通过这种方式添加引用 3.配置数据库 // // // 4.添加实体 注:这里最好从数据库 ...
- Spring Cloud gateway 网关四 动态路由
微服务当前这么火爆的程度,如果不能学会一种微服务框架技术.怎么能升职加薪,增加简历的筹码?spring cloud 和 Dubbo 需要单独学习.说没有时间?没有精力?要学俩个框架?而Spring C ...
- 使用Typescript重构axios(十六)——请求和响应数据配置化
0. 系列文章 1.使用Typescript重构axios(一)--写在最前面 2.使用Typescript重构axios(二)--项目起手,跑通流程 3.使用Typescript重构axios(三) ...
- 使用Typescript重构axios(二十四)——防御XSRF攻击
0. 系列文章 1.使用Typescript重构axios(一)--写在最前面 2.使用Typescript重构axios(二)--项目起手,跑通流程 3.使用Typescript重构axios(三) ...
- python学习之【第十七篇】:Python中的面向对象(类和对象)
1.什么是类和类的对象? 类是一种数据结构,我们可以用它来定义对象,后者把数据值和行为特性融合在一起,类是现实世界的抽象的实体以编程形式出现.实例是这些对象的具体化.类是用来描述一类事物,类的对象指的 ...
- python机器学习——随机梯度下降
上一篇我们实现了使用梯度下降法的自适应线性神经元,这个方法会使用所有的训练样本来对权重向量进行更新,也可以称之为批量梯度下降(batch gradient descent).假设现在我们数据集中拥有大 ...
- 003.Kubernetes二进制部署准备
一 前置准备 1.1 前置条件 相应的充足资源的Linux服务器: 设置相应的主机名,参考命令: hostnamectl set-hostname k8smaster Mac及UUID唯一: 若未关闭 ...
- Python连接SqlServer+GUI嵌入式——学生管理系统1.0
学生管理系统1.0 1.建学生数据库 2.数据库嵌入高级语言(Python) 3.界面设计 简化思路: 1.先通过SqlServer2012建立学生数据库,包括账号.密码,姓名.选课等信息 2.运用P ...
- 关于Prometheus监控的思考:多标签埋点及Mbean
使用 grafana+prometheus+jmx 作为普通的监控手段,是比较有用的.我之前的文章介绍了相应的实现办法. 但是,按照之前的实现,我们更多的只能是监控 单值型的数据,如请求量,tps 等 ...
- 【最新发布】最新Python学习路线,值得收藏
随着AI的发展,Python的薪资也在逐年增加,但是很多初学者会盲目乱学,连正确的学习路线都不清楚,踩很多坑,为此经过我多年开发经验以及对目前行业发展形式总结出一套最新python学习路线,帮助大家正 ...