requests库核心API源码分析
requests库是python爬虫使用频率最高的库,在网络请求中发挥着重要的作用,这边文章浅析requests的API源码。
该库文件结构如图:

提供的核心接口在__init__文件中,如下:
from . import utils
from . import packages
from .models import Request, Response, PreparedRequest
from .api import request, get, head, post, patch, put, delete, options
from .sessions import session, Session
from .status_codes import codes
from .exceptions import ( RequestException, Timeout, URLRequired, TooManyRedirects, HTTPError, ConnectionError, FileModeWarning, ConnectTimeout, ReadTimeout )
requests常用方法在api.py文件中,源码如下:
# -*- coding: utf-8 -*-
"""
requests.api
~~~~~~~~~~~~
This module implements the Requests API.
:copyright: (c) 2012 by Kenneth Reitz.
:license: Apache2, see LICENSE for more details.
"""
from . import sessions
def request(method, url, **kwargs):
"""Constructs and sends a :class:`Request <Request>`.
:param method: method for the new :class:`Request` object.
:param url: URL for the new :class:`Request` object.
:param params: (optional) Dictionary, list of tuples or bytes to send
in the body of the :class:`Request`.
:param data: (optional) Dictionary, list of tuples, bytes, or file-like
object to send in the body of the :class:`Request`.
:param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
:param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
:param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
:param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
to add for the file.
:param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
:param timeout: (optional) How many seconds to wait for the server to send data
before giving up, as a float, or a :ref:`(connect timeout, read
timeout) <timeouts>` tuple.
:type timeout: float or tuple
:param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
:type allow_redirects: bool
:param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
:param verify: (optional) Either a boolean, in which case it controls whether we verify
the server's TLS certificate, or a string, in which case it must be a path
to a CA bundle to use. Defaults to ``True``.
:param stream: (optional) if ``False``, the response content will be immediately downloaded.
:param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
:return: :class:`Response <Response>` object
:rtype: requests.Response
Usage::
>>> import requests
>>> req = requests.request('GET', 'https://httpbin.org/get')
<Response [200]>
"""
# By using the 'with' statement we are sure the session is closed, thus we
# avoid leaving sockets open which can trigger a ResourceWarning in some
# cases, and look like a memory leak in others.
with sessions.Session() as session:
return session.request(method=method, url=url, **kwargs)
def get(url, params=None, **kwargs):
r"""Sends a GET request.
:param url: URL for the new :class:`Request` object.
:param params: (optional) Dictionary, list of tuples or bytes to send
in the body of the :class:`Request`.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
kwargs.setdefault('allow_redirects', True)
return request('get', url, params=params, **kwargs)
def options(url, **kwargs):
r"""Sends an OPTIONS request.
:param url: URL for the new :class:`Request` object.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
kwargs.setdefault('allow_redirects', True)
return request('options', url, **kwargs)
def head(url, **kwargs):
r"""Sends a HEAD request.
:param url: URL for the new :class:`Request` object.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
kwargs.setdefault('allow_redirects', False)
return request('head', url, **kwargs)
def post(url, data=None, json=None, **kwargs):
r"""Sends a POST request.
:param url: URL for the new :class:`Request` object.
:param data: (optional) Dictionary, list of tuples, bytes, or file-like
object to send in the body of the :class:`Request`.
:param json: (optional) json data to send in the body of the :class:`Request`.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
return request('post', url, data=data, json=json, **kwargs)
def put(url, data=None, **kwargs):
r"""Sends a PUT request.
:param url: URL for the new :class:`Request` object.
:param data: (optional) Dictionary, list of tuples, bytes, or file-like
object to send in the body of the :class:`Request`.
:param json: (optional) json data to send in the body of the :class:`Request`.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
return request('put', url, data=data, **kwargs)
def patch(url, data=None, **kwargs):
r"""Sends a PATCH request.
:param url: URL for the new :class:`Request` object.
:param data: (optional) Dictionary, list of tuples, bytes, or file-like
object to send in the body of the :class:`Request`.
:param json: (optional) json data to send in the body of the :class:`Request`.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
return request('patch', url, data=data, **kwargs)
def delete(url, **kwargs):
r"""Sends a DELETE request.
:param url: URL for the new :class:`Request` object.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
return request('delete', url, **kwargs)
常用的get、post、put、optins、delete方法都在该文件中实现,这些方法都是使用内部封装的一个模块:request,而request是对session.request内部模块的封装,提供一个上下文管理。
继续看最为核心的session.request模块源码:
def request(self, method, url,
·······
# Create the Request.
req = Request(
method=method.upper(),
url=url,
headers=headers,
files=files,
data=data or {},
json=json,
params=params or {},
auth=auth,
cookies=cookies,
hooks=hooks,
)
prep = self.prepare_request(req)
proxies = proxies or {}
settings = self.merge_environment_settings(
prep.url, proxies, stream, verify, cert
)
# Send the request.
send_kwargs = {
'timeout': timeout,
'allow_redirects': allow_redirects,
}
send_kwargs.update(settings)
resp = self.send(prep, **send_kwargs)
return resp
在这里提交过来的请求信息将组装成Request请求对象,并对其中的配置参数进行合并,然后将Request请求和配置参数发送给self.send,来请求下载,继续看self.send
def send(self, request, **kwargs):
"""Send a given PreparedRequest.
:rtype: requests.Response
"""
# Set defaults that the hooks can utilize to ensure they always have
# the correct parameters to reproduce the previous request.
kwargs.setdefault('stream', self.stream)
kwargs.setdefault('verify', self.verify)
kwargs.setdefault('cert', self.cert)
kwargs.setdefault('proxies', self.proxies)
# It's possible that users might accidentally send a Request object.
# Guard against that specific failure case.
if isinstance(request, Request):
raise ValueError('You can only send PreparedRequests.')
# Set up variables needed for resolve_redirects and dispatching of hooks
allow_redirects = kwargs.pop('allow_redirects', True)
stream = kwargs.get('stream')
hooks = request.hooks
# Get the appropriate adapter to use
adapter = self.get_adapter(url=request.url)
# Start time (approximately) of the request
start = preferred_clock()
# Send the request
r = adapter.send(request, **kwargs)
# Total elapsed time of the request (approximately)
elapsed = preferred_clock() - start
r.elapsed = timedelta(seconds=elapsed)
# Response manipulation hooks
r = dispatch_hook('response', hooks, r, **kwargs)
# Persist cookies
if r.history:
# If the hooks create history then we want those cookies too
for resp in r.history:
extract_cookies_to_jar(self.cookies, resp.request, resp.raw)
extract_cookies_to_jar(self.cookies, request, r.raw)
# Redirect resolving generator.
gen = self.resolve_redirects(r, request, **kwargs)
# Resolve redirects if allowed.
history = [resp for resp in gen] if allow_redirects else []
# Shuffle things around if there's history.
if history:
# Insert the first (original) request at the start
history.insert(0, r)
# Get the last request made
r = history.pop()
r.history = history
# If redirects aren't being followed, store the response on the Request for Response.next().
if not allow_redirects:
try:
r._next = next(self.resolve_redirects(r, request, yield_requests=True, **kwargs))
except StopIteration:
pass
if not stream:
r.content
return r
当然在self.send中核心的是下面几行行代码:
# Start time (approximately) of the request
start = preferred_clock()
# Send the request
r = adapter.send(request, **kwargs)
# Total elapsed time of the request (approximately)
elapsed = preferred_clock() - start
r.elapsed = timedelta(seconds=elapsed)
# Response manipulation hooks
r = dispatch_hook('response', hooks, r, **kwargs)
如果还有问题未能得到解决,搜索887934385交流群,进入后下载资料工具安装包等。最后,感谢观看!
分别进行请求,并将请求响应内容构造成响应对象r,其中又引入本地模块adapter,该模块主要负责请求处理及其响应内容。
requests库实现很巧妙,对cookie保持、代理问题、SSL验证问题都做了处理,功能很全,其中细节不仔细去研读很难理解,这里只是对其实现过程做一个浅析,如果有感兴趣的同学,可以仔细研读每个模块和功能,其中有奥妙。
requests库核心API源码分析的更多相关文章
- Spring Security(四) —— 核心过滤器源码分析
摘要: 原创出处 https://www.cnkirito.moe/spring-security-4/ 「老徐」欢迎转载,保留摘要,谢谢! 4 过滤器详解 前面的部分,我们关注了Spring Sec ...
- MapReduce新版客户端API源码分析
使用MapReduce新版客户端API提交MapReduce Job需要使用 org.apache.hadoop.mapreduce.Job 类.JavaDoc给出以下使用范例. // Create ...
- Android核心类源码分析
Handler流程1.首先Looper.prepare()在本线程中保存一个Looper实例,然后该实例中保存一个MessageQueue对象:因为Looper.prepare()在一个线程中只能调用 ...
- ABP源码分析二十六:核心框架中的一些其他功能
本文是ABP核心项目源码分析的最后一篇,介绍一些前面遗漏的功能 AbpSession AbpSession: 目前这个和CLR的Session没有什么直接的联系.当然可以自定义的去实现IAbpSess ...
- Java源码分析之LinkedList
LinkedList与ArrayList正好相对,同样是List的实现类,都有增删改查等方法,但是实现方法跟后者有很大的区别. 先归纳一下LinkedList包含的API 1.构造函数: ①Linke ...
- Android base-adapter-helper 源码分析与扩展
转载请标明出处:http://blog.csdn.net/lmj623565791/article/details/44014941,本文出自:[张鸿洋的博客] 本篇博客是我加入Android 开源项 ...
- Java Collections 源码分析
Java Collections API源码分析 侯捷老师剖析了不少Framework,如MFC,STL等.侯老师有句名言: 源码面前,了无秘密 这句话还在知乎引起广泛讨论. 我对教授程序设计的一点想 ...
- WebBench源码分析
源码分析共享地址:https://github.com/fivezh/WebBench 下载源码后编译源程序后即可执行: sudo make clean sudo make & make in ...
- redis源码分析之事务Transaction(下)
接着上一篇,这篇文章分析一下redis事务操作中multi,exec,discard三个核心命令. 原文地址:http://www.jianshu.com/p/e22615586595 看本篇文章前需 ...
随机推荐
- 使用jieba分析小说人物出现次数
分析: 1. 读取小说,以读的形式打开 with open('文件名.txt','r',encoding='utf8') as f: str = f.read() 2. 切割小说 ret = jieb ...
- Linux下基本操作
强行转Linux,开始以为会很不适应,其实还好,换汤不换药 本文只讲基本操作,足够让你愉快的打代码,想飞上天的自行百度,或找其他大神(友链) Update 6/20:由于写得太烂被学长爆踩了一顿 直接 ...
- SSM配置后可以访问静态html文件但无法访问其他后台接口的解决方案
web.xml中的一段 <servlet> <servlet-name>SpringMVC</servlet-name> <servlet-class> ...
- 感谢ZhangYu dalao回关
- Java基础系列4:抽象类与接口的前世今生
该系列博文会告诉你如何从入门到进阶,一步步地学习Java基础知识,并上手进行实战,接着了解每个Java知识点背后的实现原理,更完整地了解整个Java技术体系,形成自己的知识框架. 1.抽象类: 当编写 ...
- 百度艾尼(ERNIE)常见问题汇总及解答
一.ERNIE安装配置类问题 Q1:最适合ERNIE2.0的PaddlePaddle版本是?A1:PaddlePaddle版本建议升级到1.5.0及以上版本. Q2:ERNIE可以在哪些系统上使用?A ...
- .netcore利用DI实现订阅者模式 - xms
结合DI,实现发布者与订阅者的解耦,属于本次事务的对象主体不应定义为订阅者,因为订阅者不应与发布者产生任何关联 一.发布者订阅者模式 发布者发出一个事件主题,一个或多个订阅者接收这个事件,中间通过事件 ...
- jquery.eraser制作擦涂效果
jquery.eraser制作擦涂效果 <pre><!DOCTYPE html><html> <head> <meta http-equiv=&q ...
- Java ->在mybatis和PostgreSQL Json字段作为查询条件的解决方案
Date:2019-11-15 读前思考: 你没想到解决办法? PostgreSQL 数据库本身就支持还是另有解决办法? 说明:首先这次数据库使用到Json数据类型的原因,这次因为我们在做了一个app ...
- matlab中的eval函数使用
matlab中的eval函数使用 在matlab的命令行窗口中输入help eval命令回车就可以看到eval函数的官方解释,大概的意思就是执行matlab中的表达式,计算expression表示的代 ...