URL Parsing

【URL Parsing】

urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)

Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example:

Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.

The scheme argument gives the default addressing scheme.

If the allow_fragments argument is false, fragment identifiers are not recognized. Instead, they are parsed as part of the path, parameters or query component, and fragment is set to the empty string in the return value.

urllib.parse.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.

urllib.parse.parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a list of name, value pairs.

urllib.parse.urlunparse(parts)

Construct a URL from a tuple as returned by urlparse(). The parts argument can be any six-item iterable.

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

This is similar to urlparse(), but does not split the params from the URL.

urllib.parse.urlunsplit(parts)

Combine the elements of a tuple as returned by urlsplit() into a complete URL as a string. The parts argument can be any five-item iterable.

urllib.parse.urldefrag(url)

If url contains a fragment identifier, return a modified version of url with no fragment identifier, and the fragment identifier as a separate string. If there is no fragment identifier in url, return url unmodified and an empty string.

URL Parsing的更多相关文章

w3 parse a url
最新链接:https://www.w3.org/TR/html53/ 2.6 URLs — HTML5 li, dd li { margin: 1em 0; } dt, dfn { font-wei ...
GO语言的开源库
Indexes and search engines These sites provide indexes and search engines for Go packages: godoc.org ...
Python Web Crawler
Python版本:3.5.2 pycharm URL Parsing¶ https://docs.python.org/3.5/library/urllib.parse.html?highlight= ...
Curl的编译
下载 curl的官网:https://curl.haxx.se/ libcurl就是一个库,curl就是使用libcurl实现的. curl是一个exe,也可以说是整个项目的名字,而libcurl就是 ...
Go语言(golang)开源项目大全
转http://www.open-open.com/lib/view/open1396063913278.html内容目录Astronomy构建工具缓存云计算命令行选项解析器命令行工具压缩配置文件解析 ...
Go by Example
Go by Example Go is an open source programming language designed for building simple, fast, and reli ...
Node.js API
Node.js v4.4.7 Documentation(官方文档) Buffer Prior to the introduction of TypedArray in ECMAScript 2015 ...
[转]Go语言(golang)开源项目大全
内容目录 Astronomy 构建工具缓存云计算命令行选项解析器命令行工具压缩配置文件解析器控制台用户界面加密数据处理数据结构数据库和存储开发工具分布式/网格计算文档编辑 ...
python爬虫从入门到放弃（三）之 Urllib库的基本使用
官方文档地址:https://docs.python.org/3/library/urllib.html 什么是Urllib Urllib是python内置的HTTP请求库包括以下模块urllib.r ...

随机推荐

C++ Primer : 第十三章 : 动态内存管理类
/* StrVec.h */ #ifndef _STRVEC_H_ #define _STRVEC_H_ #include <memory> #include <string> ...
Android——使用SQLiteDatabase操作SQLite数据库
除了可以使用文件或SharedPreferences存储数据,还可以选择使用SQLite数据库存储数据. 在Android平台上,集成了一个嵌入式关系型数据库-SQLite,SQLite3支持 NUL ...
openfire
wget http://download.igniterealtime.org/openfire/openfire-4.0.0-1.i386.rpm rpm -ivh openfire-4.0.0-1 ...
java的读文件操作
java读取文件内容,可以作如下理解: 首先获得一个文件句柄,File file = new File():file即为文件句柄.两人之间联通电话网络了,就可以开始打电话了. 通过这条线路读取甲方的信 ...
C#：org.in2bits.MyXls 文本格式日期转换,以及设置单元格格式，保留两位小数点
org.in2bits.MyXls Excel导入日期格式的处理表格内容为 2014-7-22 ,导入后显示为 41842 等于一个数值,根本不是日期,后来百度了一下,发现要做如下处理: stri ...
微信公众号开发中遇到的几个bug
一.测试自定义菜单接口时中文菜单名显示为null 设置的中文菜单名,中文未经过编码和解码过程,设置的中文菜单名在最后的微信服务器返回的json格式数据中显示为null. 解决办法:将中文先用uneco ...
python---pymysql
pymysql是Python中操作MySQL的模块,其使用方法和MySQLdb几乎相同.2.7用MySQLdb,3.0用pymysql. #下载安装 pip3 install pymysql 使用执 ...
ADF_Database Develop系列3_设计数据库表之Reconcile Database/Reverse Objects
2013-05-01 Created By BaoXinjian
HTC Vive开发笔记之UI Guideline
本文转自HTC官方论坛,原址https://www.htcvive.com/cn/forum/chat.php?mod=viewthread&tid=1641&extra=page=1 ...
Java 开发必会的 Linux 命令
作为一个Java开发人员,有些常用的Linux命令必须掌握.即时平时开发过程中不使用Linux(Unix)或者mac系统,也需要熟练掌握Linux命令.因为很多服务器上都是Linux系统.所以,要和服 ...

URL Parsing

URL Parsing的更多相关文章

随机推荐

热门专题