URL Parsing

urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)

Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example:


Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.


The scheme argument gives the default addressing scheme.

If the allow_fragments argument is false, fragment identifiers are not recognized. Instead, they are parsed as part of the path, parameters or query component, and fragment is set to the empty string in the return value.

urllib.parse.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.

urllib.parse.parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a list of name, value pairs.

urllib.parse.urlunparse(parts)

Construct a URL from a tuple as returned by urlparse(). The parts argument can be any six-item iterable.

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

This is similar to urlparse(), but does not split the params from the URL.

urllib.parse.urlunsplit(parts)

Combine the elements of a tuple as returned by urlsplit() into a complete URL as a string. The parts argument can be any five-item iterable.

urllib.parse.urldefrag(url)

If url contains a fragment identifier, return a modified version of url with no fragment identifier, and the fragment identifier as a separate string. If there is no fragment identifier in url, return url unmodified and an empty string.

URL Parsing的更多相关文章

  1. w3 parse a url

     最新链接:https://www.w3.org/TR/html53/ 2.6 URLs — HTML5 li, dd li { margin: 1em 0; } dt, dfn { font-wei ...

  2. GO语言的开源库

    Indexes and search engines These sites provide indexes and search engines for Go packages: godoc.org ...

  3. Python Web Crawler

    Python版本:3.5.2 pycharm URL Parsing¶ https://docs.python.org/3.5/library/urllib.parse.html?highlight= ...

  4. Curl的编译

    下载 curl的官网:https://curl.haxx.se/ libcurl就是一个库,curl就是使用libcurl实现的. curl是一个exe,也可以说是整个项目的名字,而libcurl就是 ...

  5. Go语言(golang)开源项目大全

    转http://www.open-open.com/lib/view/open1396063913278.html内容目录Astronomy构建工具缓存云计算命令行选项解析器命令行工具压缩配置文件解析 ...

  6. Go by Example

    Go by Example Go is an open source programming language designed for building simple, fast, and reli ...

  7. Node.js API

    Node.js v4.4.7 Documentation(官方文档) Buffer Prior to the introduction of TypedArray in ECMAScript 2015 ...

  8. [转]Go语言(golang)开源项目大全

    内容目录 Astronomy 构建工具 缓存 云计算 命令行选项解析器 命令行工具 压缩 配置文件解析器 控制台用户界面 加密 数据处理 数据结构 数据库和存储 开发工具 分布式/网格计算 文档 编辑 ...

  9. python爬虫从入门到放弃(三)之 Urllib库的基本使用

    官方文档地址:https://docs.python.org/3/library/urllib.html 什么是Urllib Urllib是python内置的HTTP请求库包括以下模块urllib.r ...

随机推荐

  1. ODOO从哪里开始??OpenERP的第一根线头儿

    Windows下ODOO源码启动: python odoo-bin -w odoo -r odoo --addons-path=addons,../mymodules --db-filter=mydb ...

  2. pods的安装和使用

    ////  pods的安装.h//  IOS笔记 /*Cocoapods安装步骤 1.升级Ruby环境 终端输入:$gem update --system 此时会出现 ERROR:  While ex ...

  3. 论文笔记之:Generative Adversarial Text to Image Synthesis

    Generative Adversarial Text to Image Synthesis ICML 2016  摘要:本文将文本和图像练习起来,根据文本生成图像,结合 CNN 和 GAN 来有效的 ...

  4. Durid(一): 原理架构

    Durid是在2013年底开源出来的,当前最新版本0.9.2, 主要解决的是对实时数据以及较近时间的历史数据的多维查询提供高并发(多用户),低延时,高可靠性的问题.对比Druid与其他解决方案,Kyl ...

  5. [Spring MVC] - Annotation验证

    使用Spring MVC的Annotation验证可以直接对view model的简单数据验证,注意,这里是简单的,如果model的数据验证需要有一些比较复杂的业务逻辑性在里头,只是使用annotat ...

  6. STM32F4 SPI2初始化及收发数据【使用库函数】

    我的STM32F4 Discovery上边有一个加速度传感器LIS302DL.在演示工程中,ST的工程师使用这个传感器做了个很令人羡慕的东西:解算开发板的姿态.当开发板倾斜时候,处于最上边的LED点亮 ...

  7. Hibernate getCurrentSession()和openSession()的区别

    通过getCurrentSession()创建的Session会绑定到当前线程上:openSession()不会. 通过getCurrentSession()获取Session,首先是从当前上下文中寻 ...

  8. Win32API界面库 - Project wheels 工程基础部分完成

    离上次发博文过去了好久,先是要忙一个机器人的项目,然后就是部门的事情和考试周复习,然后就到了考试周,趁着复习的间隙,拾起了寒假时候抄的界面库,修掉了从前的bug. bug1 控件显示问题 当初抄这个库 ...

  9. java io流(字符流) 文件打开、读取文件、关闭文件

    java io流(字符流) 文件打开 读取文件 关闭文件 //打开文件 //读取文件内容 //关闭文件 import java.io.*; public class Index{ public sta ...

  10. iOS NSNotification的使用

    如果在一个类中想要执行另一个类中的方法可以使用通知 1.创建一个通知对象:使用notificationWithName:object: 或者 notificationWithName:object:u ...