Python: the re module (regular expressions)

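Using the re module:

1. Use the compile() function to compile a Pattern object, for example: pattern = re.compile(r'\d+')

2. Use the attributes and methods of the Pattern object to match and search text; a successful match yields a Match object

match method: starts at the given position (the beginning of the string by default), matches once, and returns None on failure ----------> match(string[, pos[, endpos]])

m = pattern.match('one12twothree34four', 3, 10)  # start matching at index 3, i.e. at the character '1'; returns a Match object, or None if nothing matches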

# -*- coding:utf-8 -*-

import re

pattern = re.compile(r'([a-z]+) ([a-z]+)', re.I)  # re.I makes the match case-insensitive

m = pattern.match("hello world wide web python")

print(m)  # <re.Match object; span=(0, 11), match='hello world'> (Python 3.6 and earlier show _sre.SRE_Match)

print(m.group(), type(m.group()))  # hello world <class 'str'>

print(m.group(1)) # hello

print(m.group(2)) # world

print(m.span(), type(m.span()))  # (0, 11) <class 'tuple'>

print(m.groups(), type(m.groups()))  # ('hello', 'world') <class 'tuple'>
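The effect of pos and endpos on match can be sketched as follows. This reuses the string from the one-line example earlier; the variable names are only illustrative.

```python
import re

pattern = re.compile(r'\d+')
text = 'one12twothree34four'

# match only tries the start position, so index 0 ('o') fails
no_match = pattern.match(text)

# Starting at index 3 (the character '1'), the digits '12' match
m = pattern.match(text, 3, 10)

print(no_match)              # None
print(m.group(), m.span())   # 12 (3, 5)
```

Note that span() is reported relative to the whole string, not relative to pos.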

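search method: scans the string for the first position where the pattern matches, matches once, and returns None on failure ----------> search(string[, pos[, endpos]]); used the same way as match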
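The original gives no code for search, so here is a minimal sketch under the same assumptions as the match example (the sample string and variable names are illustrative).

```python
import re

pattern = re.compile(r'\d+')
text = 'one12twothree34four'

# search scans forward until it finds the first match
m = pattern.search(text)
print(m.group(), m.span())     # 12 (3, 5)

# pos restricts where the scan begins, just as with match
m2 = pattern.search(text, 5)
print(m2.group(), m2.span())   # 34 (13, 15)

# No digits anywhere: the result is None
m3 = pattern.search('no digits here')
print(m3)                      # None
```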
findall method: finds all non-overlapping matches and returns them as a list, or an empty list if nothing matches ----------> findall(string[, pos[, endpos]])

# -*- coding:utf-8 -*-

import re

# Compile the regular expression into a Pattern object

pattern = re.compile(r'\d+')  # find runs of digits

rel1 = pattern.findall('hello 123 world 456 ')

print(rel1)   # ['123', '456']

rel2 = pattern.findall('one12two23s34f45f56s78e89t10', 10, 20)  # restrict matching to indices 10 up to (not including) 20

print(rel2)  # ['34', '45', '56']

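# The re module provides a function called compile that takes a matching rule

# and returns a Pattern instance, which we then use to match strings

pattern2 = re.compile(r'\d+\.\d*')

# pattern2.findall() returns every substring that matches the rule

result = pattern2.findall("123.141593, 'bigcat', 232312, 3.15")

# findall returns all matching substrings as a list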

print(result)  # ['123.141593', '3.15']
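One detail worth knowing about findall is that its return shape depends on the capturing groups in the pattern. The sketch below uses a made-up sample string to show the three cases.

```python
import re

text = 'width=100, height=250'

# No groups: each item is the whole match
whole = re.findall(r'\w+=\d+', text)
print(whole)    # ['width=100', 'height=250']

# One group: each item is that group's text only
values = re.findall(r'\w+=(\d+)', text)
print(values)   # ['100', '250']

# Several groups: each item is a tuple of the groups
pairs = re.findall(r'(\w+)=(\d+)', text)
print(pairs)    # [('width', '100'), ('height', '250')]
```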

finditer method: finds all matches and returns an iterator that yields Match objects ----------> finditer(string[, pos[, endpos]])

# -*- coding:utf-8 -*-

import re

'''finditer is similar to findall'''

pattern = re.compile(r'\d+')

resl = pattern.finditer('hello-123-world-456-python-789')

print(resl)  # <callable_iterator object at 0x0000022A886FD470>

print(type(resl))  # <class 'callable_iterator'>    # an iterator object

for m in resl:  # m is a Match object; see the match example above for what it offers

    print(m.group())  # prints 123, 456 and 789 in turn

split method: splits a string and returns a list ----------> split(string[, maxsplit])

# -*- coding:utf-8 -*-

import re

'''split divides the string wherever the pattern matches and returns a list'''

p = re.compile(r'[\s\,;\t\n]+')

print(p.split('  a  ,    bwf  ;; c '))   # ['', 'a', 'bwf', 'c', '']
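The maxsplit argument from the signature above is not exercised in the example, so here is a small sketch of it. It also shows that a capturing group in the pattern keeps the delimiters in the result. The input strings are invented for illustration.

```python
import re

p = re.compile(r'[\s,;]+')

# maxsplit=2: split at most twice; the remainder stays in one piece
parts = p.split('a, b;c  d', maxsplit=2)
print(parts)       # ['a', 'b', 'c  d']

# A capturing group makes split return the delimiters as well
with_delims = re.split(r'([,;])', 'a,b;c')
print(with_delims)  # ['a', ',', 'b', ';', 'c']
```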

sub method: substitution ----------> sub(repl, string[, count])

# -*- coding:utf-8 -*-

import re

p = re.compile(r'(\w+) (\w+)')

s = 'hello 1236 hello 456'

print(p.sub('hello world', s))  # hello world hello world
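Beyond a fixed replacement string, repl may reference captured groups or be a function, and count limits the number of replacements. A hedged sketch reusing the pattern and string from above (the helper name shout is hypothetical):

```python
import re

p = re.compile(r'(\w+) (\w+)')
s = 'hello 1236 hello 456'

# Backreferences \1 and \2 refer to the captured groups: swap each pair
swapped = p.sub(r'\2 \1', s)
print(swapped)   # 1236 hello 456 hello

# repl can be a function that receives the Match object
def shout(m):
    return m.group(1).upper() + ' ' + m.group(2)

shouted = p.sub(shout, s)
print(shouted)   # HELLO 1236 HELLO 456

# count=1 replaces only the first match
once = p.sub('hello world', s, count=1)
print(once)      # hello world hello 456
```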

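3. Use the attributes and methods of the Match object to get information

match.group() # the entire match; equivalent to match.group(0)

match.groups() # a tuple of all captured groups

match.start() # start position

match.end() # end position

match.span() # a (start, end) tuple for the matched region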
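The Match methods listed above can be tried together on one example. This is only a sketch; the sample string is invented.

```python
import re

m = re.search(r'(\d+)-(\d+)', 'tel: 010-12345')

print(m.group())    # 010-12345  (same as m.group(0))
print(m.groups())   # ('010', '12345')
print(m.start())    # 5
print(m.end())      # 14
print(m.span())     # (5, 14)

# start, end and span also accept a group number
print(m.span(2))    # (9, 14)
```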

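4. Matching Chinese characters

Chinese characters mostly fall in the Unicode range [\u4e00-\u9fa5]. This range excludes full-width Chinese punctuation, but it is enough in most cases

# -*- coding:utf-8 -*-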

import re

title = '你好，python ， 你好，世界 hello world'

pa = re.compile(r'[\u4e00-\u9fa5]+')

t = pa.findall(title)

print(t)   # ['你好', '你好', '世界']

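5. Greedy vs. non-greedy matching: Python quantifiers are greedy by default

  Greedy: provided the overall match succeeds, match as much as possible (e.g. *)

  Non-greedy: provided the overall match succeeds, match as little as possible (add ? after the quantifier, e.g. *?)

# -*- coding:utf-8 -*-

import re

s = 'abbbbbbdsddbbbb'

res = re.findall('ab*', s)  # * matches the preceding character zero or more times

print(res)  # ['abbbbbb']  'a' alone already satisfies the pattern, but greedy matching keeps consuming every following 'b'

res2 = re.findall('ab*?', s)

print(res2)  # ['a']  after matching 'a', non-greedy matching takes as few 'b's as possible (zero), so the match ends there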
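The difference matters most when a pattern has a closing delimiter. The following sketch uses an invented HTML-like string to compare the two modes.

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: .* runs to the last '>' in the string
greedy = re.findall(r'<.*>', html)
print(greedy)   # ['<b>bold</b> and <i>italic</i>']

# Non-greedy: .*? stops at the first '>' it can
lazy = re.findall(r'<.*?>', html)
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
```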

Keep going, one step at a time; stick with it and keep cheering yourself on. Work on!
