Python and Regular Expressions

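For general regular expression usage, see the article "Regular Expressions". Every programming language adds a few matching constructs of its own, and Python is no exception: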

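Syntax	Meaning	Example pattern	Strings matched in full
\A	Matches only at the start of the string	\Aabc	abc
\Z	Matches only at the end of the string	abc\Z	abc
(?P<name>...)	A group that gets an extra alias in addition to its usual number	(?P<id>abc){2}	abcabc
(?P=name)	Refers to the text matched by the group whose alias is name	(?P<id>\d)abc(?P=id)	1abc1, 5abc5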
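
The constructs in the table can be tried out directly. The following is a minimal sketch, not an exhaustive demonstration: the group name id follows the table, while the sample strings 'xabc' and '1abc5' are made up here to show a failing case. print is called with a single parenthesized argument so that the sketch should run under both Python 2.7 and Python 3.

```python
import re

# \A and \Z anchor the pattern to the very start and end of the string.
whole = re.search(r'\Aabc\Z', 'abc')
partial = re.search(r'\Aabc\Z', 'xabc')   # 'abc' is not at the start

# (?P<id>...) gives the group an alias in addition to its number.
named = re.match(r'(?P<id>abc){2}', 'abcabc')

# (?P=id) must repeat exactly what the group named id matched.
same = re.match(r'(?P<id>\d)abc(?P=id)', '1abc1')
different = re.match(r'(?P<id>\d)abc(?P=id)', '1abc5')

print(whole.group())        # abc
print(partial)              # None
print(named.group('id'))    # abc
print(same.group())         # 1abc1
print(different)            # None
```

A named group is still reachable by number, so named.group(1) returns the same text as named.group('id').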

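To match a literal \ in ordinary Python source, the pattern must be written as \\\\: Python itself needs \\ to represent one \, and so does the regular expression engine. Alternatively, use Python's raw strings: a regular expression that matches one \ can be written r'\\', and likewise '\\d', which matches a digit, can be written r'\d'.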
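
A short sketch of the two spellings side by side may help. The sample string in text is an assumption chosen only because it contains one backslash and a couple of digits.

```python
import re

text = 'C:\\temp 42'          # one real backslash after 'C:'

# Without a raw string, each backslash is written twice for Python
# and twice again for the regex engine: four characters in the source.
escaped = re.search('\\\\', text)

# With a raw string, only the regex-level escaping remains.
raw = re.search(r'\\', text)

# '\\d' and r'\d' are the same two-character pattern.
same_pattern = ('\\d' == r'\d')
digits = re.findall(r'\d', text)

print(escaped.group() == raw.group())   # True
print(same_pattern)                     # True
print(digits)                           # ['4', '2']
```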

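Python supports regular expressions through the re module. The first step in using re is to compile the string form of the regular expression into a Pattern. The Pattern instance is then used to process text and obtain a match result, and finally the Match instance is used to read information and carry out further operations.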
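
The three steps can be sketched as follows. The pattern and the sample sentence are illustrative assumptions, not taken from the original article.

```python
import re

# Step 1: compile the pattern string into a Pattern object.
pattern = re.compile(r'(\d+)-(\d+)')

# Step 2: use the Pattern to process text; the result is a Match or None.
match = pattern.search('range 10-25 selected')

# Step 3: read information from the Match.
if match:
    low, high = match.groups()
    print(low)              # 10
    print(high)             # 25
    print(match.span())     # (6, 11)
```

Checking the result before using it matters because a failed search returns None, and calling groups() on None raises an AttributeError.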

The main functions are:

re.compile(pattern[, flags])
re.match(pattern, string[, flags])
re.search(pattern, string[, flags])
re.split(pattern, string[, maxsplit, flags])
re.findall(pattern, string[, flags])
re.finditer(pattern, string[, flags])
re.sub(pattern, repl, string[, count, flags])
re.subn(pattern, repl, string[, count, flags])

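pattern = re.compile(r'\d+')

The flags argument selects the matching mode. Several values can be combined with | so that they take effect together. The values are:
- re.I: ignore case
- re.M: multi-line mode; changes the behavior of ^ and $ so that they also match at the start and end of each line
- re.S: dot-all mode; changes the behavior of . so that it also matches a newline
- re.L: makes the predefined character classes \w \W \b \B \s \S depend on the current locale
- re.U: makes the predefined character classes \w \W \b \B \s \S \d \D depend on the character properties defined by Unicode
- re.X: verbose mode; the regular expression may span several lines, whitespace is ignored, and comments may be added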
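
The effect of the most common flags can be sketched as below. The sample text and the patterns are assumptions made for illustration; re.L and re.U are left out because their behavior depends on the locale and on the Python version.

```python
import re

text = 'Hello\nworld'

# re.I: ignore case.
ci = re.findall(r'hello', text, re.I)

# re.M: ^ also matches right after each newline.
line_starts = re.findall(r'^\w+', text, re.M)

# re.S: . also matches the newline character.
without_s = re.match(r'Hello.world', text)
with_s = re.match(r'Hello.world', text, re.S)

# Flags combine with |.
combined = re.findall(r'^[a-z]+', text, re.I | re.M)

# re.X: whitespace in the pattern is ignored and # starts a comment.
number = re.compile(r'''
    \d+        # integer part
    (\.\d+)?   # optional fractional part
''', re.X)
price = number.search('total: 3.14 USD').group()

print(ci)                       # ['Hello']
print(line_starts)              # ['Hello', 'world']
print(without_s)                # None
print(with_s.group() == text)   # True
print(combined)                 # ['Hello', 'world']
print(price)                    # 3.14
```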

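1. re.match(pattern, string[, flags])

This function tries to match pattern starting at the beginning of string (the string to be searched) and keeps matching forward. If the whole pattern is satisfied, it returns a Match object, even when the pattern covers only a prefix of string. If it meets a character that cannot be matched, or reaches the end of string before the pattern is complete, it returns None.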

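# coding: utf-8
import re

pattern = re.compile(r'\d+')

# '192abc' starts with digits, so the match succeeds and prints 192
result1 = re.match(pattern, '192abc')
if result1:
    print(result1.group())
else:
    print('no match')

# 'abc123' does not start with digits, so match returns None
result2 = re.match(pattern, 'abc123')
if result2:
    print(result2.group())
else:
    print('no match')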
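
Because match only requires the pattern to succeed at the start, trailing text is allowed. The sketch below illustrates that and shows one possible way, assumed here rather than taken from the article, to demand that the whole string matches by appending \Z to the pattern.

```python
import re

digits = re.compile(r'\d+')

# A match succeeds as soon as the pattern is satisfied at position 0,
# even though 'abc' follows the digits.
prefix = digits.match('192abc')

# Appending \Z forces the pattern to reach the end of the string.
strict = re.compile(r'\d+\Z')
whole_ok = strict.match('192')
whole_fail = strict.match('192abc')

print(prefix.group())     # 192
print(whole_ok.group())   # 192
print(whole_fail)         # None
```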

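2. re.search(pattern, string[, flags])

search is very similar to match. The difference is that match only tries to match at the start of string, whereas search scans the whole string looking for a match. match() returns a result only when the match succeeds at the start of string; otherwise it returns None. The object returned by search has the same methods and attributes as the one returned by match.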

# coding: utf-8
import re

pattern = re.compile(r'\d+')

# The digits are in the middle of the string; search still finds them
result1 = re.search(pattern, 'abc192abc')
if result1:
    print(result1.group())
else:
    print('no match')

# Two runs of digits; search returns the first one
result2 = re.search(pattern, '123abc123')
if result2:
    print(result2.group())
else:
    print('no match')

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.1.py
192
123

Process finished with exit code 0
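
The difference between the two functions shows most clearly when both are run on the same input. This is a minimal sketch reusing the \d+ pattern from the examples; the variable names are illustrative.

```python
import re

pattern = re.compile(r'\d+')
text = 'abc192abc'

by_match = pattern.match(text)     # only looks at position 0
by_search = pattern.search(text)   # scans forward for the first match

print(by_match)               # None
print(by_search.group())      # 192
print(by_search.span())       # (3, 6)
```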

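3. re.split(pattern, string[, maxsplit, flags])

Splits string at every substring that matches pattern and returns the pieces as a list. maxsplit sets the maximum number of splits; if it is not given, string is split at every match.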

# coding: utf-8
import re

pattern = re.compile(r'\d+')

# Each run of digits acts as a separator
print(re.split(pattern, 'A1B2C2De2'))

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.3.py
['A', 'B', 'C', 'De', '']

Process finished with exit code 0
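
The maxsplit parameter described above is not exercised by the example, so here is a small sketch of it. The second half goes slightly beyond the text: it relies on the documented behavior that a capturing group in the pattern keeps the separators in the result.

```python
import re

text = 'A1B2C2De2'

# maxsplit=2 stops after two splits; the rest of the string stays intact.
limited = re.split(r'\d+', text, maxsplit=2)

# If the pattern contains a capturing group, the separators are kept.
with_separators = re.split(r'(\d+)', text)

print(limited)           # ['A', 'B', 'C2De2']
print(with_separators)   # ['A', '1', 'B', '2', 'C', '2', 'De', '2', '']
```

Passing maxsplit by keyword avoids confusing it with flags, since both are integers.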

4. re.findall(pattern, string[, flags])

Searches the whole string and returns all matching substrings as a list.

# coding: utf-8
import re

pattern = re.compile(r'\d+')

# Collect every run of digits in the string
print(re.findall(pattern, 'A1B2C2De2'))

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.3.py
['1', '2', '2', '2']

Process finished with exit code 0
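
One detail the example does not show is that the shape of the returned list depends on how many groups the pattern contains. The patterns below are assumptions written for this sketch.

```python
import re

text = 'A1B2C2De2'

# No group: each item is the whole match.
whole = re.findall(r'[A-Za-z]+\d', text)

# One group: each item is what that group matched.
one_group = re.findall(r'[A-Za-z]+(\d)', text)

# Several groups: each item is a tuple with one entry per group.
pairs = re.findall(r'([A-Za-z]+)(\d)', text)

print(whole)       # ['A1', 'B2', 'C2', 'De2']
print(one_group)   # ['1', '2', '2', '2']
print(pairs)       # [('A', '1'), ('B', '2'), ('C', '2'), ('De', '2')]
```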

5. re.finditer(pattern, string[, flags])

Searches the whole string and returns an iterator over all the Match objects.

# coding: utf-8
import re

pattern = re.compile(r'\d+')

# finditer yields one Match object per match
matchiter = re.finditer(pattern, 'A1B2C2De2')
for match in matchiter:
    print(match.group())

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.3.py
1
2
2
2

Process finished with exit code 0
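
Since finditer yields Match objects rather than strings, position information comes along for free. A minimal sketch on the same input, with found as an illustrative name:

```python
import re

pattern = re.compile(r'\d+')

# Each item is a Match object, so positions are available as well as text.
found = [(m.group(), m.start(), m.end())
         for m in pattern.finditer('A1B2C2De2')]

print(found)   # [('1', 1, 2), ('2', 3, 4), ('2', 5, 6), ('2', 8, 9)]
```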

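6. re.sub(pattern, repl, string[, count, flags])

Replaces every match in string with repl and returns the resulting string. When repl is a string, it can refer to groups with \id, \g<id>, or \g<name>; \0 cannot be used, although \g<0> refers to the entire match. When repl is a function, it should accept exactly one argument (a Match object) and return the string to use as the replacement (group references in the returned string are not expanded). count sets the maximum number of replacements; if it is not given, every match is replaced.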

# coding: utf-8
import re

p = re.compile(r'(?P<word1>\w+) (?P<word2>\w+)')  # named groups
s = 'i say,hello world!'
print(p.sub(r'\g<word2> \g<word1>', s))  # repl is a string; groups referenced by name

p = re.compile(r'(\w+) (\w+)')  # numbered groups
print(p.sub(r'\2 \1', s))  # repl is a string; groups referenced by number

def func(m):
    # Capitalize both words of the match
    return m.group(1).title() + ' ' + m.group(2).title()

print(p.sub(func, s))  # repl is a function

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.6.py
say i,world hello!
say i,world hello!
I Say,Hello World!

Process finished with exit code 0
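
Three points from the description are not covered by the example: the count parameter, a reference to the whole match, and the fact that a function's return value is not scanned for group references. The sketch below illustrates them on the same string; the replacement templates are assumptions chosen for this purpose.

```python
import re

s = 'i say,hello world!'
p = re.compile(r'(\w+) (\w+)')

# count=1 replaces only the first match.
first_only = p.sub(r'\2 \1', s, count=1)

# \g<0> stands for the whole match.
bracketed = p.sub(r'[\g<0>]', s)

# A function's return value is inserted literally: the \1 below is not
# treated as a group reference.
literal = p.sub(lambda m: r'\1', s)

print(first_only)   # say i,hello world!
print(bracketed)    # [i say],[hello world]!
print(literal)      # \1,\1!
```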

7. re.subn(pattern, repl, string[, count, flags])

Returns the tuple (sub(pattern, repl, string[, count, flags]), number of substitutions made).

# coding: utf-8
import re

p = re.compile(r'(?P<word1>\w+) (?P<word2>\w+)')  # named groups
s = 'i say,hello world!'
print(p.subn(r'\g<word2> \g<word1>', s))  # repl is a string; groups referenced by name

p = re.compile(r'(\w+) (\w+)')  # numbered groups
print(p.subn(r'\2 \1', s))  # repl is a string; groups referenced by number

def func(m):
    # Capitalize both words of the match
    return m.group(1).title() + ' ' + m.group(2).title()

print(p.subn(func, s))  # repl is a function

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.6.py
('say i,world hello!', 2)
('say i,world hello!', 2)
('I Say,Hello World!', 2)

Process finished with exit code 0
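
In practice the returned tuple is usually unpacked into two names. A small sketch, with the '#' replacement and the second sample string assumed for illustration:

```python
import re

p = re.compile(r'\d+')

# subn returns a tuple: (new_string, number_of_substitutions).
new_text, count = p.subn('#', 'A1B2C2De2')

# With no match, the string comes back unchanged and the count is 0.
unchanged, zero = p.subn('#', 'no digits here')

print(new_text)    # A#B#C#De#
print(count)       # 4
print(unchanged)   # no digits here
print(zero)        # 0
```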

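Attributes and methods of the Match object

Attribute or method	Description
pos	Position in the string where the search started
endpos	Position in the string where the search ended
string	The string that was searched
re	The regular expression object that produced the match
lastindex	Index of the last group that matched
lastgroup	Name of the last group that matched
group(index=0)	Match result of one group; if index is 0, the match of the entire regular expression
groups()	Match results of all groups, returned together as a tuple
groupdict()	Dictionary with group names as keys and each named group's match result as values
start([group])	Start position of the group
end([group])	End position of the group
span([group])	Start and end positions of the group
expand(template)	Substitutes the groups' match results into template and returns the resulting string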



from __future__ import print_function  # lets the two-argument prints run on Python 2.7 as well
import re

pattern = re.compile(r'(\w+) (\w+) (?P<word>.*)')
match = pattern.match('I love you!')

# Attributes describing the search itself
print("match.string:", match.string)
print("match.re:", match.re)
print("match.pos:", match.pos)
print("match.endpos:", match.endpos)
print("match.lastindex:", match.lastindex)
print("match.lastgroup:", match.lastgroup)

# Methods that return group contents and positions
print("match.group(1,2):", match.group(1, 2))
print("match.groups():", match.groups())
print("match.groupdict():", match.groupdict())
print("match.start(2):", match.start(2))
print("match.end(2):", match.end(2))
print("match.span(2):", match.span(2))
print(r"match.expand(r'\2 \1 \3'):", match.expand(r'\2 \1 \3'))

Output:

C:\Python27\python.exe F:/python_scrapy/ch04/4.2.2.7.py
match.string: I love you!
match.re: <_sre.SRE_Pattern object at 0x020F7890>
match.pos: 0
match.endpos: 11
match.lastindex: 3
match.lastgroup: word
match.group(1,2): ('I', 'love')
match.groups(): ('I', 'love', 'you!')
match.groupdict(): {'word': 'you!'}
match.start(2): 2
match.end(2): 6
match.span(2): (2, 6)
match.expand(r'\2 \1 \3'): love I you!

Process finished with exit code 0
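
In the example above pos is 0 because the match started at the beginning of the string. The sketch below, with a made-up date pattern and sentence, shows pos taking another value and shows that span and group also accept group names.

```python
import re

pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})')

# The second argument of Pattern.search is pos: where the search starts.
match = pattern.search('from 2019-05 to 2020-11', 12)

print(match.groupdict() == {'year': '2020', 'month': '11'})   # True
print(match.group('month'))      # 11
print(match.span('year'))        # (16, 20)
print(match.pos)                 # 12
print(match.lastgroup)           # month
```

Starting at position 12 skips the first date, so the match describes the second one while its positions still refer to the full string.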
