Python_Regular Expressions (Part 2)

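import re

'''
The sub(repl, string[, count=0]) and subn(repl, string[, count=0]) methods of a compiled regular expression object perform string replacement. sub() returns the new string; subn() returns a tuple of the new string and the number of replacements made.
'''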

example = '''Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than nested.
Sparse is better than dense.
Readability counts.
'''

pattern = re.compile(r'\bb\w*\b', re.I)  # compiled pattern: words starting with b or B
print(pattern.sub('*', example))  # replace every matching word with *
# * is * than ugly.
# Explicit is * than implicit.
# Simple is * than complex.
# Complex is * than nested.
# Sparse is * than dense.
# Readability counts.

print(pattern.sub('*', example, 1))  # replace only the first match
# * is better than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than nested.
# Sparse is better than dense.
# Readability counts.

pattern = re.compile(r'\bb\w*\b')  # case-sensitive: words starting with lowercase b
print(pattern.sub('*', example, 1))  # replace only the first match
# Beautiful is * than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# Complex is better than nested.
# Sparse is better than dense.
# Readability counts.
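The note at the top of this section also names subn(), which the listing itself never calls. The following is a minimal sketch, assuming a shortened one-line sample string (`text`) and a hypothetical helper function (`shout`), neither of which appears in the original. It shows that subn() returns the replacement count alongside the new string, and that repl can be a function instead of a string.

```python
import re

# Shortened sample text (an assumption for illustration, not the original example).
text = 'Beautiful is better than ugly.'
pattern = re.compile(r'\bb\w*\b', re.I)  # words starting with b or B

# subn() returns a (new_string, number_of_replacements) tuple.
new_text, count = pattern.subn('*', text)
print(new_text, count)

# repl may also be a function that receives each match object.
def shout(match):
    """Hypothetical helper: upper-case the matched word."""
    return match.group(0).upper()

upper_text = pattern.sub(shout, text)
print(upper_text)
```

A callable repl is useful when the replacement depends on what was matched, which a fixed replacement string cannot express.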

'''
The split(string[, maxsplit=0]) method of a compiled regular expression object splits a string at every match of the pattern.
'''

example = r'one,two,three.four/five\six?seven[eight]nine|ten'
pattern = re.compile(r'[,./\\?[\]\|]')  # a character class listing several possible separators
print(pattern.split(example))
# ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

example = r'one1two2three3four4five5six6seven7eight8nine9ten'
pattern = re.compile(r'\d+')  # one or more digits act as the separator
print(pattern.split(example))
# ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']

example = r'one two     three   four,five.six.seven,eight,nine9ten'
pattern = re.compile(r'[\s,.\d]+')  # allow a run of repeated separators
print(pattern.split(example))
# ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine', 'ten']
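The maxsplit parameter in the signature above is not exercised by the listing. Here is a small sketch, using a made-up sample string (`text`) that is not from the original, of how maxsplit limits the number of splits and how wrapping the separator in a capturing group keeps the separators in the result.

```python
import re

# Made-up sample string for illustration.
text = 'one,two;three four'
separators = re.compile(r'[,; ]')

# maxsplit=1: split at the first separator only; the rest stays in one piece.
limited = separators.split(text, maxsplit=1)
print(limited)

# A capturing group around the separator makes split() return the separators too.
with_separators = re.split(r'([,; ])', text)
print(with_separators)
```

Keeping the separators is handy when the string has to be reassembled later with `''.join(...)`.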

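'''
Match objects:
    When match() or search() succeeds, whether called on the re module or on a compiled regular expression object, it returns a match object. The main methods of a match object are group() (returns the content of one or more subpatterns), groups() (returns a tuple containing the content of all matched subpatterns), groupdict() (returns a dict containing the content of all named subpatterns), start() (returns the start index of the specified subpattern), end() (returns the end index of the specified subpattern, that is, the index just past its last character), and span() (returns a tuple of the start and end indices of the specified subpattern). The code below removes specified content from a string in several different ways:
'''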

email = 'tony@tiremove_thisger.net'
m = re.search('remove_this', email)  # search() returns a match object
print(email[:m.start()] + email[m.end():])  # string slicing around the match
# tony@tiger.net
print(re.sub('remove_this', '', email))  # call the re module's sub() directly
# tony@tiger.net
print(email.replace('remove_this', ''))  # the plain string replace() method also works
# tony@tiger.net
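To make the index methods described above concrete, here is a short sketch on the same email string. The `is not None` guard and the second search string (`'not_present'`) are additions for illustration: search() returns None when nothing matches, so calling start() on the result without a check would raise AttributeError.

```python
import re

email = 'tony@tiremove_thisger.net'
m = re.search('remove_this', email)

if m is not None:
    start, end = m.span()   # the same values as m.start() and m.end()
    # end is the index just past the match, so email[end:] is what follows it.
    cleaned = email[:start] + email[end:]
    print(start, end, cleaned)

# A failed search returns None rather than a match object.
missing = re.search('not_present', email)
print(missing)
```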

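m = re.match(r"(\w+) (\w+)", "Isaac Newton,physicist")  # two words separated by a space
print(m.group(0))  # the whole match
# Isaac Newton
print(m.group(1))  # the first subpattern
# Isaac
print(m.group(2))  # the second subpattern
# Newton
print(m.group(1, 2))  # several subpatterns at once, as a tuple
# ('Isaac', 'Newton')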

'''
The following code demonstrates the extension syntax for subpatterns.
'''

m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
print(m.group('first_name'))  # access a subpattern by name
# Malcolm
print(m.group('last_name'))
# Reynolds

m = re.match(r'(\d+)\.(\d+)', '24.1632')
print(m.groups())  # all matched subpatterns (group 0 is not included)
# ('24', '1632')

m = re.match(r'(?P<first_name>\w+) (?P<last_name>\w+)', 'Malcolm Reynolds')
print(m.groupdict())  # the named subpatterns as a dict
# {'first_name': 'Malcolm', 'last_name': 'Reynolds'}
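Named subpatterns can also be referenced from a replacement string. The sketch below assumes the two names are separated by a single space; the variable names (`name_pattern`, `swapped`) are illustrative. It shows that a named group stays reachable by its number, and that `\g<name>` inserts it in sub().

```python
import re

# Assumes "first last" with exactly one space between the two words.
name_pattern = re.compile(r'(?P<first_name>\w+) (?P<last_name>\w+)')
m = name_pattern.match('Malcolm Reynolds')

by_name = m.group('last_name')   # access by name
by_index = m.group(2)            # the same group, accessed by position
print(by_name, by_index)

# \g<name> refers to a named group inside the replacement string.
swapped = name_pattern.sub(r'\g<last_name>, \g<first_name>', 'Malcolm Reynolds')
print(swapped)
```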

exampleString = '''There should be one-and preferably only one-obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
'''

pattern = re.compile(r'(?<=\w\s)never(?=\s\w)')  # find 'never' that is neither at the start nor at the end of a sentence
matchResult = pattern.search(exampleString)
print(matchResult.span())
# (168, 173)

pattern = re.compile(r'(?<=\w\s)never')  # 'never' preceded by a word and whitespace; the first hit is the one that ends a sentence
matchResult = pattern.search(exampleString)
print(matchResult.span())
# (152, 157)

pattern = re.compile(r'(?:is\s)better(\sthan)')  # find 'better than' preceded by 'is'
matchResult = pattern.search(exampleString)
print(matchResult.span())
# (137, 151)
print(matchResult.group(0))
# is better than
print(matchResult.group(1))  # group 1 starts with the whitespace matched by \s
#  than

pattern = re.compile(r'(?i)\bn\w+\b')  # all words starting with n or N; the inline flag must come first in the pattern
index = 0
while True:
    matchResult = pattern.search(exampleString, index)
    if not matchResult:
        break
    print(matchResult.group(0), ':', matchResult.span(0))
    index = matchResult.end(0)  # resume searching right after this match
# not : (88, 91)
# Now : (133, 136)
# never : (152, 157)
# never : (168, 173)
# now : (201, 204)

pattern = re.compile(r'(?<!not\s)be\b')  # find the word 'be' that is not preceded by 'not'
index = 0
while True:
    matchResult = pattern.search(exampleString, index)
    if not matchResult:
        break
    print(matchResult.group(0), ':', matchResult.span(0))
    index = matchResult.end(0)
# be : (13, 15)
print(exampleString[13:20])  # check that the result is accurate
# be one-
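The patterns above use lookahead and lookbehind assertions, which check the text around a match without consuming it. As a compact side-by-side sketch of the four forms, using a made-up sample string (`text`) that is not from the original:

```python
import re

# Made-up sample string for illustration.
text = 'price $30, weight 30kg, rate 15%'

# (?<=...) positive lookbehind: digits preceded by a dollar sign.
after_dollar = re.findall(r'(?<=\$)\d+', text)

# (?=...) positive lookahead: digits followed by 'kg'.
before_kg = re.findall(r'\d+(?=kg)', text)

# (?!...) negative lookahead: whole numbers not followed by a percent sign.
not_percent = re.findall(r'\b\d+\b(?!%)', text)

# (?<!...) negative lookbehind: numbers not preceded by a dollar sign.
not_dollar = re.findall(r'(?<!\$)\b\d+', text)

print(after_dollar, before_kg, not_percent, not_dollar)
```

In every case only the digits end up in the result; the surrounding `$`, `kg` and `%` are tested but never become part of the match.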

pattern = re.compile(r'(\b\w*)(?P<f>\w+)(?P=f)\w*\b')  # match words that contain consecutive identical letters
index = 0
while True:
    matchResult = pattern.search(exampleString, index)
    if not matchResult:
        break
    print(matchResult.group(0), ':', matchResult.group(2))  # the word and its repeated letter
    index = matchResult.end(0) + 1
# unless : s
# better : t
# better : t

s = 'aaa    bb   c  d  e  fff'
print(s)
# aaa    bb   c  d  e  fff
p = re.compile(r'(\b\w*(?P<f>\w+)(?P=f)\w*\b)')
print(p.findall(s))  # each tuple holds the whole word and the repeated letter
# [('aaa', 'a'), ('bb', 'b'), ('fff', 'f')]
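The `(?P=f)` construct above is a named backreference: it matches the same text that the group named `f` matched. The numbered form `\1` works the same way. Below is a minimal sketch, with a made-up sample sentence (`text`) that is not from the original, that finds accidentally doubled words and collapses them.

```python
import re

# Made-up sample sentence for illustration.
text = 'this is is a test test case'

# \1 must match exactly the text captured by group 1, so this finds "word word".
doubled = re.compile(r'\b(\w+) \1\b')

repeated_words = doubled.findall(text)   # findall returns the content of group 1
print(repeated_words)

# In the replacement string, \1 inserts the captured word once.
collapsed = doubled.sub(r'\1', text)
print(collapsed)
```

The leading `\b` keeps the pattern from matching inside a word, so the `is` at the end of `this` is not treated as the first half of a doubled word.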
