What approaches are available for splitting, matching, and searching strings?

Summary of splitting methods

1. str.split()

* Splits a string

* Returns a list

s1='I  love  python'

# By default the delimiter is whitespace, and a run of consecutive whitespace counts as a single separator

print(s1.split())

['I', 'love', 'python']

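# (s1 has two spaces between words) If a single space is given explicitly as the delimiter, each extra space produces an empty string in the result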

print(s1.split(' '))

['I', '', 'love', '', 'python']

# Any character or string can be used as the delimiter

print(s1.split('o'))

['I  l', 've  pyth', 'n']

# maxsplit=n performs at most n splits

print(s1.split(maxsplit=1))

['I', 'love  python']

2. re.split()

* Allows several delimiters to be defined at once

import re

line = 'asdf fjdk; afed, fjek,asdf, foo'

# Several characters can serve as delimiters

print(re.split(r'[;,\s]\s*',line))

['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']

# Parentheses create a capturing group, so the matched delimiters also appear in the result

print(re.split(r'(;|,|\s)\s*',line))

['asdf', ' ', 'fjdk', ';', 'afed', ',', 'fjek', ',', 'asdf', ',', 'foo']

# (?:...) marks a non-capturing group, so the delimiters are dropped again

print(re.split(r'(?:,|;|\s)\s*',line))

['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']
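One reason to keep the delimiters with a capturing group is to rebuild the string afterwards. The following is a minimal sketch of that idea, reusing the `line` value from above; the variable names `values`, `delimiters`, and `rebuilt` are chosen here for illustration only.

```python
import re

line = 'asdf fjdk; afed, fjek,asdf, foo'

# With a capturing group, re.split() alternates values and delimiters.
fields = re.split(r'(;|,|\s)\s*', line)

values = fields[::2]               # even positions: the split-out values
delimiters = fields[1::2] + ['']   # odd positions: delimiters, padded to equal length

# Stitch each value back together with the delimiter that followed it.
rebuilt = ''.join(v + d for v, d in zip(values, delimiters))

print(values)
print(rebuilt)
```

Note that the whitespace consumed by the trailing `\s*` is outside the group, so it is not captured and does not come back in the rebuilt string.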

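Summary of searching and matching methods

1. str.startswith() | str.endswith()

* Match at the start / end of a string
* Return True/False
* Commonly used to check whether a directory contains files of a given type, or to check the scheme of a URL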

url="http://www.python.org"

# startswith('string') checks whether the text begins with string

print(url.startswith('http'))

True

# endswith('string') checks whether the text ends with string

print(url.endswith('com'))

False

# startswith('string',n,m) and endswith('string',n,m) restrict the check to the slice [n:m]

print(url.endswith('n',11,17))

True

# Note: to test several candidates at once, pass them to startswith/endswith as a tuple; a list is not accepted

choices=['http:','ftp:']

print(url.startswith(choices))

TypeError: startswith first arg must be str or a tuple of str, not list

print(url.startswith(tuple(choices)))

True

# endswith() is handy for checking whether a directory contains files with a given extension

import os

filenames=os.listdir('/test')

#Example-1

print(filenames)

['aa', 'zhuabao', '.python-version', 'test.sh', 'hh.c', '.test.py.swp', 'zhuabao2', 'abc', 'linshi.sh']

print([candsh for candsh in filenames if candsh.endswith(('.sh','.c'))])

['test.sh', 'hh.c', 'linshi.sh']

#Example-2

if any(name.endswith(('.sh','.c')) for name in os.listdir('/test')):

    print('have')

have

2. fnmatch() | fnmatchcase()

* Match using shell-style wildcards
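The original gives no example for this item, so here is a minimal sketch using the standard `fnmatch` module. The file names in `names` are sample values made up for illustration. `fnmatchcase()` is used because it always compares case-sensitively, whereas `fnmatch()` follows the operating system's case rules and may behave differently on Windows.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
names = ['test.sh', 'hh.c', 'linshi.sh', 'Dat45.csv', 'notes.TXT']

# * matches any run of characters, [0-9] matches a single digit.
print(fnmatchcase('test.sh', '*.sh'))
print(fnmatchcase('notes.TXT', '*.txt'))
print(fnmatchcase('Dat45.csv', 'Dat[0-9]*'))

# Filter a list of names with a wildcard pattern.
sh_files = [n for n in names if fnmatchcase(n, '*.sh')]
print(sh_files)
```

Wildcard patterns sit between plain string methods and full regular expressions: they are easier to read than a regex when the only need is "any characters here".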

3. str.find()

* Returns the index of the first occurrence, or -1 if the substring is not found
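As a short illustration of `str.find()`, the sketch below reuses the `url` value from the startswith/endswith examples. It also shows the optional start argument and `str.rfind()`, which searches from the right.

```python
url = 'http://www.python.org'

first = url.find('python')     # index where 'python' begins
missing = url.find('java')     # -1 means "not found"; no exception is raised
first_o = url.find('o')        # first 'o' in the string
next_o = url.find('o', first_o + 1)  # resume the search after the first hit
last_o = url.rfind('o')        # search from the right-hand side

print(first, missing, first_o, next_o, last_o)
```

Because -1 is also a valid negative index, it is safer to test the result explicitly (`if first != -1:`) before slicing with it.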

4. re.match(r'')

* Matches using a regular expression

* Only checks at the beginning of the string

5. re.findall(r'')

* Finds matches starting at any position
* Returns the results as a list

6. re.finditer(r'')

* Returns the results as an iterator

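7. r' $' -> a regular expression that ends with $

* Ensures the match extends to the end of the string

8. re.compile(r'') -> compile the regular expression first

* Useful when performing many matching or searching operations with the same pattern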

import re

text1='2017/07/26'

text2='Nov 27,2012'

text3='Today is 11/27/2012. PyCon starts 3/13/2013.'

text5='26/07/2017 is today,PyCon starts 3/13/2013.'

# Compile a regular expression that matches dates written as digits/digits/digits, such as m/d/y

datepat=re.compile(r'\d+/\d+/\d+')

# match('string') tries to match the pattern at the beginning of string

print(datepat.match(text1))

<_sre.SRE_Match object; span=(0, 10), match='2017/07/26'>

print(datepat.match(text2))

None

# As these results show, match() only tries at the start of the string, so it can only find a match located there

print(datepat.match(text3))

None

print(datepat.match(text5))

<_sre.SRE_Match object; span=(0, 10), match='26/07/2017'>

# This is not always the result we want. One option is to append $ to the pattern so that the match must run to the end of the string

text6='26/07/2017abcdef'

datepat1=re.compile(r'\d+/\d+/\d+')

print(datepat1.match(text6))

<_sre.SRE_Match object; span=(0, 10), match='26/07/2017'>

datepat2=re.compile(r'\d+/\d+/\d+$')

print(datepat2.match(text6))

None

# Another option is findall('string'), which searches the whole string and returns every match

print(datepat.findall(text3))

['11/27/2012', '3/13/2013']

# findall() returns a list; finditer() returns an iterator of match objects

for m in datepat.finditer(text5):

    print(m.group())  # datepat has no capturing groups yet, so m.groups() would be an empty tuple
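What finditer() offers over findall() is the full match object for each hit. The sketch below is one way to use it, assuming the three-group version of the date pattern; `results` is a name introduced here for illustration.

```python
import re

text5 = '26/07/2017 is today,PyCon starts 3/13/2013.'

# Three capturing groups, so each match object exposes the parts separately.
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

results = []
for m in datepat.finditer(text5):
    # span() gives the (start, end) position; groups() gives the captured parts.
    results.append((m.span(), m.groups()))
    print(m.span(), m.group(0), m.groups())
```

Since finditer() yields matches lazily, it avoids building the whole result list up front, which can matter for long texts.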

# # Capturing groups # #

datepat=re.compile(r'(\d+)/(\d+)/(\d+)')

m=datepat.match(text1)

print(m.group(0))

2017/07/26

print(m.group(1))

print(m.group(2))

print(m.group(3))

print(m.groups())

('2017', '07', '26')

for month,day,year in datepat.findall(text3):

    print('{}-{}-{}'.format(year,month,day))

2012-11-27
2013-3-13

9. The ? modifier

* Turns a greedy match into a non-greedy one

* This gives the shortest possible match

text6 = 'Computer says "no." Phone says "yes."'

pat1=re.compile(r'\"(.*)\"')  # match the text enclosed in double quotes

print(pat1.findall(text6))

['no." Phone says "yes.']

pat2=re.compile(r'\"(.*?)\"') # add the ? modifier

print(pat2.findall(text6))

['no.', 'yes.']

10. (?:.|\n) | re.DOTALL

* Lets the dot (.) match every character, including newlines

* This makes matches that span multiple lines possible

text7=''' /*this is a
multiline comment*/
'''


pat1=re.compile(r'/\*(.*?)\*/')

print(pat1.findall(text7))

[]                                      # Nothing matched, because the dot (.) does not match newline characters

pat2=re.compile(r'/\*((?:.|\n)*?)\*/')  # replace (.) with (?:.|\n)

print(pat2.findall(text7))

['this is a\nmultiline comment']

# re.DOTALL makes the dot (.) in a regular expression match any character, including newlines

pat3=re.compile(r'/\*(.*?)\*/',re.DOTALL)

print(pat3.findall(text7))

['this is a\nmultiline comment']

Summary of search-and-replace methods

1. str.replace()

# S.replace(old, new[, count]) -> str

text5="a b c d e e e"

print(text5.replace("e","a"))

# a b c d a a a

print(text5.replace("e","a",2))

# a b c d a a e

2. re.sub() | re.sub(flags=re.IGNORECASE)

* Match and replace | case-insensitive matching

# sub(pattern, repl, string, count=0, flags=0)

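# 1st argument: the pattern to match

# 2nd argument: the replacement

# 3rd argument: the text to process

# 4th argument: the maximum number of replacements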

text1="l o v e"

print(re.sub(r'\s','-',text1))

# l-o-v-e

print(re.sub(r'\s','-',text1,count=1))

# l-o v e

# flags=re.IGNORECASE ignores case when matching

text3 = 'UPPER PYTHON, lower python, Mixed Python'

print(re.sub('python','snake',text3,flags=re.IGNORECASE))

# UPPER snake, lower snake, Mixed snake

# To make the replacement follow the case of the matched text, we need a helper function

def matchcase(word):

    def replace(m):

        text=m.group()

        if text.isupper():

            return word.upper()

        elif text.islower():

            return word.lower()

        elif text[0].isupper():

            return word.capitalize()

        else:

            return word

    return replace

print(re.sub('python',matchcase('snake'),text3,flags=re.IGNORECASE))

# UPPER SNAKE, lower snake, Mixed Snake

3. re.compile()

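* Likewise, compile the pattern first when performing many substitutions

# The pattern can be compiled first and can use capturing groups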

text2='Today is 11/27/2012. PyCon starts 3/13/2013.'

datepat=re.compile(r'(\d+)/(\d+)/(\d+)')

print(datepat.sub(r'\3-\1-\2',text2))

# Today is 2012-11-27. PyCon starts 2013-3-13.
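The replacement passed to sub() can also be a function, as in the matchcase() helper earlier. Here is a minimal sketch that applies the same idea to dates, assuming we want the month shown as a name. `change_date` is a hypothetical helper name, and `calendar.month_abbr` depends on the active locale, so the English abbreviations assume the default locale.

```python
import re
from calendar import month_abbr

text2 = 'Today is 11/27/2012. PyCon starts 3/13/2013.'
datepat = re.compile(r'(\d+)/(\d+)/(\d+)')

def change_date(m):
    """Hypothetical callback: turn a month/day/year match into 'day Mon year'."""
    mon_name = month_abbr[int(m.group(1))]   # group 1 is the month number
    return '{} {} {}'.format(m.group(2), mon_name, m.group(3))

# sub() calls change_date once per match and inserts its return value.
result = datepat.sub(change_date, text2)
print(result)
```

A callback is worth the extra lines when the replacement cannot be expressed with backreferences such as `\3-\1-\2` alone.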

4. re.subn()

* Also returns the number of substitutions made

# subn() returns the new text together with a count of the substitutions made

newtext,n=datepat.subn(r'\3-\1-\2',text2)

print(newtext)

# Today is 2012-11-27. PyCon starts 2013-3-13.

print(n)

# 2

[PY3] Summary of string splitting, matching, and searching methods