[Python Study Notes] String Processing Tips (continuously updated)

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

>>File: 字符串处理.py

>>Author: liu yang

>>Email: liuyang0001@outlook.com

>>Blog: www.cnblogs.com/liu66blog

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import sys, os

# 1. Concatenating and joining strings

# Concatenation: two strings can be joined directly with '+'

str1='123'

str2='456'

str3=str1+str2

print(str3)

# -----------Output----------------------

# 123456

# ------------------------------------

# Joining: use the join method to combine a list of strings with a separator

url=['www','cnblog','com/liu66blog']

print('.'.join(url))

# -----------Output----------------------

# www.cnblog.com/liu66blog

# ------------------------------------
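
# A minimal sketch, not part of the original notes: join only accepts strings,
# so a list that contains other types has to be converted first. The sample
# list `parts` below is hypothetical.

```python
# Hypothetical sample data: a mix of strings and integers.
parts = ['192', 168, 0, 1]

# '.'.join(parts) would raise TypeError because 168 is an int, not a str,
# so convert every item to str first.
address = '.'.join(str(p) for p in parts)
print(address)
```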

# 2. Repeating and slicing strings

# Repetition: multiplying a string by an integer repeats it, which makes a separator line easy to build

Separator='*'*30

print(Separator)

# -----------Output----------------------

# ******************************

# ------------------------------------

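# Slicing

url='www.cnblogs.com/liu66blog'

# Characters at indices 0 through 15 (the end index 16 is excluded)

print(url[0:16])

# From index 16 to the end

print(url[16:])

# The last four characters

print(url[-4:])

# A full slice returns the whole string

print(url[::])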

# -----------Output----------------------

# www.cnblogs.com/

# liu66blog

# blog

# www.cnblogs.com/liu66blog

# ------------------------------------
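
# A short sketch of two slicing features the examples above do not cover:
# the optional step value, and the way out-of-range bounds are clipped.
# The variable names are illustrative only.

```python
url = 'www.cnblogs.com/liu66blog'

# A third slice value is the step: take every second character.
every_other = url[::2]

# A negative step walks backwards, which reverses the string.
reversed_url = url[::-1]

# Out-of-range slice bounds are clipped instead of raising IndexError.
tail = url[20:100]

print(every_other)
print(reversed_url)
print(tail)
```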

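# 3. Splitting strings

# Simple splitting with split

# str.split handles only simple cases: it splits on a single fixed separator and cannot take several alternative delimiters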

url='www.cnblogs.com/liu66blog'

url_list=url.split('.')

print(url_list)

# -----------Output----------------------

# ['www', 'cnblogs', 'com/liu66blog']

# ------------------------------------
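
# A small sketch of a few split variations that may be useful alongside the
# example above: split() with no argument, the maxsplit parameter, and
# str.partition. The string `line` is made-up sample data.

```python
line = '  alpha   beta gamma  '

# With no argument, split() splits on runs of whitespace and drops empty strings.
words = line.split()

# With an explicit ' ' separator every single space is a split point,
# so consecutive spaces produce empty strings.
pieces = line.split(' ')

url = 'www.cnblogs.com/liu66blog'

# maxsplit=1 splits only at the first '.'.
first, rest = url.split('.', 1)

# partition splits once and also returns the separator itself.
host, sep, path = url.partition('/')

print(words)
print(pieces)
print(first, rest)
print(host, path)
```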

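# Splitting on several delimiters

# The r prefix makes a raw string, so backslashes are not treated as escapes. The pattern splits on '.', ';' or '/', each followed by zero or more whitespace characters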

url='www.cnblogs.com/liu66blog'

import re

url_list=re.split(r'[.;/]\s*',url)

print(url_list)

# -----------Output----------------------

# ['www', 'cnblogs', 'com', 'liu66blog']

# ------------------------------------
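
# A sketch showing what the trailing \s* in the pattern does. The input
# string `record` is hypothetical and has stray spaces around the delimiters.

```python
import re

# Hypothetical input with stray spaces around the delimiters.
record = 'www. cnblogs;com /  liu66blog'

# The pattern from the notes only absorbs whitespace AFTER a delimiter,
# so the space before '/' stays attached to 'com'.
after_only = re.split(r'[.;/]\s*', record)

# Adding \s* in front as well absorbs whitespace on both sides.
both_sides = re.split(r'\s*[.;/]\s*', record)

print(after_only)
print(both_sides)
```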

# 4. Checking the start and end of a string

# For example, to check whether a name starts or ends with a given string

url='www.cnblogs.com/liu66blog'

result=url.endswith('blog')

print(result)

result=url.startswith('ww.')

print(result)

# -----------Output----------------------

# True

# False

# ------------------------------------
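
# A minimal sketch: startswith and endswith also accept a tuple, which checks
# several candidates in one call. The file names below are hypothetical.

```python
# Hypothetical file names.
filenames = ['notes.txt', 'photo.JPG', 'script.py', 'README']

# endswith returns True if any suffix in the tuple matches.
# lower() makes the check case-insensitive.
selected = [name for name in filenames if name.lower().endswith(('.txt', '.jpg'))]

# startswith works the same way with a tuple of prefixes.
url = 'www.cnblogs.com/liu66blog'
has_known_prefix = url.startswith(('http://', 'https://', 'www.'))

print(selected)
print(has_known_prefix)
```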

# 5. Searching and matching

# Simple search

# find looks for a substring inside a longer string and returns the index where it first occurs, or -1 if it is not found

url='www.cnblogs.com/liu66blog'

result=url.find('liu66')

print(result)

result=url.find('liuyang')

print(result)

# -----------Output----------------------

# 16

# -1

# ------------------------------------
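
# A sketch of some close relatives of find: the `in` operator when only a
# yes/no answer is needed, rfind for the last occurrence, and index, which
# raises ValueError instead of returning -1.

```python
url = 'www.cnblogs.com/liu66blog'

# `in` answers "is it there?" without returning a position.
contains = 'liu66' in url

# find returns the first occurrence, rfind the last one.
first_blog = url.find('blog')    # inside 'cnblogs'
last_blog = url.rfind('blog')    # the trailing 'blog'

# index behaves like find but raises ValueError when nothing is found.
index_failed = False
try:
    url.index('liuyang')
except ValueError:
    index_failed = True

print(contains, first_blog, last_blog, index_failed)
```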

# Pattern-based search (re.match only matches at the start of the string)

data_str='2018/2/22'

result=re.match(r'\d+/\d+/\d+',data_str)

if result:

    print('ok, match found')

# -----------Output----------------------

# ok, match found

# ------------------------------------
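
# A sketch contrasting re.match with re.search and re.fullmatch, and showing
# how capture groups pull out the matched numbers. The string `text` is
# made-up sample data.

```python
import re

text = 'released on 2018/2/22'

# re.match anchors at the start of the string, so this finds nothing.
at_start = re.match(r'\d+/\d+/\d+', text)

# re.search scans the whole string; parentheses capture each number.
found = re.search(r'(\d+)/(\d+)/(\d+)', text)
year, month, day = (int(g) for g in found.groups())

# re.fullmatch requires the entire string to match the pattern.
whole = re.fullmatch(r'\d+/\d+/\d+', '2018/2/22x')

print(at_start, (year, month, day), whole)
```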

# 6. Replacing text

# Simple replacement: str.replace is enough

url='www.cnblogs.com/liu66blog'

url_new=url.replace('www.','')

print(url_new)

# -----------Output----------------------

# cnblogs.com/liu66blog

# ------------------------------------

# Pattern-based replacement with re.sub

url='www.cnblogs.com/liu66blog'

url_new=re.sub(r'\d\d','00',url)

print(url_new)

# -----------Output----------------------

# www.cnblogs.com/liu00blog

# ------------------------------------
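
# A sketch of two further re.sub features: backreferences in the replacement
# string, and a function as the replacement. The date string reuses the
# format from section 5; the variable names are illustrative.

```python
import re

date_str = '2018/2/22'

# \1, \2, \3 refer to the captured groups, so only the separators change.
dashed = re.sub(r'(\d+)/(\d+)/(\d+)', r'\1-\2-\3', date_str)

# The replacement can also be a function that receives each match object;
# here every number is zero-padded to at least two digits.
padded = re.sub(r'\d+', lambda m: m.group().zfill(2), date_str)

print(dashed)
print(padded)
```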

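# 7. Removing characters from a string

# Stripping whitespace: when processing text, e.g. lines read from a file, you often need to remove the spaces, tabs and newlines on both sides of each line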

url='  www.cnblogs.com/liu66blog  '

url_new=url.strip()

print(url_new)

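# More involved cleanup can use str.translate.

# First build a translation table; here the table maps 'w' to uppercase 'W',

# and url.translate(table) applies that mapping to every character of the string.

# As of Python 3.4, string.maketrans() is no longer available; use the static methods

# bytearray.maketrans(), bytes.maketrans() and str.maketrans() instead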

url='www.cnblogs.com/liu66blog'

# Build the translation table

instr='w'

outstr='W'

table=str.maketrans(instr,outstr)

url_new=url.translate(table)

print(url_new)

# -----------Output----------------------

# www.cnblogs.com/liu66blog

# WWW.cnblogs.com/liu66blog

# ------------------------------------
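
# A sketch of deleting characters with translate. str.maketrans takes an
# optional third argument listing characters to delete. Note that it removes
# each listed character wherever it appears, not the substring as a unit.

```python
url = 'www.cnblogs.com/liu66blog'

# Map 'w' to 'W' and delete every 'l', 'i', 'u' and '6' in the string.
table = str.maketrans('w', 'W', 'liu6')
cleaned = url.translate(table)

# A dict works too: map a character to None to delete it.
tidy_table = str.maketrans({'\t': ' ', '\r': None})
tidied = 'a\tb\r\n'.translate(tidy_table)

print(cleaned)
print(repr(tidied))
```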

# 8. Finding the longest words

txt=('Python is a programming language that lets you work more quickly and integrate your systems more effectively. '
     'You can learn to use Python and see almost immediate gains in productivity and lower maintenance costs. '
     'Learn more about Python..')

# Split on spaces

txt_list=txt.split(' ')

# Sort the words by length, longest first, with sorted()

txt_list_new=sorted(txt_list,key=len,reverse=True)

# An empty list to hold the longest words

longest_word=[]

# Collect every word as long as the first one; stop at the first shorter word

for word in txt_list_new:

    if len(word)<len(txt_list_new[0]):

        break

    longest_word.append(word)

print(longest_word)

# -----------Output----------------------

# ['effectively.', 'productivity']

# ------------------------------------
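
# In the output above 'effectively.' counts its trailing period as part of the
# word. A possible variation, sketched here with a shortened hypothetical
# sentence, strips punctuation first and uses max() instead of sorting.

```python
import string

# Shortened, made-up sample sentence.
sample = 'Learn more about Python and productivity, effectively.'

# Strip leading and trailing punctuation from every word.
words = [w.strip(string.punctuation) for w in sample.split()]

# Find the maximum length, then keep every word of that length.
max_len = max(len(w) for w in words)
longest = [w for w in words if len(w) == max_len]

print(longest)
```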

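# 9. Finding words of a given length

len_4_word=filter(lambda x:5>len(x)>=4,txt_list)

# Note: in Python 3, filter returns an iterator rather than a list, so convert it explicitly

len_4_word_list=list(len_4_word)

# Convert to a de-duplicated tuple (set ordering is arbitrary)

len_4_word_tuple=tuple(set(len_4_word_list))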

print(len_4_word_list)

print(len_4_word_tuple)

# -----------Output----------------------

# ['that', 'lets', 'work', 'more', 'your', 'more', 'more']

# ('your', 'more', 'lets', 'that', 'work')

# ------------------------------------
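
# A sketch of an alternative to filter plus set: a list comprehension, and
# dict.fromkeys to drop duplicates while keeping first-seen order (a set
# does not guarantee any order). The sample text is made up.

```python
# Made-up sample text.
txt_sample = 'that lets you work more quickly and more'

# Same selection as the filter call, written as a list comprehension.
four_letter = [w for w in txt_sample.split() if len(w) == 4]

# dict keys are unique and keep insertion order, so this de-duplicates in order.
unique_in_order = list(dict.fromkeys(four_letter))

print(four_letter)
print(unique_in_order)
```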

# 10. Most frequently used words

from collections import Counter

# most_common(x): x is the number of entries to list

print(Counter(txt_list).most_common(6))

# -----------Output----------------------

# [('more', 3), ('and', 3), ('Python', 2), ('is', 1), ('a', 1), ('programming', 1)]

# ------------------------------------
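
# Counting raw tokens treats 'Python' and 'Python..' as different words. One
# possible refinement, sketched here with a hypothetical sentence, normalizes
# case and strips punctuation before counting.

```python
import string
from collections import Counter

# Hypothetical sample sentence.
sample = 'Python is fun. python is fast, and Python.. is everywhere'

# Lower-case each token and strip surrounding punctuation.
tokens = [w.strip(string.punctuation).lower() for w in sample.split()]

counts = Counter(tokens)
top_two = counts.most_common(2)

print(top_two)
```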

# 11. Listing all capitalized words (as tested by str.istitle)

title_words_list=[]

for i in txt_list:

    if i.istitle():

        title_words_list.append(i)

# Build a de-duplicated set

title_words_set=set(title_words_list)

print(title_words_list)

print(title_words_set)

# -----------Output----------------------

# ['Python', 'You', 'Python', 'Learn', 'Python..']

# {'Python..', 'Learn', 'Python', 'You'}

# ------------------------------------
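
# A sketch of how istitle differs from related checks. istitle is stricter
# than "starts with a capital letter", and isupper tests for all-uppercase
# words. The word list is hypothetical.

```python
# Hypothetical sample words.
words = ['Python', 'NASA', 'iPhone', 'You', 'hello']

# istitle: an uppercase letter may only start a run of letters.
titled = [w for w in words if w.istitle()]

# isupper: every cased character is uppercase.
all_caps = [w for w in words if w.isupper()]

# Only the first character is checked here.
starts_upper = [w for w in words if w[:1].isupper()]

print(titled)
print(all_caps)
print(starts_upper)
```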

# 12. To be continued...
