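re.match(pattern, string, flags=0)  Try to match the pattern at the start of the string only.
re.search(pattern, string, flags=0)  Scan the whole string and return the first successful match.
re.sub(pattern, repl, string, count=0, flags=0)  Replace matches in the string with repl; count=0 replaces every occurrence.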
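
A minimal sketch of how the three calls differ. The sample string `text` and the variable names are made up for illustration.

```python
import re

text = "id=42 name=alice id=7"  # hypothetical sample input

# re.match only succeeds if the pattern matches at position 0.
m1 = re.match(r"name=(\w+)", text)   # None: the string starts with "id="
m2 = re.match(r"id=(\d+)", text)     # matches "id=42" at the start

# re.search scans forward and returns the first match anywhere.
m3 = re.search(r"name=(\w+)", text)

# re.sub replaces matches; count limits how many are replaced (0 = all).
all_replaced = re.sub(r"id=\d+", "id=?", text)
first_replaced = re.sub(r"id=\d+", "id=?", text, count=1)

print(m1, m2.group(1), m3.group(1))
print(all_replaced)
print(first_replaced)
```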
 

>>> import re

>>> s='112.90.239.137 112.90.239.137 1526446118 [26/Nov/2015:00:00:47 +0800] 23 "GET /ag/coord/convert?_appName=jiakaobaodianxingui&_appUser=632e76c53b4f3c9ffe90b8c4c61bd5b0&_cityCode=330300&_cityName=%E6%B8%A9%E5%B7%9E&_device=iPhone&_firstTime=2015-10-28%2018%3A49%3A05&_gpsType=baidu&_idfa=D0DD23E5-B407-449B-B005-0B52C6C2CBF3&_idfv=85A23658-2DD4-490D-B8AE-767842401821&_imei=c09b0b9b9759e72eaf0fd6e3eb38e55113d74cdd&_j=1.0&_jail=false&_latitude=27.610026844605&_launch=45&_longitude=120.56419068644&_network=wifi&_openUuid=c09b0b9b9759e72eaf0fd6e3eb38e55113d74cdd&_pkgName=cn.mucang.ios.jiakaobaodianPromise&_platform=iphone&_product=%E9%A9%BE%E8%80%83%E5%AE%9D%E5%85%B8-%E9%A9%BE%E7%85%A7%E8%80%83%E8%AF%95&_productCategory=jiakaobaodian&_renyuan=mucang&_screenDip=2&_screenHeight=1136&_screenWidth=640&_system=iPhone%20OS&_systemVersion=9.0.2&_vendor=appstore&_version=5.9.0&from=0&to=4&x=120.5576965508963&y=27.61254659188421 HTTP/1.1" "api.map.baidu.com" 200 76 gzip:116pct. "-" "BAIDUID=C328D2934E2C6EDF8E185FAC44EB168D:FG=1" "jiakaobaodianPromise/5.9.0 (iPhone; iOS 9.0.2; Scale/2.00)" map apimap 16555290153476373216 10.46.234.22 "9904758605881922946"'
>>> res=re.compile(r"(.*) (.*) (.*) \[(.*)\] (.*) \"(.*)\" \"(.*)\" (.*) (.*) (.*) \"(.*)\" \"(.*)\" \"(.*)\" (.*) (.*) (.*) (.*) \"(.*)\"")
>>> res is None
False
>>> res.search(s).groups()
('112.90.239.137', '112.90.239.137', '1526446118', '26/Nov/2015:00:00:47 +0800', '23', 'GET /ag/coord/convert?_appName=jiakaobaodianxingui&_appUser=632e76c53b4f3c9ffe90b8c4c61bd5b0&_cityCode=330300&_cityName=%E6%B8%A9%E5%B7%9E&_device=iPhone&_firstTime=2015-10-28%2018%3A49%3A05&_gpsType=baidu&_idfa=D0DD23E5-B407-449B-B005-0B52C6C2CBF3&_idfv=85A23658-2DD4-490D-B8AE-767842401821&_imei=c09b0b9b9759e72eaf0fd6e3eb38e55113d74cdd&_j=1.0&_jail=false&_latitude=27.610026844605&_launch=45&_longitude=120.56419068644&_network=wifi&_openUuid=c09b0b9b9759e72eaf0fd6e3eb38e55113d74cdd&_pkgName=cn.mucang.ios.jiakaobaodianPromise&_platform=iphone&_product=%E9%A9%BE%E8%80%83%E5%AE%9D%E5%85%B8-%E9%A9%BE%E7%85%A7%E8%80%83%E8%AF%95&_productCategory=jiakaobaodian&_renyuan=mucang&_screenDip=2&_screenHeight=1136&_screenWidth=640&_system=iPhone%20OS&_systemVersion=9.0.2&_vendor=appstore&_version=5.9.0&from=0&to=4&x=120.5576965508963&y=27.61254659188421 HTTP/1.1', 'api.map.baidu.com', '200', '76', 'gzip:116pct.', '-', 'BAIDUID=C328D2934E2C6EDF8E185FAC44EB168D:FG=1', 'jiakaobaodianPromise/5.9.0 (iPhone; iOS 9.0.2; Scale/2.00)', 'map', 'apimap', '16555290153476373216', '10.46.234.22', '9904758605881922946')
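
The pattern above works because greedy `(.*)` groups backtrack until the whole line fits, but each `.*` is free to swallow spaces and quotes, so a slightly different line can shift fields silently. Here is a sketch of a stricter variant with bounded character classes and named groups. The shortened log line, the field names, and `LOG_RE` are assumptions for illustration, not the original log format.

```python
import re

# Hypothetical, shortened log line in the same spirit as the one above.
line = '10.0.0.1 [26/Nov/2015:00:00:47 +0800] "GET /ping HTTP/1.1" 200 76'

# \S+ cannot cross a space, [^\]]+ stops at the closing bracket and
# [^"]* stops at the closing quote, so each field is bounded explicitly.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

m = LOG_RE.match(line)
fields = m.groupdict() if m else {}
print(fields["ip"], fields["status"], fields["request"])
```

Named groups let later code read `fields["status"]` instead of counting positions in a long tuple.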
 
>>> re.sub('(<b>)|(</b>)', '', s)
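
The log line `s` contains no `<b>` tags, so the call above returns it unchanged. Below is a small sketch of the same substitution on a made-up string (`html` is a hypothetical input), together with a more compact equivalent pattern.

```python
import re

html = "grep is <b>fast</b> and <b>simple</b>"  # hypothetical input

# Same alternation as above; the capture groups are not needed for deletion.
plain = re.sub(r"(<b>)|(</b>)", "", html)

# An equivalent, more compact pattern: an optional slash inside the tag.
plain2 = re.sub(r"</?b>", "", html)

# re.subn also reports how many replacements were made.
_, n = re.subn(r"</?b>", "", html)
print(plain, n)
```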
grep:
  -v, --invert-match        select non-matching lines

  -i, --ignore-case         ignore case distinctions

  -f, --file=FILE           obtain PATTERN from FILE
  -w, --word-regexp         force PATTERN to match only whole words
  -o, --only-matching       show only the part of a line matching PATTERN
 
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -n, --line-number         print line number with output lines
  -H, --with-filename       print the file name for each match
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -a, --text                equivalent to --binary-files=text
  -s, --no-messages         suppress error messages
 
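
To connect the grep options with the `re` module, here is a rough Python analogue of `-i`, `-v`, `-o`, `-n` and `-w`. The helper `grep()` and its keyword names are hypothetical, and it only approximates grep's behaviour (for example, `-w` is emulated with `\b`, which is close to but not identical to grep's whole-word rule).

```python
import re

def grep(pattern, lines, ignore_case=False, invert=False,
         only_matching=False, line_numbers=False, word=False):
    """Rough analogue of grep -i / -v / -o / -n / -w (illustrative only)."""
    if word:
        pattern = r"\b(?:%s)\b" % pattern          # -w: whole words only
    flags = re.IGNORECASE if ignore_case else 0    # -i
    rx = re.compile(pattern, flags)
    out = []
    for num, line in enumerate(lines, start=1):
        matches = [m.group(0) for m in rx.finditer(line)]
        if invert:                                 # -v: keep non-matching lines
            hits = [] if matches else [line]
        elif only_matching:                        # -o: one output per match
            hits = matches
        else:
            hits = [line] if matches else []
        for hit in hits:
            out.append(f"{num}:{hit}" if line_numbers else hit)  # -n
    return out

log = ["GET /index 200", "POST /login 500", "get /static 200"]
print(grep(r"get", log, ignore_case=True, line_numbers=True))
print(grep(r"200", log, invert=True))
print(grep(r"\d{3}", log, only_matching=True))
```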
regexp:
  • . (dot) - a single character.
  • ? - the preceding item (a character, class or group) matches 0 or 1 times only.
  • * - the preceding item matches 0 or more times.
  • + - the preceding item matches 1 or more times.
  • {n} - the preceding item matches exactly n times.
  • {n,m} - the preceding item matches at least n times and not more than m times.
  • [agd] - the character is one of those included within the square brackets.
  • [^agd] - the character is not one of those included within the square brackets.
  • [c-f] - the dash within the square brackets operates as a range. In this case it means either the letters c, d, e or f.
  • () - allows us to group several characters to behave as one.
  • | (pipe symbol) - the logical OR operation.
  • ^ - matches the beginning of the line.
  • $ - matches the end of the line.
  • \s - matches anything which is considered whitespace. This could be a space, tab, line break etc.
  • \S - matches the opposite of \s, that is anything which is not considered whitespace.
  • \d - matches anything which is considered a digit, i.e. 0-9 (for ASCII text it is effectively a shortcut for [0-9]).
  • \D - matches the opposite of \d, that is anything which is not considered a digit.
  • \w - matches anything which is considered a word character, that is [A-Za-z0-9_]. Note the inclusion of the underscore character '_': in programming and other areas we regularly use the underscore as part of, say, a variable or function name.
  • \W - matches the opposite of \w, that is anything which is not considered a word character.
  • Tab - represented in regular expressions as \t
  • Carriage return - represented in regular expressions as \r
  • Line feed (or newline) - represented in regular expressions as \n
  • Line endings on Windows - the sequence \r\n (in that order)
  • Line endings on Mac OS (version 9 and below) - the sequence \r
  • Line endings on Unix/Linux and OS X - the sequence \n
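  • \< - represents the beginning of a word (GNU grep; Python's re module does not support it, use \b instead).
  • \> - represents the end of a word (GNU grep; Python's re module does not support it, use \b instead).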
  • \b - represents either the beginning or end of a word.
  • ( ) - group part of the regular expression.
  • \1 \2 etc - refer to something matched by a previous grouping.
  • | - match what is on either the left or right of the pipe symbol.
  • (?=x) - positive lookahead.
  • (?!x) - negative lookahead.
  • (?<=x) - positive lookbehind.
  • (?<!x) - negative lookbehind.
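
A few of the constructs listed above, tried out with Python's `re` module. All sample strings and variable names are made up for illustration.

```python
import re

# ? makes the preceding item optional; {n,m} bounds the repetition.
optional = re.findall(r"colou?r", "color colour colouur")
bounded = re.findall(r"\b\d{2,3}\b", "7 42 123 4567")

# [c-f] is a range, [^...] negates the class.
ranged = re.findall(r"[c-f]+", "abcdefgh")
negated = re.findall(r"[^agd]+", "badge")

# \w includes the underscore, so identifiers stay in one piece.
words = re.findall(r"\w+", "user_name = x1")

# \b marks a word boundary (Python's re has no \< or \>).
whole_word = re.findall(r"\bcat\b", "cat concat cat.")

# Splitting on any line-ending convention; \r\n must come first.
split_lines = re.split(r"\r\n|\r|\n", "a\r\nb\rc\nd")

# \1 refers back to whatever group 1 matched: here, a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Lookahead and lookbehind assert context without consuming it.
before_px = re.findall(r"\d+(?=px)", "10px 20em 30px")
not_px = re.findall(r"\d+(?!\d|px)", "10px 20em 30px")
after_dollar = re.findall(r"(?<=\$)\d+", "$5 and 7 and $12")
no_dollar = re.findall(r"(?<![\$\d])\d+", "$5 and 7 and $12")

print(optional, bounded, ranged, negated, words)
print(whole_word, split_lines, doubled)
print(before_px, not_px, after_dollar, no_dollar)
```

With grep, the lookahead and lookbehind forms need the -P (Perl regular expression) option.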
