Shell编程之文本处理
cut 截取自定列
可以按照某个字符进行分割,然后取出其中的指定列:
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt
140.205.201.30 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /rs-status HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /ganglia/index.php HTTP/1.1" -
164.132.91.1 - - [/Dec/::: +] "GET / HTTP/1.1" -
114.215.45.101 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /index.php HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /jobs/ HTTP/1.1" -
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f
"GET
"GET
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"GET
"GET
"GET
"GET
可以指定更多的列:
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f ,,
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f ,,-
- - "GET / HTTP/1.1" -
- - "GET /rs-status HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /ganglia/index.php HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET /index.php HTTP/1.1" -
- - "GET /jobs/ HTTP/1.1" -
sort 对列进行排序
例如,对tomcat访问日志,对请求响应返回大小进行排序:
cat localhost_access_log.--.txt |sort -t ' ' -k
-t : 指定分隔符
-k : 指定排序的列
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
排序是由方向的,默认是升序排序,如果要降序排列,可以在列号后面增加一个r:
cat localhost_access_log.--.txt |sort -t ' ' -k 10r
最后要注意的是,这里的排序默认是按字符串的字典顺序排列的,如果要按其数值拍,则需要增加一个n:
cat localhost_access_log.--.txt |sort -t ' ' -k 10n
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
112.65.193.14 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
由此可见,此网站最大的静态资源是这个jquery-ui.min.js文件。
uniq去重
cat localhost_access_log.--.txt |cut -d ' ' -f , |sort -t ' ' -k 2n,|uniq
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
wc统计
[root@iZ25klm6k7uZ logs]# wc -l localhost_access_log.--.txt 统计行数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -w localhost_access_log.--.txt 统计词数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -m localhost_access_log.--.txt 共计字符数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]#
sed正则查找
用sed来查找500的日志信息:
[root@iZ25klm6k7uZ logs]# sed -n '/\b500\b/p' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
注意:-n和-p配合,表示只打印匹配的行。
awk正则匹配
用awk来查找500日志信息:
awk '($9 ~ /500/)' localhost_access_log.--.txt
输出和上面的sed一样。
zwk有默认的分隔符,比如\t,空格等。如果要指定分隔符可以用-F。
zwk的强大之处在于它支持编程,格式如下:
awk pattern { action } 例如上面的查找500日志可以完整表达如下:
[root@iZ25klm6k7uZ logs]# awk -F ' ' '($9 ~ /500/){print }' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
同时查找500和404的日志:
awk -F ' ' '($9 ~ /500/ || $9 ~ /404/){print $1,$6,$7,$9}' localhost_access_log.--.txt
或者
awk -F ' ' '($9 ~ /500|404|400/){print $1,"-",$4,"-",$6,"-",$9}' localhost_access_log.--.txt
Shell编程之文本处理的更多相关文章
- linux —— shell 编程(文本处理)
导读 本文为博文linux —— shell 编程(整体框架与基础笔记)的第4小点的拓展.(本文所有语句的测试均在 Ubuntu 16.04 LTS 上进行) 目录 基本文本处理 流编辑器sed aw ...
- shell编程之文本与日志过滤
1:grep命令: grep -v "char" file_name 匹配不包括"char"的文本 grep -n -w "char" ...
- shell编程系列24--shell操作数据库实战之利用shell脚本将文本数据导入到mysql中
shell编程系列24--shell操作数据库实战之利用shell脚本将文本数据导入到mysql中 利用shell脚本将文本数据导入到mysql中 需求1:处理文本中的数据,将文本中的数据插入到mys ...
- shell编程系列11--文本处理三剑客之sed利用sed删除文本中的内容
shell编程系列11--文本处理三剑客之sed利用sed删除文本中的内容 删除命令对照表 命令 含义 1d 删除第一行内容 ,10d 删除1行到10行的内容 ,+5d 删除10行到16行的内容 /p ...
- Linux学习笔记(17) Shell编程之基础
1. 正则表达式 (1) 正则表达式用来在文件中匹配符合条件的字符串,正则是包含匹配.grep.awk.sed等命令可以支持正则表达式:通配符用来匹配符合条件的文件名,通配符是完全匹配.ls.find ...
- Linux Shell编程入门
从程序员的角度来看, Shell本身是一种用C语言编写的程序,从用户的角度来看,Shell是用户与Linux操作系统沟通的桥梁.用户既可以输入命令执行,又可以利用 Shell脚本编程,完成更加复杂的操 ...
- Shell编程菜鸟基础入门笔记
Shell编程基础入门 1.shell格式:例 shell脚本开发习惯 1.指定解释器 #!/bin/bash 2.脚本开头加版权等信息如:#DATE:时间,#author(作者)#mail: ...
- ****CodeIgniter使用cli模式运行,把php作为shell编程
shell简介 在计算机科学中,Shell俗称壳(用来区别于核).而我们常说的shell简单理解就是一个命令行界面,它使得用户能与操作系统的内核进行交互操作. 常见的shell环境有:MS-DOS.B ...
- Linux Shell编程基础
在学习Linux BASH Shell编程的过程中,发现由于不经常用,所以很多东西很容易忘记,所以写篇文章来记录一下 ls 显示当前路径下的文件,常用的有 -l 显示长格式 -a 显示所有包括隐 ...
随机推荐
- 一:Spring Boot、Spring Cloud
上次写了一篇文章叫Spring Cloud在国内中小型公司能用起来吗?介绍了Spring Cloud是否能在中小公司使用起来,这篇文章是它的姊妹篇.其实我们在这条路上已经走了一年多,从16年初到现在. ...
- C#.NET 用程序画图,曲线图
using System;using System.Data;using System.Configuration;using System.Web;using System.Web.Security ...
- WPF字符串中的换行符
<sys:String x:Key="NewUpdateWillShow" xml:space="preserve">第一行 第二行 </sy ...
- Android内存泄露的原因
(一)释放对象的引用,误将一个本来生命周期短的对象存放到一个生命周期相对较长的对象中,也称“对象游离“.隐蔽的内部类(Anonymous Inner Class): mHandler = new Ha ...
- canvas三环加载进度条
之前做了一个三个圆形叠加在一起的加载,用的是定位和cile来操作,但是加载的头部不能是圆形.后来用canvas做了一个,但是这个加载的进度不好调整,原理很简单,就是让一个圆,按照圆形轨迹进行运动就可以 ...
- [动态规划]P1220 关路灯
题目描述 某一村庄在一条路线上安装了n盏路灯,每盏灯的功率有大有小(即同一段时间内消耗的电量有多有少).老张就住在这条路中间某一路灯旁,他有一项工作就是每天早上天亮时一盏一盏地关掉这些路灯. 为了给村 ...
- iis7 部署网站 403错误
C:\Windows\Microsoft.NET\Framework\v4.0.30319\aspnet_regiis.exe -i 403 - 禁止访问: 访问被拒绝. 您无权使用所提供的凭据查看此 ...
- ASP.NET中HttpContext.Cache的使用
-------------------------------键 --值-----依赖-----过期时间-------------------------------绝对过期------------- ...
- 常用接口简析3---IList和List的解析
常用接口的解析(链接) 1.IEnumerable深入解析 2.IEnumerable.IEnumerator接口解析 3.IComparable.IComparable接口解析 学习第一步,先上菜: ...
- Python中的列表生成器,迭代器的理解
首先,思考一个问题,比如,我们想生成0-100的列表,我们怎么做? 当然,可以写成 list1=[1,2,3...,100] 可以看出,这种方法不适合生成长的列表,那么Python中就可以利用已有的列 ...