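Shell Programming: Text Processing
cut: extracting specified columns
cut splits each line on a delimiter character and then outputs only the columns you ask for: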
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt
140.205.201.30 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /rs-status HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /ganglia/index.php HTTP/1.1" -
164.132.91.1 - - [/Dec/::: +] "GET / HTTP/1.1" -
114.215.45.101 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /index.php HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /jobs/ HTTP/1.1" -
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f 6
"GET
"GET
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"GET
"GET
"GET
"GET
More than one column can be selected:
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f 2,3,4
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat localhost_access_log.--.txt |cut -d ' ' -f 2,3,6-
- - "GET / HTTP/1.1" -
- - "GET /rs-status HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /ganglia/index.php HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET /index.php HTTP/1.1" -
- - "GET /jobs/ HTTP/1.1" -
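The field selections above can be tried on a small sample. The sketch below assumes two made-up log lines (the addresses, timestamps, status codes and sizes are invented, not taken from the log above) in the same space-separated layout, where the request method is field 6 and the status code is field 9.

```shell
# Hypothetical sample lines; every value here is invented for illustration.
sample='10.0.0.1 - - [10/Dec/2016:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [10/Dec/2016:10:00:01 +0800] "POST /login HTTP/1.1" 404 128'

# A single field: the request method (it keeps its opening double quote).
methods=$(printf '%s\n' "$sample" | cut -d ' ' -f 6)

# A comma-separated list of fields: client address and status code.
addr_status=$(printf '%s\n' "$sample" | cut -d ' ' -f 1,9)

# An open-ended range: everything from field 9 to the end of the line.
tail_fields=$(printf '%s\n' "$sample" | cut -d ' ' -f 9-)

printf '%s\n' "$methods" "$addr_status" "$tail_fields"
```

Note that cut treats every single delimiter character as a separator, so two consecutive spaces produce an empty field between them.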
sort: sorting by column
For example, to sort a Tomcat access log by the size of the response returned for each request:
cat localhost_access_log.--.txt |sort -t ' ' -k 10
-t : sets the field separator
-k : sets the column (sort key) to sort by
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
Sorting has a direction. The default is ascending; to sort in descending order, append r to the column number:
cat localhost_access_log.--.txt |sort -t ' ' -k 10r
Finally, note that by default the key is compared as a string, in dictionary order. To sort by numeric value instead, append n:
cat localhost_access_log.--.txt |sort -t ' ' -k 10n
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
112.65.193.14 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
This shows that the largest static resource on this site is the jquery-ui.min.js file.
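The difference between string order, numeric order and reversed order is easiest to see on a few lines. This is a minimal sketch that assumes a made-up two-column input (a path and a size in bytes); none of the values come from the log above.

```shell
# Hypothetical two-column data: a path and a response size in bytes.
sizes='/a.css 900
/b.js 12000
/c.png 3500
/d.ico 100'

# Default: the key is compared as a string, so "12000" sorts before "3500".
by_string=$(printf '%s\n' "$sizes" | sort -t ' ' -k 2)

# n: the key is compared as a number.
by_number=$(printf '%s\n' "$sizes" | sort -t ' ' -k 2n)

# n and r together: numeric order, largest value first.
largest_first=$(printf '%s\n' "$sizes" | sort -t ' ' -k 2nr)

printf '%s\n\n%s\n\n%s\n' "$by_string" "$by_number" "$largest_first"
```

A key written as -k 2 runs from field 2 to the end of the line; writing -k 2,2 restricts it to field 2 alone, which matters when later fields should not influence the order.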
uniq: removing duplicates
cat localhost_access_log.--.txt |cut -d ' ' -f , |sort -t ' ' -k 2n,|uniq
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
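uniq only collapses duplicate lines that are adjacent, which is why it is normally fed sorted input. The sketch below illustrates this on a made-up list of client addresses (hypothetical values), and also shows uniq -c, which prefixes each line with its number of occurrences.

```shell
# Hypothetical client addresses; the repeats are deliberately not adjacent.
addrs='10.0.0.2
10.0.0.1
10.0.0.2
10.0.0.1
10.0.0.2'

# Without sorting, no two neighbouring lines are equal, so nothing is removed.
unsorted_count=$(printf '%s\n' "$addrs" | uniq | wc -l)

# After sorting, equal lines sit next to each other and uniq keeps one of each.
distinct=$(printf '%s\n' "$addrs" | sort | uniq)

# uniq -c adds a count in front of each line; sorting that numerically
# in reverse puts the most frequent address first.
top=$(printf '%s\n' "$addrs" | sort | uniq -c | sort -nr | head -n 1)

echo "lines left by uniq alone: $unsorted_count"
printf '%s\n' "$distinct" "$top"
```

The sort | uniq -c | sort -nr combination is a common way to rank the busiest clients in an access log.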
wc: counting
[root@iZ25klm6k7uZ logs]# wc -l localhost_access_log.--.txt    # count lines
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -w localhost_access_log.--.txt    # count words
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -m localhost_access_log.--.txt    # count characters
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]#
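The three counters can be checked against a tiny input whose size is easy to verify by hand. The two lines below are invented for illustration; when wc reads from a pipe instead of a named file, it prints only the number.

```shell
# Hypothetical two-line input, fed through a pipe so that wc prints no filename.
text='GET /index.html 200
POST /login'

lines=$(printf '%s\n' "$text" | wc -l)   # newline characters
words=$(printf '%s\n' "$text" | wc -w)   # whitespace-separated words
chars=$(printf '%s\n' "$text" | wc -m)   # characters, newlines included

echo "lines=$lines words=$words chars=$chars"
```

wc -m counts characters while wc -c counts bytes; the two differ only when the input contains multi-byte characters.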
sed: searching with regular expressions
Use sed to find the log entries that contain 500:
[root@iZ25klm6k7uZ logs]# sed -n '/\b500\b/p' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
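Note: the -n option suppresses sed's automatic printing of every line, and the p command prints the lines that match, so together they print only the matching lines.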
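The effect of -n can be seen by running the same expression with and without it. This sketch assumes GNU sed, since \b (word boundary) is a GNU extension, and uses three made-up log lines in a shortened, hypothetical layout.

```shell
# Hypothetical log excerpt; only the second line contains 500 as a whole word.
log='10.0.0.1 "GET /a HTTP/1.1" 200 1500
10.0.0.2 "POST /b HTTP/1.1" 500 64
10.0.0.3 "GET /c HTTP/1.1" 404 5000'

# -n turns off automatic printing; p prints the lines selected by the pattern.
# \b keeps 1500 and 5000 from matching.
matched=$(printf '%s\n' "$log" | sed -n '/\b500\b/p')

# Without -n every line is printed once, and matching lines a second time.
without_n=$(printf '%s\n' "$log" | sed '/\b500\b/p' | wc -l)

printf '%s\n' "$matched"
echo "lines printed without -n: $without_n"
```

The pattern is applied to the whole line, so a 500 that appears as a response size rather than a status code would match as well; matching on a specific column is a job for awk.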
awk: matching with regular expressions
Use awk to find the log entries with status 500:
awk '($9 ~ /500/)' localhost_access_log.--.txt
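The output is the same as with sed above.
By default awk splits fields on whitespace (spaces and tabs). Use -F to specify a different separator.
awk's strength is that it is programmable. The general form is:
awk 'pattern { action }'
For example, the search for 500 entries above can be written out in full as follows: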
[root@iZ25klm6k7uZ logs]# awk -F ' ' '($9 ~ /500/){print }' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
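The pattern { action } form can be sketched on a smaller input. The example below assumes made-up colon-separated records (name, status, size), chosen so that -F has a visible effect; the field layout is hypothetical and unrelated to the access log.

```shell
# Hypothetical colon-separated records: name, status, size.
records='alpha:200:512
beta:500:64
gamma:500:128'

# -F sets the field separator; the pattern selects records and the action runs on them.
names=$(printf '%s\n' "$records" | awk -F ':' '$2 ~ /500/ { print $1 }')

# A pattern with no action prints the whole record, the same as { print $0 }.
whole=$(printf '%s\n' "$records" | awk -F ':' '$2 ~ /500/')

# Actions can keep state across records; the END block runs after the last one.
total=$(printf '%s\n' "$records" | awk -F ':' '$2 ~ /500/ { sum += $3 } END { print sum }')

printf '%s\n' "$names" "$whole" "total size: $total"
```

Variables such as sum need no declaration in awk; an unset variable acts as 0 in arithmetic and as an empty string in text contexts.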
To find the 500 and 404 entries at the same time:
awk -F ' ' '($9 ~ /500/ || $9 ~ /404/){print $1,$6,$7,$9}' localhost_access_log.--.txt
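Or with a single regular expression using alternation (this variant also matches 400 and prints a different set of columns):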
awk -F ' ' '($9 ~ /500|404|400/){print $1,"-",$4,"-",$6,"-",$9}' localhost_access_log.--.txt
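The ~ operator matches when the regular expression occurs anywhere in the field. A minimal sketch of a stricter variant follows: it anchors the expression so that the whole field must equal one of the codes, and then counts the matches per status code. The four log lines are invented for illustration and follow the layout used above, with the status in field 9.

```shell
# Hypothetical access-log lines; addresses, timestamps and sizes are made up.
log='10.0.0.1 - - [10/Dec/2016:10:00:00 +0800] "GET /a HTTP/1.1" 500 10
10.0.0.2 - - [10/Dec/2016:10:00:01 +0800] "GET /b HTTP/1.1" 404 20
10.0.0.3 - - [10/Dec/2016:10:00:02 +0800] "GET /c HTTP/1.1" 200 30
10.0.0.1 - - [10/Dec/2016:10:00:03 +0800] "POST /a HTTP/1.1" 500 40'

# ^ and $ anchor the expression to the whole field instead of a substring.
errors=$(printf '%s\n' "$log" | awk '$9 ~ /^(500|404)$/ { print $1, $7, $9 }')

# Count matching lines per status code in an associative array.
# The iteration order of "for (s in count)" is unspecified, so sort the result.
counts=$(printf '%s\n' "$log" |
  awk '$9 ~ /^(500|404)$/ { count[$9]++ } END { for (s in count) print s, count[s] }' |
  sort)

printf '%s\n' "$errors" "$counts"
```

For a single code, a plain comparison such as $9 == 500 expresses the same exact match without a regular expression.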