cut: extracting specified columns

cut splits each line on a delimiter character and extracts the requested columns:

[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt
140.205.201.30 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /rs-status HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /ganglia/index.php HTTP/1.1" -
164.132.91.1 - - [/Dec/::: +] "GET / HTTP/1.1" -
114.215.45.101 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /index.php HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /jobs/ HTTP/1.1" -
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f 6
"GET
"GET
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"GET
"GET
"GET
"GET

More than one column can be selected:

[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f 2,3,4
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f 2,3,6-
- - "GET / HTTP/1.1" -
- - "GET /rs-status HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /ganglia/index.php HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET /index.php HTTP/1.1" -
- - "GET /jobs/ HTTP/1.1" -
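
The field selections used above can be tried on a small file. This is a minimal sketch, assuming the usual access-log layout in which field 6 is the request method and field 9 the status code; the three log lines, their IP addresses and timestamps are made up for illustration and do not come from the log shown here.

```shell
# Hypothetical sample data: three made-up lines in access-log layout.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [12/Dec/2015:10:00:00 +0800] "GET / HTTP/1.1" 200 512
192.0.2.2 - - [12/Dec/2015:10:00:01 +0800] "POST /login HTTP/1.1" 302 0
192.0.2.1 - - [12/Dec/2015:10:00:02 +0800] "GET /index.php HTTP/1.1" 404 1024
EOF

# A single field: field 6 is the method, still carrying its opening quote.
methods=$(cut -d ' ' -f 6 "$log")

# A list of fields: client IP (1) and status code (9).
ip_status=$(cut -d ' ' -f 1,9 "$log")

# An open-ended range: field 6 through the end of the line.
tail_fields=$(cut -d ' ' -f 6- "$log")

printf '%s\n' "$methods" "$ip_status" "$tail_fields"
rm -f "$log"
```

cut can read the file directly, so the leading cat is optional.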

sort: sorting by column

For example, to sort a Tomcat access log by response size:

cat localhost_access_log.--.txt |sort -t ' ' -k 10

-t : sets the field separator

-k : selects the column to sort by

114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"

Sorting has a direction. The default is ascending; to sort in descending order, append r to the column number:

cat localhost_access_log.--.txt |sort -t ' ' -k 10r

Finally, note that sort compares keys as strings in dictionary order by default; to sort by numeric value, append n:

 cat localhost_access_log.--.txt |sort -t ' ' -k 10n
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
112.65.193.14 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"

This shows that the site's largest static resource is jquery-ui.min.js.
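
The difference between dictionary order and numeric order is easy to see on a few lines. This is a minimal sketch with made-up log lines (the paths, IPs and sizes are hypothetical); it writes the key as 10,10 so that only field 10 is compared, and sets LC_ALL=C so the string comparison does not depend on the locale.

```shell
# Hypothetical sample data: response sizes 9, 100 and 25 in field 10.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [12/Dec/2015:10:00:00 +0800] "GET /a.js HTTP/1.1" 200 9
192.0.2.2 - - [12/Dec/2015:10:00:01 +0800] "GET /b.css HTTP/1.1" 200 100
192.0.2.3 - - [12/Dec/2015:10:00:02 +0800] "GET /c.png HTTP/1.1" 200 25
EOF

# String comparison: "100" sorts before "25", which sorts before "9".
by_string=$(LC_ALL=C sort -t ' ' -k 10,10 "$log" | cut -d ' ' -f 10 | paste -s -d ' ' -)

# Numeric comparison: 9, then 25, then 100.
by_number=$(LC_ALL=C sort -t ' ' -k 10,10n "$log" | cut -d ' ' -f 10 | paste -s -d ' ' -)

# Numeric and reversed: the first line is the largest response.
largest=$(LC_ALL=C sort -t ' ' -k 10,10nr "$log" | head -n 1 | cut -d ' ' -f 7)

echo "string order:  $by_string"
echo "numeric order: $by_number"
echo "largest:       $largest"
rm -f "$log"
```

A key written as -k 10 runs from field 10 to the end of the line. Here field 10 is the last field, so both spellings behave the same, but giving the end position explicitly is safer on lines with trailing fields.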

uniq: removing duplicates

 cat localhost_access_log.--.txt |cut -d ' ' -f , |sort -t ' ' -k 2n,|uniq
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
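
uniq only collapses duplicate lines that are adjacent, which is why it is normally placed after sort. The sketch below uses a made-up list of IP addresses (not taken from the log above) to show the effect, and then counts occurrences with uniq -c.

```shell
# Hypothetical input: six addresses, no two identical ones adjacent.
ips=$(printf '%s\n' 192.0.2.2 192.0.2.1 192.0.2.2 192.0.2.3 192.0.2.2 192.0.2.1)

# Without sorting, uniq finds no adjacent duplicates and removes nothing.
unsorted_count=$(printf '%s\n' "$ips" | uniq | wc -l | tr -d ' ')

# After sorting, identical lines are adjacent and collapse to one each.
sorted_count=$(printf '%s\n' "$ips" | sort | uniq | wc -l | tr -d ' ')

# uniq -c prefixes each line with its count; sort -rn puts the busiest first.
# awk only normalizes the leading padding that uniq -c emits.
counts=$(printf '%s\n' "$ips" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
top=$(printf '%s\n' "$counts" | head -n 1 | cut -d ' ' -f 2)

echo "lines after uniq alone:  $unsorted_count"
echo "lines after sort | uniq: $sorted_count"
printf '%s\n' "$counts"
```

On a real access log the same pipeline can start from cut -d ' ' -f 1 to count requests per client.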

wc: counting

[root@iZ25klm6k7uZ logs]# wc -l localhost_access_log.--.txt  # count lines
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -w localhost_access_log.--.txt  # count words
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -m localhost_access_log.--.txt  # count characters
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]#
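
The three counters can be checked on a file small enough to count by hand. This is a minimal sketch with made-up content; it reads the file on standard input so that wc prints only the number, without the file name.

```shell
# Hypothetical two-line file: "GET /a" and "POST /b".
f=$(mktemp)
printf 'GET /a\nPOST /b\n' > "$f"

# tr strips the padding that some wc implementations add around the number.
lines=$(wc -l < "$f" | tr -d ' ')   # newline characters
words=$(wc -w < "$f" | tr -d ' ')   # whitespace-separated words
chars=$(wc -m < "$f" | tr -d ' ')   # characters, newlines included

echo "lines=$lines words=$words chars=$chars"
rm -f "$f"
```

Capturing the bare number this way is convenient when the count feeds a script, for example to compare today's request total with a threshold.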

sed: searching with regular expressions

Use sed to find log entries containing 500:

[root@iZ25klm6k7uZ logs]# sed -n '/\b500\b/p' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"

Note: the -n option combined with the p command prints only the matching lines.
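
A pattern such as 500 on its own matches that number anywhere on the line, including the response size. The sketch below uses made-up log lines (hypothetical IPs and timestamps) and anchors the match to the closing quote of the request so that only the status field is tested; it also shows what happens when -n is left out. Note that \b is a GNU sed extension and may not be available in other sed implementations.

```shell
# Hypothetical sample data: line 1 has size 500, line 2 has status 500.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [12/Dec/2015:10:00:00 +0800] "GET / HTTP/1.1" 200 500
192.0.2.2 - - [12/Dec/2015:10:00:01 +0800] "POST /interview/add.do HTTP/1.1" 500 1024
192.0.2.3 - - [12/Dec/2015:10:00:02 +0800] "GET /img/logo.png HTTP/1.1" 304 -
EOF

# Loose pattern: matches both line 1 (size) and line 2 (status).
loose_count=$(sed -n '/500/p' "$log" | wc -l | tr -d ' ')

# Anchored pattern: 500 must directly follow the closing quote of the request.
matched=$(sed -n '/" 500 /p' "$log")

# Without -n every line is printed once, and matching lines a second time.
without_n=$(sed '/" 500 /p' "$log" | wc -l | tr -d ' ')

echo "loose matches: $loose_count"
echo "lines printed without -n: $without_n"
printf '%s\n' "$matched"
rm -f "$log"
```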

awk: regular-expression matching

Use awk to find the log entries with status 500:

awk '($9 ~ /500/)' localhost_access_log.--.txt 

The output is the same as with sed above.

awk splits fields on whitespace (spaces and tabs) by default. To specify a different separator, use -F.
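
Field splitting can be checked on a single line. This is a minimal sketch with made-up records: the first call relies on the default whitespace splitting, the second sets a colon separator with -F, and the third shows that -F ' ' is the same as the default, so runs of spaces still count as one separator.

```shell
# Default splitting: runs of spaces and tabs separate fields.
second=$(printf 'alice   30\tadmin\n' | awk '{print $2}')

# Explicit separator: split on colons instead.
third=$(printf 'alice:30:admin\n' | awk -F ':' '{print $3}')

# -F ' ' keeps the default behavior: two spaces do not create an empty field.
nf=$(printf 'a  b\n' | awk -F ' ' '{print NF}')

echo "second=$second third=$third nf=$nf"
```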

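The real power of awk is that it is programmable. A program has the following form:

awk 'pattern { action }'

For example, the search for 500 entries above can be written out in full as: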

[root@iZ25klm6k7uZ logs]# awk -F ' ' '($9 ~ /500/){print }' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"

To find both 500 and 404 entries:

awk -F ' ' '($9 ~ /500/ || $9 ~ /404/){print $1,$6,$7,$9}' localhost_access_log.--.txt

Or, with a single alternation (this variant also matches 400):

awk -F ' ' '($9 ~ /500|404|400/){print $1,"-",$4,"-",$6,"-",$9}' localhost_access_log.--.txt
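
The ~ operator is a regular-expression match, so $9 ~ /500/ is true for any field that merely contains 500. The sketch below compares the whole field with == instead, and then uses an awk array to count requests per status code. The log lines are made up for illustration, and it assumes field 9 is the status code as in the commands above.

```shell
# Hypothetical sample data with statuses 200, 500, 404 and 500.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [12/Dec/2015:10:00:00 +0800] "GET / HTTP/1.1" 200 512
192.0.2.2 - - [12/Dec/2015:10:00:01 +0800] "POST /interview/add.do HTTP/1.1" 500 1024
192.0.2.3 - - [12/Dec/2015:10:00:02 +0800] "GET /missing HTTP/1.1" 404 0
192.0.2.2 - - [12/Dec/2015:10:00:03 +0800] "POST /interview/add.do HTTP/1.1" 500 1024
EOF

# Exact comparison on the whole status field.
hits=$(awk '$9 == 500 || $9 == 404 {print $1, $7, $9}' "$log")

# Count requests per status; the for-in order is unspecified, so sort the result.
summary=$(awk '{count[$9]++} END {for (s in count) print s, count[s]}' "$log" | LC_ALL=C sort)

printf '%s\n' "$hits"
printf '%s\n' "$summary"
rm -f "$log"
```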
