Grouping and counting occurrences of specific strings in a file on CentOS
Suppose we have the following log data:
{ "@timestamp": "2018-10-13T21:55:58+08:00", "remote_addr": "100.120.34.3", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.3" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
{ "@timestamp": "2018-10-13T21:56:06+08:00", "remote_addr": "100.120.34.101", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.101" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
{ "@timestamp": "2018-10-13T21:56:08+08:00", "remote_addr": "100.120.34.29", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.075, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.29" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.075 }
{ "@timestamp": "2018-10-13T21:56:10+08:00", "remote_addr": "100.120.34.75", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.078, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.75" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.078 }
{ "@timestamp": "2018-10-13T21:56:18+08:00", "remote_addr": "100.120.34.39", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.082, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.39" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.082 }
{ "@timestamp": "2018-10-13T21:56:31+08:00", "remote_addr": "100.120.34.68", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.079, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.68" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.079 }
Save it temporarily as tmp.log, then run:
awk -F 'idfa=' '{print $2}' tmp.log
This prints everything that follows idfa= on each line (first three lines shown):
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.3" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.101" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.075, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.29" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.075 }
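To see why $2 holds everything after idfa=, here is a minimal sketch of how awk's -F option splits a line. The shortened sample line is made up for illustration and is not taken from tmp.log.

```shell
# A shortened, hypothetical log line (not taken from tmp.log).
line='GET /api?idfa=AAAA-1111&appid=42&source=demo HTTP/1.1'

# With -F 'idfa=', awk splits each line at every occurrence of "idfa=".
# $1 is the text before the separator, $2 is the text after it.
before=$(printf '%s\n' "$line" | awk -F 'idfa=' '{print $1}')
after=$(printf '%s\n' "$line" | awk -F 'idfa=' '{print $2}')

printf 'before: %s\n' "$before"
printf 'after:  %s\n' "$after"
```

Because the separator appears only once in the line, $2 runs to the end of the line, which is why a second split is needed to cut off the tail.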
Next, cut off everything from &source= onward by piping into a second awk:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}'
Output (first three lines shown):
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
Then add sort, which places identical results next to each other:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}' | sort
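For the six lines of tmp.log above, the sorted output is:
08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206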
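The sort step matters because uniq only merges duplicate lines that are adjacent. The sketch below illustrates this with three placeholder values (A and B are hypothetical stand-ins for the extracted strings); the trailing awk only strips the padding that uniq -c puts in front of each count.

```shell
# Hypothetical values standing in for the extracted strings.
unsorted=$(printf '%s\n' B A B)

# uniq only merges adjacent duplicate lines, so without sort the two
# "B" lines are counted separately.
without_sort=$(printf '%s\n' "$unsorted" | uniq -c | awk '{print $1, $2}')

# After sort, duplicates are adjacent and uniq -c counts them together.
with_sort=$(printf '%s\n' "$unsorted" | sort | uniq -c | awk '{print $1, $2}')

printf 'without sort:\n%s\n' "$without_sort"
printf 'with sort:\n%s\n' "$with_sort"
```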
Finally, append uniq -c to count each group:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}' | sort | uniq -c
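The final result lists the count followed by each string. For the six lines of tmp.log above:
      1 08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206
      1 58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
      1 D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206
      2 D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
      1 E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206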
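As an alternative to the pipeline above, the grouping can also be done in a single awk pass with an associative array. This is only a sketch under a few assumptions: the log lines are shortened, made-up samples written to a temporary file, idfa is assumed to be the first query parameter, and the separator 'idfa=|&' keeps the idfa value alone (without the &appid=... part that the pipeline above retains). The final sort -rn ranks the groups by count, highest first.

```shell
# Create a small hypothetical log file (shortened lines, not the real tmp.log).
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
GET /api?idfa=AAAA&appid=1&source=x HTTP/1.1
GET /api?idfa=BBBB&appid=1&source=x HTTP/1.1
GET /api?idfa=AAAA&appid=1&source=x HTTP/1.1
EOF

# Split on "idfa=" or "&", so $2 is the idfa value alone.
# count[] is an awk associative array keyed by that value;
# NF > 1 skips lines that contain neither separator.
result=$(awk -F 'idfa=|&' 'NF > 1 { count[$2]++ }
    END { for (id in count) print count[id], id }' "$demo_log" | sort -rn)

printf '%s\n' "$result"
rm -f "$demo_log"
```

On a large log this avoids sorting every extracted line, since only the distinct values are sorted at the end.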