Counting occurrences of specific strings, grouped by value, in a file on CentOS
Suppose we have the following data:
{ "@timestamp": "2018-10-13T21:55:58+08:00", "remote_addr": "100.120.34.3", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.3" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
{ "@timestamp": "2018-10-13T21:56:06+08:00", "remote_addr": "100.120.34.101", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.101" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
{ "@timestamp": "2018-10-13T21:56:08+08:00", "remote_addr": "100.120.34.29", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.075, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.29" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.075 }
{ "@timestamp": "2018-10-13T21:56:10+08:00", "remote_addr": "100.120.34.75", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.078, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.75" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.078 }
{ "@timestamp": "2018-10-13T21:56:18+08:00", "remote_addr": "100.120.34.39", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.082, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.39" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.082 }
{ "@timestamp": "2018-10-13T21:56:31+08:00", "remote_addr": "100.120.34.68", "referer": "-", "request": "GET /api/gourd/activeupload?idfa=D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.079, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.68" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.079 }
Save it temporarily as tmp.log, then run:
awk -F 'idfa=' '{print $2}' tmp.log
This prints everything after "idfa=" on each line (first three lines shown):
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.3" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.076, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.101" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.076 }
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206&source=rehulu HTTP/1.1", "status": 200, "request_time": 0.075, "cookie":"-","host":"cms.369wan.com","bytes": 48, "agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0", "proxy_x_forwarded": "139.129.97.187, 100.120.34.29" "upstr_addr": "127.0.0.1:9000","upstr_host": "-","ups_resp_time": 0.075 }
Next, cut off everything from "&source=" onward:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}'
Output (first three lines shown):
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
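The result above still carries the "&appid=..." part. If only the idfa value itself is wanted, one possible variation is sketched below. The function name extract_idfa is hypothetical, and the sketch assumes every relevant line has the form "idfa=<value>&..." as in the sample data. Lines without "idfa=" are skipped.

```shell
# Hypothetical helper: print only the idfa value from each log line.
# Assumes the value is followed by "&", as in the sample data.
extract_idfa() {
    # Split each line on "idfa=", then keep the text before the first "&".
    # NF > 1 skips lines that do not contain "idfa=" at all.
    awk -F 'idfa=' 'NF > 1 { split($2, parts, "&"); print parts[1] }' "$@"
}

# Usage (reads the named files, or standard input if none are given):
#   extract_idfa tmp.log
```

Piping this into sort | uniq -c would then group by idfa alone rather than by idfa plus appid.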
Run the following. The sort step places identical results next to each other:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}' | sort
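Output for the six sample lines above:
08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206
58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206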
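The sort step matters because uniq only merges lines that are adjacent. A small demonstration with made-up input (the letters a and b stand in for the extracted strings):

```shell
# Unsorted input: the two "b" lines are not adjacent,
# so uniq -c reports three separate groups.
printf 'b\na\nb\n' | uniq -c

# Sorted input: the two "b" lines become adjacent,
# so uniq -c reports two groups and counts "b" twice.
printf 'b\na\nb\n' | sort | uniq -c
```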
Finally, add uniq -c to count each group:
awk -F 'idfa=' '{print $2}' tmp.log | awk -F '&source=' '{print $1}' | sort | uniq -c
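This gives the final result for the six sample lines (the count, followed by each string):
1 08C65C3B-EED2-4A65-B0C1-67FC7FB78E18&appid=1410137206
1 58237FA9-A1B3-4202-B5F3-9536983119E5&appid=1410137206
1 D166459D-E823-4847-9094-6F4BF90625B2&appid=1410137206
2 D5B924F3-7D25-4B52-BAE9-3270B08EA32D&appid=1410137206
1 E9D7F87A-9042-46B4-82E8-E5F64B74466B&appid=1410137206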
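The same grouping can also be done in a single awk process with an associative array, which avoids sorting the full list before counting. This is only a sketch of an alternative, not the method used above. The function name count_keys is hypothetical, and the final sort orders the result by count, highest first.

```shell
# Hypothetical helper: count each "<idfa>&appid=<id>" key in one awk pass.
count_keys() {
    awk -F 'idfa=' 'NF > 1 {
        # Keep only the text before "&source=", as in the pipeline above.
        split($2, parts, "&source=")
        count[parts[1]]++
    }
    END {
        # Print "count key" for every distinct key seen.
        for (key in count) print count[key], key
    }' "$@" | sort -k1,1nr -k2   # highest count first, then by key
}

# Usage:
#   count_keys tmp.log
```

Putting the most frequent keys first makes repeated values easy to spot in a large log.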