awk、sed、grep三大shell文本处理工具之awk的应用

awk

1、是什么
是一个编程语言。支持变量、数组、函数、流程控制（if...else/for/while)
单行程序语言。

2、工作流程
读取file、标准输入、管道给的数据，从第一行开始读取，逐行读取，看是否匹配我们想要的数据（pattern模式匹配），对数据进行处理，直到读完所有的行，退出awk程序（执行的每一条awk的命令）

3、语法
awk [-F field seperator] 'pattern{action}' [file]

-F: 指定元数据列（字段）分隔符
‘pattern’：匹配模式
作用：匹配出来要处理的数据。
1）正则表达式
[root@localhost html]# awk '/^J/{print}' awk_scores.txt

2）关系表达式 > < >= <= == != ~ !~
[root@localhost html]# awk '$2>90{print}' awk_scores.txt

3) BEGIN模式在awk程序执行后，但尚未执行处理动作前要做的工作（定义变量）
[root@localhost html]# awk 'BEGIN{x="abc";print x}'

4）END模式 awk程序执行完处理动作后要做的工作。（善后）
[root@localhost html]# awk 'BEGIN{print "Scores report"} {print} END{print "Over!"}' awk_scores.txt

{action}：处理动作，针对符合匹配模式的数据进行的处理动作
如果没有pattern，只有action，会对所有的文本行执行action的处理动作
如果没有action，只有pattern，会打印出符合匹配模式的行

处理文本方式：
[file]: awk要处理的文本行（源数据），可以是其他命令的输出、管道过来的数据

4、指定awk的方式
1）命令行方式：
2）awk脚本：
#!/usr/bin/awk -f
BEGIN{
print "Scores Report"
}

{ print } //执行命令

END{
print "Over!"
}

[root@localhost html]# chmod u+x awk.awk
[root@localhost html]# ./awk.awk awk_scores.txt

3）文件
BEGIN{
print "Scores Report"
}

{ print }

END{
print "Over!"
}

[root@localhost html]# awk -f awk.awk awk_scores.txt

5、截取
行记录 Record
列字段 Field
字符串出现在行和列的交点上确定哪一行的哪一列

5.1 截取列
awk 内置变量：$n n为数字 $1 $2 $3 表示第几列
[root@localhost html]# awk '{print $1,$2}' awk_scores.txt

awk 内置变量：$NF 最后一列
[root@localhost html]# awk '{print $1,$NF}' awk_scores.txt
[root@localhost html]# ifconfig eth0 | awk 'NR==2{print $2}'|awk -F: '{print $2}'

5.2 截取行(行：NR)
1）NR numbers of record FNR当前读到的行 NR==1 FNR==1 都是指第一行
2）正则表达式
3）条件表达式

内置变量：$0 awk程序当前处理的行
[root@localhost html]# awk 'NR==2{print $0}' awk_scores.txt

示例：
[root@localhost html]# awk '/^N/{print}' awk_scores.txt
Nancy 89 90 73 82
[root@localhost html]# awk 'NR==2{print}' awk_scores.txt
Nancy 89 90 73 82
[root@localhost html]# awk 'FNR==2{print}' awk_scores.txt
Nancy 89 90 73 82
[root@localhost html]# awk '$3>=90{print}' awk_scores.txt
John 85 92 79 87
Nancy 89 90 73 82

5.3 截取字符串哪一行的第几列找到字符串(列：$n)

以下例子是截取第二行第5列的数据
[root@localhost html]# df -h |awk 'NR==2{print $5}'
49%

6、格式化输出
print 输出截取的数据，如果输出多列，列之间用“,”隔开-->输出后，变为空格
改变输出后的列值分割符号：

[root@localhost html]# awk '{print $3":"$4":"$1}' awk_scores.txt //冒号需要引起来
92:79:John
90:73:Nancy
88:92:Tom
65:83:Kity
89:80:Han
76:85:Kon

7、BEGIN

BEGIN 其中的代码在执行动作之前执行，程序运行后，其中的代码只执行一次。设置定义变量。
变量名对应单词意义

列：
FS　　 field separator 字段分隔符（awk处理的源文本的字段分隔符，默认空格或tab）
OFS　　 output field separator 输出的分隔符（默认空格）

行：
RS 　　 record separator 记录分隔符（默认换行符\n）
ORS　　 output record separator 输出记录换行符

awk 'BEGIN{FS=":"}{print $1}' /etc/passwd //在读取源文件时，让awk认为列的分隔符为“:”
awk 'BEGIN{FS=":"}{print $1,$2,$3}' /etc/passwd //输出多个列
awk 'BEGIN{FS=":";OFS="-"}{print $1,$2,$3}' /etc/passwd //输出时，指定输出列的分隔符为“-”
awk -F: '{print $1"-"$2"-"$3}' /etc/passwd（等同）

[root@localhost html]# cat awk_scores.txt
John 85 92 79 87
Nancy 89 90 73 82

Tom 81 88 92 81
Kity 79 65 83 90
Han 92 89 80 83
Kon 88 76 85 97

打印第一行：
[root@localhost html]# awk 'BEGIN{RS=""}NR==1{print}' awk_scores.txt //通过“RS=""”定义行的分隔符为空行

源文件：
[root@localhost html]# cat awk_contacts.txt
danny male
china beijing
(8103)82456789

jeck male
Japan Tokyo
(8103)82456789

xi female
America Washington
(8103)82456789

danny male china beijing (8103)82456789
jeck male Japan Tokyo (8103)82456789
xi female America Washington (8103)82456789

要求输出时，每个用户占一行，仅输出用户名、性别、联系电话

[root@localhost html]# awk 'BEGIN{RS="";FS="\n"}{print $1,$3}' awk_contacts.txt
danny male (8103)82456789
jeck male (8103)82456789
xi female (8103)82456789
//解析：通过RS=""控制行的分隔符为空行,再通过FS="\n"控制字段的分隔符为换行符（回车），然后打印第一列和第三列

例：vim c.txt
1
2
3
4
5

计算出第一行和第三行的和
[root@server ~]# awk 'BEGIN{RS="";FS="\n"}{print $1+$3}' c.txt
4

//解析：通过RS=""控制行的分隔符为空行,通过RS之后上述例子只能分出一行（所有数字变成一行了），再通过FS="\n"控制字段的分隔符为换行符（回车），然后打印第一列和第三列的和

8、回头看awk的工作过程
1）awk首选读取文件的第一行，将该行赋值给$0，默认行的分隔符是回车\n
2）通过空格（制表符）将行分割成多个字段（列），并将列值赋值给$n,$1 $2 $3
3) awk怎样直到列分隔符？运行程序之处，有一个内置变量FS来表示字段分隔符，程序初始化FS被定义为空格、制表符
4）print 打印（执行处理动作），OFS默认为空格
5）读取下一行

9、高级玩法
数学运算
比较运算
逻辑运算
变量
数组
流程控制 if for while (类C)

9.1 数学运算
+ - * / % ^ ** 注意:^号和**号都是指数运算
[root@localhost html]# awk 'BEGIN{print 1+1}'
2
[root@localhost html]# awk 'BEGIN{print 1-1}'
0
[root@localhost html]# awk 'BEGIN{print 10*2}'
20
[root@localhost html]# awk 'BEGIN{print 10/2}'
5
[root@localhost html]# awk 'BEGIN{print 10%2}'
0
[root@localhost html]# awk 'BEGIN{print 10**2}'
100
[root@localhost html]# awk 'BEGIN{print 10**3}'
1000
[root@localhost html]# awk 'BEGIN{print 10^3}'
1000
[root@localhost html]# awk 'BEGIN{print 2^10}'

[root@localhost html]# awk '{print $1,$2+$3+$4+$5}' awk_scores.txt
John 343
Nancy 334
Tom 342
Kity 317
Han 344
Kon 346

9.2 比较运算
> < >= <= == != ~ !~
awk '$2!=89{print}' awk_scores.txt
awk '/$1~^T/{print}' awk_scores.txt
awk '$1~/^T/{print}' awk_scores.txt
awk '$1~/^Han/{print}' awk_scores.txt

awk '$2!=92{print}' awk_scores.txt
awk '$2!~92{print}' awk_scores.txt

注意：~号表示匹配正则表达式，！~表示不匹配。如上最后一例是表示 "截取第二列所有不为92的数据。"

9.3 逻辑运算
与 &&
或 ||
非！

[root@localhost html]# awk '$2>80 && $3>80 && $4>80 && $5>80{print}' awk_scores.txt

9.4 变量
awk 命名方式 shell key=value 不允许数字开头
awk中变量如果没有没提前赋值，变量的初始值-->变量的类型（字符串和数值）
字符串 -->初始值“空”
数值 -->初始值“0”

如果定义变量即引用变量:
[root@localhost html]# awk 'BEGIN{n=1;print n}'
1

[root@localhost html]# awk 'BEGIN{var="abc";print var}'
abc

计算内存使用率：
[root@localhost html]# awk 'NR==1{t=$2}NR==2{f=$2;print (t-f)/t*100}' /proc/meminfo | awk -F. '{print $1}'

显示TCP进程连接状态及统计其数量：

[root@localhost html]# netstat -n |awk '/^tcp.*/{++b[$NF]}END{for (i in b) print i,b[i]}'
CLOSE_WAIT 2
ESTABLISHED 18
TIME_WAIT 277

9.5 数组
shell中
array=(1 2 3 4 5) 数组的下标从0 引用echo ${array[*]}

awk中
array[n]=value 数组的下标从1开始

awk引用数组：
array[n] n表示下标

定义数组array，赋值，引用：

[root@localhost html]# awk 'BEGIN{array[1]=10;array[2]=11;array[3]=12;print array[1],array[2],array[3]}'
10 11 12
[root@localhost html]# vim array.awk
[root@localhost html]# awk -f array.awk
A B C
[root@localhost html]# cat array.awk
BEGIN{
array[1]="A"
array[2]="B"
array[3]="C"

print array[1],array[2],array[3]
}
[root@localhost html]# awk -f array.awk
A B C

9.6 if语句
if (expression)
{
statement
}
else
{
statement
}

示例：

#!/bin/awk -f --->#!/usr/bin/awk -f

{

if ($>=)

{ print $,"Pass"}

else

{ print $,"Nopass"}

}

之判断为真的情况

#!/bin/awk -f

{

if ($>=)

{ print $,"Pass"}

}

[root@localhost html]# awk '{
> if($2>=80)
> {print $1,"Pass"}
> }' awk_scores.txt
John Pass
Nancy Pass
Tom Pass
Han Pass
Kon Pass

9.7 for

for循环为数组赋值

#!/bin/awk -f

BEGIN{

for(i=;i<;i++)

{

array[i]=i

print array[i]

}

}

累加运算

#!/bin/awk -f

{

sum=

for(i=;i<;i++)

{

sum+=$i

}

print sum

}

while

格式：

while(表达式)

{语句}

例：

#!/bin/awk -f

BEGIN{

a=;

b=;

while (i<=)

{

b=a+b;

i++;

}

print b;

}

awk、sed、grep三大shell文本处理工具之awk的应用的更多相关文章

awk、sed、grep三大shell文本处理工具之sed的应用
sed 流编辑器对文本中的行,逐行处理非交互式的编辑器是一个编辑器 1.工作流程 1)将文件的第一行读入到自己的缓存空间(模式空间--pattern space),删除掉换行符 2)匹配,看一下 ...
awk、sed、grep三大shell文本处理工具之grep的应用
1.基本格式grep pattern [file...](1)grep 搜索字符串 [filename](2)grep 正则表达式 [filename]在文件中搜索所有 pattern 出现的位置, ...
Linux Shell 文本处理工具集锦--Awk―sed―cut(row-based, column-based),find、grep、xargs、sort、uniq、tr、cut、paste、wc
本文将介绍Linux下使用Shell处理文本时最常用的工具:find.grep.xargs.sort.uniq.tr.cut.paste.wc.sed.awk:提供的例子和参数都是最常用和最为实用的: ...
Linux 三剑客 -- awk sed grep
本文由本人收集整理自互联网供自己与网友参考,参考文章均已列出,如有侵权,请告知! 顶配awk,中配sed,标配grep awk 参考 sed 参考 grep 参考在线查看linux命令速记表 app ...
shell文本处理工具总结
shell文本处理工具总结为了效率,应该熟练的掌握自动化处理相关的知识和技能,能力就表现在做同样的一件事情,可以做的很好的同时,耗时还很短. 再次总结shell文本处理的相关规则,对提高软件调试效率 ...
Linux shell文本处理工具
搞定Linux Shell文本处理工具,看完这篇集锦就够了 Linux Shell是一种基本功,由于怪异的语法加之较差的可读性,通常被Python等脚本代替.既然是基本功,那就需要掌握,毕竟学习She ...
cut printf awk sed grep笔记
名称作用参数实例 cut 截取某列,可指定分隔 -f 列号 -d 分隔符 cut -d ":" -f 1, 3 /etc/passwd 截取第一列和第三列 printf pr ...
Unix文本处理工具之awk
Unix命令行下输入的命令是文本,输出也都是文本.因此,掌握Unix文本处理工具是很重要的一种能力.awk是Unix常用的文本处理工具中的一种,它是以其发明者(Aho,Weinberger和Kerni ...
awk\sed\grep 补充
# awk\sed\grep 补充以上命令中字符 / 在sed中作为定界符使用,也可以使用任意的定界符 sed's:test:TEXT:g' sed's|test|TEXT|g' 定界符出现在样式内 ...

随机推荐

Linux系统学习之网络管理
网络接口配置使用ifconfig检查和配置网卡 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ...
PAT A1132 Cut Integer （20 分）——数学题
Cutting an integer means to cut a K digits lone integer Z into two integers of (K/2) digits long int ...
Vue组件基础
<!DOCTYPE html><html> <head> <meta charset="utf-8"> ...
Objective-C 事件响应链
苹果app使用响应者对象(responder object)来接收和处理事件.响应者对象是NSResponder及其子类的实例,如NSView.NSViewController和NSApplicati ...
ASP.NET 文件操作类
1.读取文件 2.写入文件 using System; using System.Collections.Generic; using System.IO; using System.Linq; us ...
eclipse打断点的调试
对于程序员来说,最重要的技能之一其实是在发现问题的时候,定位问题,然后才能解决问题. 发现问题的能力十分的重要.而debug的水平就是基础. 打断点之后,操作相应的步骤,然后eclipse会跳转到相应 ...
php的foreach中使用取地址符，注意释放
先来举个例子: <?php $array = array(1, 2, 3); foreach ($array as &$value) {} // unset($value); forea ...
EZ 2018 06 24 NOIP2018 模拟赛（二十）
很久之前写的一套题了,由于今天的时间太多了,所以记起来就写掉算了. 这一场尽管T2写炸了,但也莫名Rank4涨了Rating.不过还是自己太菜. A. 环游世界首先我们先排个序,想一下如果不用走回来 ...
Bash 中常见的字符串操作
获取字符串长度 ${#string} MyString=abcABC123ABCabc 注意这会自动去掉字符串结尾处的空格,如果在字符串中包含空格(开头.中间或结尾),就需要使用引号把字符串包裹起来: ...
C_数据结构_递归不同函数间调用
# include <stdio.h> void f(); void g(); void k(); void f() { printf("FFFF\n"); g(); ...

awk、sed、grep三大shell文本处理工具之awk的应用

awk、sed、grep三大shell文本处理工具之awk的应用的更多相关文章

随机推荐

热门专题