http://how-to.linuxcareer.com/learning-linux-commands-awk

1. Introduction

In this case, the title might be a little misleading, because awk is more than a command: it's a programming language in its own right. You can write awk scripts for complex operations or you can use awk from the command line. The name stands for Aho, Weinberger and Kernighan (yes, Brian Kernighan), the authors of the language, which dates from 1977, hence it shares the same Unix spirit as the other classic *nix utilities. If you're getting used to C programming or know it already, you will see some familiar concepts in awk, especially since the 'k' in awk stands for the same person as the 'k' in K&R, the C programming bible. You will need some command-line knowledge in Linux and possibly some scripting basics, but the last part is optional, as we will try to offer something for everybody. Many thanks to Arnold Robbins for all his work on awk.

2. What is it that awk does?

awk is a utility/language designed for data extraction. If the word "extraction" rings a bell, it should, because awk was one of Larry Wall's inspirations when he created Perl. awk is often used with sed to perform useful and practical text manipulation chores, and whether you should use awk or Perl depends on the task, but also on personal preference. Just like sed, awk reads one line at a time, performs some action depending on the condition you give it and outputs the result. One of the simplest and most popular uses of awk is selecting a column from a text file or from another command's output. One thing I used to do with awk, when I installed Debian on my second workstation, was to get a list of the installed software from my primary box and then feed it to aptitude. For that, I did something like this:

  $ dpkg -l | awk ' {print $2} ' > installed

Most package managers today offer this facility, for example rpm's -qa options, but the output is more than I want. I see that the second column of dpkg -l 's output contains the name of the packages installed, so this is why I used $2 with awk: to get me only the 2nd column.
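To try the column selection without a Debian system at hand, the same idea can be run on made-up input. This is only a minimal sketch: the sample lines merely imitate the "status name version" layout of dpkg -l, and the function name second_column is hypothetical.

```shell
# Hypothetical helper: print the second whitespace-separated column of stdin.
second_column() {
    awk '{ print $2 }'
}

# Made-up lines imitating the layout of dpkg -l output.
printf '%s\n' \
    'ii vim 8.2' \
    'ii bash 5.1' \
    'ii gawk 5.1' | second_column
```

Each input line is split on white space, so $2 is the package name regardless of how many spaces separate the columns.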

3. Basic concepts

As you have noticed, the action to be performed by awk is enclosed in braces, and the whole command is quoted. The general syntax is awk 'condition { action }'. In our example, we had no condition, but if we wanted to, say, check only for vim-related packages installed (yes, there is grep, but this is an example, plus why use two utilities when you can use only one?), we would have done this:

  $ dpkg -l | awk '/vim/ {print $2}'

This command prints the names of all installed packages whose line in the dpkg -l output contains "vim". One thing that recommends awk is that it's fast. If you replace "vim" with "lib", on my system that yields 1300 packages. There will be situations where the data you'll have to work with will be much bigger, and that's one area where awk shines. Anyway, let's start with the examples, and we will explain some concepts as we go. But before that, it would be good to know that there are several awk dialects and implementations, and the examples presented here deal with GNU awk, as an implementation and dialect. And because of various quoting issues, we assume you're using bash, ksh or sh; we don't support (t)csh.
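The condition { action } form can be tried on sample data as well. The sketch below assumes made-up lines in a "status name version description" layout, and the function name sample is hypothetical. It contrasts a condition that matches anywhere on the line with one restricted to the second field.

```shell
# Made-up sample imitating the column layout of dpkg -l.
sample() {
    printf '%s\n' \
        'ii vim-tiny 8.2 small editor' \
        'ii nano 5.4 editor with vim-like keys' \
        'ii bash 5.1 shell'
}

# The condition matches anywhere on the line: prints vim-tiny and nano.
sample | awk '/vim/ { print $2 }'

# The condition is restricted to the second field: prints only vim-tiny.
sample | awk '$2 ~ /vim/ { print $2 }'
```

When only the package name should be searched, the field-restricted form avoids accidental matches in the description column.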

4. Examples

Learning Linux awk command with examples
Linux command syntax Linux command description
 
awk ' {print $1,$3} '
Print only columns one and three using stdin
awk ' {print $0} '      
Print all columns using stdin
awk '/pattern/ {print $2}'
Print column 2 of every line that matches pattern, using stdin
awk -f script.awk inputfile
Just like make or sed, awk uses -f to get its instructions from a file, which is useful when there is a lot to be done and typing it all in the terminal would be impractical
awk ' program ' inputfile
Execute program using data from inputfile
awk "BEGIN { print \"Hello, world!!\" }"
Classic "Hello, world" in awk
awk '{ print }'
Print every line typed on standard input until EOF (^D)
#! /bin/awk -f
BEGIN { print "Hello, world!" }
awk script for the classic "Hello, world!" (make it executable with chmod and run it as-is)
# This is a program that prints "Hello, world!"
# and exits
Comments in awk scripts
awk -F "" 'program' files
Set FS (the field separator) to the empty string instead of the default white space; in gawk this makes every character a separate field
awk -F "regex" 'program' files
FS can also be a regular expression
awk 'BEGIN { print "Here is a single quote <'\''>" }'
Will print "Here is a single quote <'>". Here's why we prefer Bourne shells. :)
awk '{ if (length($0) > max) max = \
length($0) }
END { print max }' inputfile
Print the length of the longest line
awk 'length($0) > 80' inputfile
Print all lines longer than 80 characters
awk 'NF > 0' data
Print every line that has at least one field (NF stands for Number of Fields)
awk 'BEGIN { for (i = 1; i <= 7; i++)
print int(101 * rand()) }'
Print seven random numbers from 0 to 100
ls -l . | awk '{ x += $5 } ; END \
{ print "total bytes: " x }'
total bytes: 7449362
Print the total number of bytes used by files in the current directory
ls -l . | awk '{ x += $5 } ; END \
{ print "total kilobytes: " (x + \
1023)/1024 }'
total kilobytes: 7275.85
Print the total number of kilobytes used by files in the current directory
awk -F: '{ print $1 }' /etc/passwd | sort
Print sorted list of login names
awk 'END { print NR }' inputfile
Print the number of lines in a file (NR stands for Number of Records, and by default each line is one record)
awk 'NR % 2 == 0' data
Print the even-numbered lines in a file.
How would you print the odd-numbered lines?
ls -l | awk '$6 == "Nov" { sum += $5 }
END { print sum }'
Prints the total number of bytes of files that were last modified in November
awk '$1 ~ /J/' inputfile
Regular expression matching all records whose first field contains a capital J
awk '$1 !~ /J/' inputfile
Regular expression matching all records whose first field does not contain a capital J
awk 'BEGIN { print "He said \"hi!\" \
to her." }'
Escaping double quotes in awk
echo aaaabcd | awk '{ sub(/a+/, \
"<A>"); print }'
Prints "<A>bcd"
ls -lh | awk '{ owner = $3 ; $3 = $3 " 0wnz"; print $3 }' | uniq
Assignment example; try it :)
awk '{ $2 = $2 - 10; print $0 }' inventory
Print inventory with the value of the second field decreased by 10 (the file itself is not changed)
awk '{ $6 = ($5 + $4 + $3 + $2); print $6 }' inventory
Even though field six doesn't exist in inventory, you can create it and assign values to it, then display it
echo a b c d | awk '{ OFS = ":"; $2 = ""
print $0; print NF }'
OFS is the Output Field Separator and the command will output "a::c:d" and "4" because although field two is nullified, it still exists so it gets counted
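One detail behind this entry may be worth a small sketch: changing OFS does not alter $0 by itself; awk rebuilds the record with the new separator only once a field is assigned. The helper name show_rebuild is hypothetical.

```shell
# Hypothetical helper showing when $0 is rebuilt with the new OFS.
show_rebuild() {
    echo a b c d | awk '{
        OFS = ":"
        print $0      # record not rebuilt yet: still separated by spaces
        $1 = $1       # assigning any field forces awk to rebuild $0 with OFS
        print $0
    }'
}

show_rebuild
```

The first line printed is "a b c d" and the second is "a:b:c:d", which is why the entry above sees the colons only after $2 has been assigned.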
echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
print $0; print NF }'
Another example of field creation; as you can see, the field between $4 (existing) and $6 (to be created) gets created as well (as $5 with an empty value), so the output will be "a::c:d::new" "6"
echo a b c d e f | awk '{ print "NF =", NF
NF = 3; print $0 }'
Throwing away three fields (last ones) by changing the number of fields
FS = "[ ]"
This is a regular expression setting the field separator to exactly one space and nothing else, so leading spaces and runs of spaces are not collapsed (each extra space delimits an empty field)
echo ' a b c d ' | awk 'BEGIN { FS = "[ \t\n]+" }
{ print $2 }'
This will print only "a"
awk '/RE/ { print; exit }' file.txt
Print only the first line that matches RE (regular expression), then stop
awk -F\\\\ '...' inputfiles ...
Sets FS to a single backslash: the shell reduces \\\\ to \\, and awk's escape processing reduces that to \
BEGIN { RS = "" ; FS = "\n" }
{
print "Name is:", $1
print "Address is:", $2
print "City and State are:", $3
print ""
}
If we have a record like
"John Doe
1234 Unknown Ave.
Doeville, MA", this script sets the record separator to the empty string (so records are separated by blank lines) and the field separator to newline, so each line of a record becomes one field
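A runnable sketch of the same record layout, assuming two made-up address records separated by a blank line; the helper name record_names is hypothetical:

```shell
# Hypothetical helper: for each blank-line-separated record, print its number,
# its first line and how many lines (fields) it has.
record_names() {
    awk 'BEGIN { RS = ""; FS = "\n" }
         { print NR ": " $1 " (" NF " lines)" }'
}

printf '%s\n' \
    'John Doe' '1234 Unknown Ave.' 'Doeville, MA' \
    '' \
    'Jane Roe' '99 Example St.' 'Roetown, NY' | record_names
```

Here NR counts whole address blocks rather than lines, and NF is 3 for both records because each block has three lines.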
awk 'BEGIN { OFS = ";"; ORS = "\n\n" }
{ print $1, $2 }' inputfile
With a two-field file, the records will be printed like this:
"field1;field2

field3;field4

...;..."
Because ORS, the Output Record Separator, is set to two newlines and OFS is ";"

awk 'BEGIN {
OFMT = "%.0f" # print numbers as integers (rounds)
print 17.23, 17.54 }'
This will print 17 and 18, because the Output ForMaT is set to round floating point values to the closest integer value
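As a hedged illustration of where OFMT applies, the sketch below (with the hypothetical helper name ofmt_demo) prints a non-integer number, an integer, and a number that has been turned into a string by concatenation. It assumes a POSIX-style awk such as gawk, where number-to-string conversion outside print is governed by CONVFMT rather than OFMT.

```shell
# Hypothetical helper contrasting OFMT with integer output and CONVFMT.
ofmt_demo() {
    awk 'BEGIN {
        OFMT = "%.2f"
        print 3.14159        # non-integer number: formatted with OFMT
        print 42             # integral values are printed as integers
        print 3.14159 ""     # concatenation makes a string first, using CONVFMT
    }'
}

ofmt_demo
```

Only the first print statement is affected by OFMT; for full control over the layout, printf is usually the clearer choice.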
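awk 'BEGIN {
msg = "Don\047t Panic!"
printf "%s\n", msg
}'
You can use printf much as you would in C (\047 is the octal escape for a single quote, which cannot appear literally inside the single-quoted program)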
awk '{ printf "%-10s %s\n", $1, $2 }' inputfile
Prints the first field as a 10-character string, left-justified, and $2 normally, next to it
awk 'BEGIN { print "Name  Number"
print "---- ------" }
{ printf "%-10s %s\n", $1, $2 }' inputfile
Making things prettier
awk '{ print $2 > "phone-list" }' inputfile
Simple data extraction example, where the second field is written to a file named "phone-list"
awk '{ print $1 > "names.unsorted"
command = "sort -r > names.sorted"
print $1 | command }' inputfile
Write the names contained in $1 to a file, then sort and output the result to another file (you can also append with >>, like you would in a shell)
awk 'BEGIN { printf "%d, %d, %d\n", 011, 11, 0x11 }'
Will print 9, 11, 17 (gawk reads 011 as octal and 0x11 as hexadecimal when they appear in the program source)
if (/foo/ || /bar/)
print "Found!"
Simple search for foo or bar
awk '{ sum = $2 + $3 + $4 ; avg = sum / 3
print $1, avg }' grades

Simple arithmetic operations (most operators resemble C a lot)

awk '{ print "The square root of", \
$1, "is", sqrt($1) }'
2
The square root of 2 is 1.41421
7
The square root of 7 is 2.64575

Simple, extensible calculator

awk '$1 == "start", $1 == "stop"' inputfile

Prints every record from the one whose first field is "start" through the one whose first field is "stop", inclusive
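The range pattern can be checked on a few made-up lines; the helper name between_markers is hypothetical:

```shell
# Hypothetical helper: print the records from a "start" line through a "stop" line.
between_markers() {
    awk '$1 == "start", $1 == "stop"'
}

printf '%s\n' 'before' 'start here' 'middle' 'stop here' 'after' | between_markers
```

Both marker lines are part of the output, so this prints "start here", "middle" and "stop here".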

awk '
BEGIN { print "Analysis of \"foo\"" }
/foo/ { ++n }
END { print "\"foo\" appears", n, "times." }' inputfile

BEGIN and END rules are executed exactly once, before and after any record processing

echo -n "Enter search pattern: "
read pattern
awk "/$pattern/ "'{ nmatches++ }
END { print nmatches, "found" }' inputfile

Search using a pattern read by the shell and spliced into the awk program
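Splicing a shell variable into the program text stops working as soon as the pattern contains a slash or a quote. A possible alternative, sketched here with the hypothetical helper name count_matches, is to hand the pattern to awk with -v and match it as a dynamic regular expression:

```shell
# Hypothetical helper: count the input lines matching the regex given as $1.
count_matches() {
    awk -v pat="$1" '
        $0 ~ pat { nmatches++ }
        END { print nmatches + 0, "found" }   # + 0 prints 0 instead of an empty string
    '
}

printf '%s\n' 'foo bar' 'baz' 'food' | count_matches foo
printf '%s\n' '/usr/bin' '/etc' | count_matches /usr
```

Note that awk still processes backslash escapes in a value passed with -v, so a pattern containing backslashes needs them doubled.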

if (x % 2 == 0)
print "x is even"
else
print "x is odd"

Simple conditional. awk, like C, also supports the ?: operators
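The same test written with the ?: operator might look like the sketch below; the helper name parity is hypothetical and the input is assumed to be one integer per line.

```shell
# Hypothetical helper: report whether each number read from stdin is even or odd.
parity() {
    awk '{ print $1, "is", ($1 % 2 == 0 ? "even" : "odd") }'
}

printf '%s\n' 4 7 | parity
```

The parentheses around the conditional expression keep it from being misparsed inside the print statement.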

awk '{ i = 1
while (i <= 3) {
print $i
i++
}
}' inputfile

Prints the first three fields of each record, one per line.

awk '{ for (i = 1; i <= 3; i++)
print $i
}'
Prints the first three fields of each record, one per line.
BEGIN {
    if (("date" | getline date_now) <= 0) {
        print "Can't get system date" > "/dev/stderr"
        exit 1
    }
    print "current date is", date_now
    close("date")
}
Exiting with a status different from 0 signals that something's not quite right, as in this example
awk 'BEGIN {
for (i = 0; i < ARGC; i++)
print ARGV[i]
}' file1 file2
Prints the command name (awk), file1 and file2, one per line
for (i in frequencies)
delete frequencies[i]
Delete elements in an array
foo[4] = ""
if (4 in foo)
print "This is printed, even though foo[4] is empty"
Check for array elements
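The array fragments above can be combined into one small runnable sketch. It assumes made-up input, the helper name word_counts is hypothetical, and the array name frequencies is borrowed from the fragment above.

```shell
# Hypothetical helper: count words, test membership, then empty the array.
word_counts() {
    awk '{
        for (i = 1; i <= NF; i++)
            frequencies[$i]++          # an element is created on first use
    }
    END {
        if ("awk" in frequencies)      # membership test; does not create the element
            print "awk:", frequencies["awk"]
        for (w in frequencies)         # delete every element, one index at a time
            delete frequencies[w]
        n = 0
        for (w in frequencies)
            n++
        print "elements left:", n
    }'
}

printf '%s\n' 'awk sed awk' 'grep awk' | word_counts
```

Using (index in array) rather than comparing array[index] to "" matters because merely referencing an element creates it.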
function ctime(ts, format)
{
format = "%a %b %d %H:%M:%S %Z %Y"
if (ts == 0)
ts = systime()
# use current time as default
return strftime(format, ts)
}
An awk variant of ctime() in C. This is how you define your own functions in awk
BEGIN { _cliff_seed = 0.1 }
function cliff_rand()
{
_cliff_seed = (100 * log(_cliff_seed)) % 1
if (_cliff_seed < 0)
_cliff_seed = - _cliff_seed
return _cliff_seed
}
A Cliff random number generator
cat apache-anon-noadmin.log | \
awk 'function ri(n) \
{ return int(n*rand()); } \
BEGIN { srand(); } { if (! \
($1 in randip)) { \
randip[$1] = sprintf("%d.%d.%d.%d", \
ri(255), ri(255)\
, ri(255), ri(255)); } \
$1 = randip[$1]; print $0 }'
Anonymize an Apache log (IPs are randomized)
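For experimenting, the anonymizer reads more easily as a multi-line program fed with a few made-up log lines. This is a sketch of the same approach, not a hardened tool: the helper name anonymize_ips and the sample lines are hypothetical, and, as in the one-liner, each octet is drawn from 0 to 254.

```shell
# Hypothetical helper: replace the first field (the client IP) of every line
# with a random address, reusing the same replacement for repeated IPs.
anonymize_ips() {
    awk '
    function ri(n) { return int(n * rand()) }   # random integer in 0 .. n-1
    BEGIN { srand() }
    {
        if (!($1 in randip))
            randip[$1] = sprintf("%d.%d.%d.%d", ri(255), ri(255), ri(255), ri(255))
        $1 = randip[$1]
        print
    }'
}

printf '%s\n' '10.0.0.1 GET /a' '10.0.0.2 GET /b' '10.0.0.1 GET /c' | anonymize_ips
```

The randip array is what keeps the mapping consistent within one run, so requests from the same original address can still be correlated after anonymization.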

5. Conclusion

As you can see, with awk you can do lots of text processing and other nifty stuff. We didn't get into more advanced topics, like awk's predefined functions, but we showed you enough (we hope) to start remembering it as a powerful tool.
