Learning Linux Commands: awk--reference
http://how-to.linuxcareer.com/learning-linux-commands-awk
1. Introduction
In this case, the title might be a little misleading, because awk is more than a command: it's a programming language in its own right. You can write awk scripts for complex operations or you can use awk from the command line. The name stands for Aho, Weinberger and Kernighan (yes, Brian Kernighan), the authors of the language, which was started in 1977, so it shares the same Unix spirit as the other classic *nix utilities. If you're getting used to C programming or know it already, you will see some familiar concepts in awk, especially since the 'k' in awk stands for the same person as the 'K' in K&R, the C programming bible. You will need some command-line knowledge in Linux and possibly some scripting basics, but the latter is optional, as we will try to offer something for everybody. Many thanks to Arnold Robbins for all his work on awk.
2. What is it that awk does?
awk is a utility/language designed for data extraction. If the word "extraction" rings a bell, it should, because awk was one of Larry Wall's inspirations when he created Perl. awk is often used with sed to perform useful and practical text manipulation chores, and whether you should use awk or Perl depends on the task, but also on personal preference. Just like sed, awk reads one line at a time, performs some action depending on the condition you give it and outputs the result. One of the simplest and most popular uses of awk is selecting a column from a text file or from another command's output. One thing I used to do with awk, whenever I installed Debian on my second workstation, was to get a list of the installed software from my primary box and then feed it to aptitude. For that, I did something like this:
$ dpkg -l | awk ' {print $2} ' > installed
Most package managers today offer this facility, for example rpm's -qa options, but the output is more than I want. The second column of dpkg -l's output contains the names of the installed packages, which is why I used $2 with awk: to get only the second column.
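To try the column selection without depending on dpkg, here is a minimal sketch that feeds awk a few made-up lines laid out like dpkg -l output. The package names, versions and the variable names `sample` and `names` are invented for illustration.

```shell
# Made-up lines laid out like `dpkg -l` output: status, name, version, arch, description.
sample='ii  bash  5.1  amd64  GNU Bourne Again SHell
ii  vim   8.2  amd64  Vi IMproved
ii  zsh   5.8  amd64  shell with lots of features'

# $2 is the second whitespace-separated field of each line: the package name.
names=$(printf '%s\n' "$sample" | awk '{ print $2 }')
printf '%s\n' "$names"
```

Runs of spaces count as a single separator by default, so the uneven padding in the sample does not matter. Real dpkg -l output also starts with a few header lines, which this sketch leaves out.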
3. Basic concepts
As you have noticed, the action to be performed by awk is enclosed in braces, and the whole command is quoted. The general syntax is awk ' condition { action }'. In our example, we had no condition, but if we wanted to, say, check only for vim-related packages installed (yes, there is grep, but this is an example, plus why use two utilities when you can use only one?), we would have done this:
$ dpkg -l | awk ' /'vim'/ {print $2} '
This command would print all installed packages that have "vim" in their names. One thing that recommends awk is that it's fast. If you replace "vim" with "lib", on my system that yields 1300 packages. There will be situations where the data you have to work with is much bigger, and that's one area where awk shines. Anyway, let's start with the examples, and we will explain some concepts as we go. But before that, it is good to know that there are several awk dialects and implementations, and the examples presented here deal with GNU awk, as an implementation and dialect. And because of various quoting issues, we assume you're using bash, ksh or sh; we don't support (t)csh.
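The condition { action } form can be tried on a small sample. The sketch below uses invented package lines and hypothetical variable names (`pkgs`, `vim_pkgs`, `installed`); it shows a regular expression as the condition and, as a second variant, a comparison on a field.

```shell
# Invented sample: status, package name, version.
pkgs='ii  vim-common  8.2
ii  libc6       2.31
ii  vim-tiny    8.2
rc  libssl1.1   1.1.1'

# Regex condition: print the second field of lines that contain "vim".
vim_pkgs=$(printf '%s\n' "$pkgs" | awk '/vim/ { print $2 }')
printf '%s\n' "$vim_pkgs"

# Field comparison as the condition: only lines whose first field is "ii".
installed=$(printf '%s\n' "$pkgs" | awk '$1 == "ii" { print $2 }')
printf '%s\n' "$installed"
```

A bare /vim/ is tested against the whole line, while $1 == "ii" looks at one field only, so the second form avoids accidental matches elsewhere in the line.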
4. Examples
| Learning Linux awk command with examples | |
|---|---|
| Linux command syntax | Linux command description |
| awk ' {print $1,$3} ' | Print only columns one and three using stdin |
| awk ' {print $0} ' | Print all columns using stdin |
| awk ' /'pattern'/ {print $2} ' | Print column two of every line that matches pattern, using stdin |
| awk -f script.awk inputfile | Just like make or sed, awk uses -f to get its instructions from a file, which is useful when there is a lot to be done and using the terminal would be impractical |
| awk ' program ' inputfile | Execute program using data from inputfile |
| awk "BEGIN { print \"Hello, world!!\" }" | Classic "Hello, world" in awk |
| awk '{ print }' | Print back every line typed on standard input until EOF (^D) |
| #! /bin/awk -f | awk script for the classic "Hello, world!" (make it executable with chmod and run it as-is) |
| # This is a program that prints \ | Comments in awk scripts |
| awk -F "" 'program' files | Define the FS (field separator) as null, as opposed to white space, the default; in GNU awk each character then becomes a separate field |
| awk -F "regex" 'program' files | FS can also be a regular expression |
| awk 'BEGIN { print "Here is a single \ | Will print <'>. Here's why we prefer Bourne shells. :) |
| awk '{ if (length($0) > max) max = \ | Print the length of the longest line |
| awk 'length($0) > 80' inputfile | Print all lines longer than 80 characters |
| awk 'NF > 0' data | Print every line that has at least one field (NF stands for Number of Fields) |
| awk 'BEGIN { for (i = 1; i <= 7; i++) | Print seven random numbers from 0 to 100 |
| ls -l . \| awk '{ x += $5 } ; END \ | Print the total number of bytes used by files in the current directory |
| ls -l . \| awk '{ x += $5 } ; END \ | Print the total number of kilobytes used by files in the current directory |
| awk -F: '{ print $1 }' /etc/passwd \| sort | Print sorted list of login names |
| awk 'END { print NR }' inputfile | Print the number of lines in a file, as NR stands for Number of Records |
| awk 'NR % 2 == 0' data | Print the even-numbered lines in a file. How would you print the odd-numbered lines? |
| ls -l \| awk '$6 == "Nov" { sum += $5 } | Prints the total number of bytes of files that were last modified in November |
| awk '$1 ~ /J/' inputfile | Regular expression match: print all records whose first field contains a capital J |
| awk '$1 !~ /J/' inputfile | Negated match: print all records whose first field does not contain a capital J |
| awk 'BEGIN { print "He said \"hi!\" \ | Escaping double quotes in awk |
| echo aaaabcd \| awk '{ sub(/a+/, \ | Prints "<A>bcd" |
| ls -lh \| awk '{ owner = $3 ; $3 = $3 \ | Assignment example; try it :) |
| awk '{ $2 = $2 - 10; print $0 }' inventory | Print inventory with the value of the second field reduced by 10 (the file itself is not changed) |
| awk '{ $6 = ($5 + $4 + $3 + $2); print \ | Even though field six doesn't exist in inventory, you can create it and assign values to it, then display it |
| echo a b c d \| awk '{ OFS = ":"; $2 = "" | OFS is the Output Field Separator, and the command will output "a::c:d" and "4" because, although field two is emptied, it still exists and so it gets counted |
| echo a b c d \| awk '{ OFS = ":"; \ | Another example of field creation; as you can see, the field between $4 (existing) and $6 (to be created) gets created as well (as $5 with an empty value), so the output will be "a::c:d::new" and "6" |
| echo a b c d e f \| awk '\ | Throw away the last three fields by changing the number of fields (NF) |
| FS = "[ ]" | This regular expression sets the field separator to exactly one space and nothing else, so consecutive spaces delimit empty fields |
| echo ' a b c d ' \| awk 'BEGIN { FS = \ | This will print only "a" |
| awk '/RE/ { print; exit }' file.txt | Print only the first match of RE (regular expression) |
| awk -F\\\\ '...' inputfiles ... | Sets FS to \\ |
| BEGIN { RS = "" ; FS = "\n" } | If we have a multi-line record like "John Doe 1234 Unknown Ave. Doeville, MA", this script treats blank lines as record separators and sets the field separator to newline, so each line of the record becomes a field |
| awk 'BEGIN { OFS = ";"; ORS = "\n\n" } | With a two-field file, the records will be printed like this: "field1;field2 field3;field4 ...;...", with a blank line after each record |
| awk 'BEGIN { | This will print 17 and 18, because OFMT (the output format) is set to round floating-point values to the closest integer value |
| awk 'BEGIN { | You can use printf much as you would in C |
| awk '{ printf "%-10s %s\n", $1, \ | Prints the first field as a 10-character string, left-justified, and $2 normally, next to it |
| awk 'BEGIN { print "Name Number" | Making things prettier |
| awk '{ print $2 > "phone-list" }' \ | Simple data extraction example, where the second field is written to a file named "phone-list" |
| awk '{ print $1 > "names.unsorted" | Write the names contained in $1 to a file, then sort and output the result to another file (you can also append with >>, like you would in a shell) |
| awk 'BEGIN { printf "%d, %d, %d\n", 011, 11, \ | Will print 9, 11, 17 |
| if (/foo/ \|\| /bar/) | Simple search for foo or bar |
| awk '{ sum = $2 + $3 + $4 ; avg = sum / 3 | Simple arithmetic operations (most operators resemble C a lot) |
| awk '{ print "The square root of", \ | Simple, extensible calculator |
| awk '$1 == "start", $1 == "stop"' inputfile | Prints every record from the one whose first field is "start" through the one whose first field is "stop", inclusive |
| awk ' | BEGIN and END rules are executed exactly once, before and after any record processing |
| echo -n "Enter search pattern: " | Search using shell |
| if (x % 2 == 0) | Simple conditional. awk, like C, also supports the ?: operator |
| awk '{ i = 1 | Prints the first three fields of each record, one per line. |
| awk '{ for (i = 1; i <= 3; i++) | Prints the first three fields of each record, one per line. |
| BEGIN { | Exiting with an error code different from 0 means something's not quite right. Here's an example |
| awk 'BEGIN { | Prints awk file1 file2 |
| for (i in frequencies) | Delete elements in an array |
| foo[4] = "" | Check for array elements |
| function ctime(ts, format) | An awk variant of ctime() in C. This is how you define your own functions in awk |
| BEGIN { _cliff_seed = 0.1 } | A Cliff random number generator |
| cat apache-anon-noadmin.log \| \ | Anonymize an Apache log (IPs are randomized) |
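Several of the field-manipulation rows above are abbreviated. The following is a sketch of how they might be written in full so they can be run as-is. The exact commands are an assumption (they are not taken from the table), chosen to produce the outputs the descriptions mention, and the variable names are hypothetical. GNU awk is assumed, as in the rest of the article.

```shell
# Emptying a field keeps it in the record, so NF stays at 4.
emptied=$(echo a b c d | awk '{ OFS = ":"; $2 = ""; print $0; print NF }')
printf '%s\n' "$emptied"

# Assigning to $6 also creates $5 as an empty field, so NF becomes 6.
created=$(echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"; print $0; print NF }')
printf '%s\n' "$created"

# Lowering NF discards the trailing fields and rebuilds the record.
trimmed=$(echo a b c d e f | awk '{ NF = 3; print $0 }')
printf '%s\n' "$trimmed"
```

In each case awk rebuilds $0 from the fields using OFS as soon as a field or NF is assigned, which is why the colons only appear after the assignment to $2.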
5. Conclusion
As you can see, with awk you can do lots of text processing and other nifty stuff. We didn't get into more advanced topics, like awk's predefined functions, but we showed you enough (we hope) to start remembering it as a powerful tool.
一:创建字符串 //字符串的创建有两种 //不可变字符串 let str = "I'm a string" //可变字符串 var string = "I'm a mut ...