Shell Programming Series 21 -- The Text-Processing Trio: Arrays in awk and Statistics on Simulated Production Data

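Using arrays in the shell:

    Shell array subscripts start at 0. Below, N stands for a subscript, START for a starting subscript and COUNT for a number of elements.

    array=("Allen" "Mike" "Messi" "Jerry" "Hanmeimei" "Wang")

    Print one element:                  echo ${array[N]}

    Print the number of elements:       echo ${#array[@]}

    Print the length of one element:    echo ${#array[N]}

    Assign to an element:               array[N]=ui;

    Delete an element:                  unset array[N];unset array # the second form deletes the whole array

    Slice:                              echo ${array[@]:START:COUNT}

    Replace text in the elements:       ${array[@]/e/E} replaces only the first e in each element; ${array[@]//e/E} replaces every e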

    Iterating over an array (quote the expansion so that elements containing spaces stay intact):

        for a in "${array[@]}"
        do
            echo "$a"
        done
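As a minimal sketch of why the quoting in the loop matters, the following compares the unquoted and quoted expansions. The array `names` and its contents are made up for illustration, and the default `IFS` is assumed.

```shell
#!/bin/bash
# Hypothetical data: one element contains a space.
names=("Allen" "Han Meimei" "Wang")

# Unquoted expansion is split on whitespace, so "Han Meimei" becomes two words.
unquoted_count=0
for n in ${names[@]}; do
    unquoted_count=$((unquoted_count + 1))
done

# Quoted expansion yields exactly one word per element.
quoted_count=0
for n in "${names[@]}"; do
    quoted_count=$((quoted_count + 1))
done

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

With this data the unquoted loop runs four times and the quoted loop three times.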

    Using arrays in awk:

        In awk, array subscripts are not limited to 1...n; strings can be used as subscripts as well.
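A small sketch of string subscripts in practice: each word is used directly as the array key. The input words and the variable name `word_counts` are made up for illustration. Because the order of `for (w in count)` is unspecified, the output is piped through `sort`.

```shell
#!/bin/bash
# Count occurrences of each word; the words themselves serve as array subscripts.
# The input text is made up for illustration.
word_counts=$(printf 'red\nblue\nred\ngreen\nred\nblue\n' |
    awk '{ count[$1]++ } END { for (w in count) print w, count[w] }' |
    sort)
echo "$word_counts"
```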

    Typical examples:

        1. Count the TCP connections on the host, grouped by TCP state

            # netstat -an | grep tcp | awk '{arr[$6]++}END{for (i in arr) print i,arr[i]}'

            LAST_ACK

            LISTEN

            SYN_RECV

            ESTABLISHED

            FIN_WAIT1

            FIN_WAIT2

            CLOSING

            TIME_WAIT 
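To try the counting idiom without depending on live connections, here is a sketch that feeds awk a few canned netstat-style lines. The sample lines are made up, and the tcp filter is done inside awk with a pattern rather than a separate `grep`; that is a stylistic choice, not the command used above.

```shell
#!/bin/bash
# Canned netstat-style lines (made up) so the result is reproducible.
sample='tcp        0      0 0.0.0.0:22       0.0.0.0:*          LISTEN
tcp        0      0 10.0.0.5:22      10.0.0.9:51234     ESTABLISHED
tcp        0      0 10.0.0.5:80      10.0.0.7:40022     TIME_WAIT
tcp        0      0 10.0.0.5:80      10.0.0.8:40100     TIME_WAIT
udp        0      0 0.0.0.0:68       0.0.0.0:*'

# Select tcp lines with an awk pattern, then count by state (field 6).
state_counts=$(printf '%s\n' "$sample" |
    awk '$1 ~ /^tcp/ { arr[$6]++ } END { for (s in arr) print s, arr[s] }' |
    sort)
echo "$state_counts"
```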

        2. Compute the row totals and the column totals

            Allen

            Mike

            Zhang

            Jerry

            Han

            Li                

        # The code is as follows:

        [root@localhost shell]# awk -f statics.awk student.txt

        Name                          Yuwen                         Math                          English                       Physical                      total

        Allen

        Mike

        Zhang

        Jerry

        Han

        Li

        every_total

        [root@localhost shell]# cat statics.awk

        BEGIN{
            printf "%-30s%-30s%-30s%-30s%-30s%-30s\n","Name","Yuwen","Math","English","Physical","total"
        }
        {
            # Row total: the four score columns of this student
            total=$2+$3+$4+$5
            # Column totals, one running sum per subject
            yuwen_sum+=$2
            math_sum+=$3
            english_sum+=$4
            physical_sum+=$5
            printf "%-30s%-30d%-30d%-30d%-30d%-30d\n",$1,$2,$3,$4,$5,total
        }
        END{
            printf "%-30s%-30d%-30d%-30d%-30d\n","every_total",yuwen_sum,math_sum,english_sum,physical_sum
        }
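The same row and column totals can also be kept in an array keyed by field number, which avoids one variable per subject. The sketch below is an alternative formulation, not the script above; the scores are hypothetical and the output is left unformatted for brevity.

```shell
#!/bin/bash
# Hypothetical scores (name, then four subject scores); not the article's data.
scores='Allen 80 90 70 60
Mike 75 85 95 65'

report=$(printf '%s\n' "$scores" | awk '
{
    row = 0
    nf = NF                  # remember the column count for the END block
    for (i = 2; i <= NF; i++) {
        row += $i            # row total for this student
        col[i] += $i         # running column total, keyed by field number
    }
    print $1, row
}
END {
    line = "every_total"
    for (i = 2; i <= nf; i++) line = line " " col[i]
    print line
}')
echo "$report"
```

For this input the rows sum to 300 and 320, and the column totals are 155, 175, 165 and 125.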

Computing the length of a string:

[root@localhost shell]# str="test string"

[root@localhost shell]# echo $str

test string

[root@localhost shell]# echo ${#str}

11

# Modify an array element (subscript 1, "Mike", becomes "Jerry")

array=("Allen" "Mike" "Messi" "Jerry" "Hanmeimei" "Wang")

[root@localhost shell]# array[1]="Jerry"

[root@localhost shell]# echo ${array[@]}

Allen Jerry Messi Jerry Hanmeimei Wang

# Delete the third element (subscript 2)

[root@localhost shell]# echo ${array[@]}

Allen Jerry Messi Jerry Hanmeimei Wang

[root@localhost shell]# unset array[2];

[root@localhost shell]# echo ${array[@]}

Allen Jerry Jerry Hanmeimei Wang

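# Delete the element at subscript 1 (Mike). Deleting subscript 1 a second time leaves the array unchanged: unset removes the element but does not renumber the others, so the remaining elements keep their original subscripts.

[root@localhost shell]# array=("Allen" "Mike" "Messi" "Jerry" "Hanmeimei" "Wang")

[root@localhost shell]# unset array[1]

[root@localhost shell]# echo ${array[*]}

Allen Messi Jerry Hanmeimei Wang

[root@localhost shell]# unset array[1]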

[root@localhost shell]# echo ${array[*]}

Allen Messi Jerry Hanmeimei Wang
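The gap left by `unset` can be observed with `${!array[*]}`, which lists the subscripts that are still set. The following sketch shows the gap and one common way to close it; the variable names `remaining_indices` and `packed_indices` are only for illustration, and the default `IFS` is assumed.

```shell
#!/bin/bash
array=("Allen" "Mike" "Messi" "Jerry" "Hanmeimei" "Wang")
unset 'array[1]'

# List the subscripts that are still set; 1 is now missing.
remaining_indices="${!array[*]}"
echo "$remaining_indices"

# Reassigning from the expanded values packs them into subscripts 0..n-1 again.
array=("${array[@]}")
packed_indices="${!array[*]}"
echo "$packed_indices"
```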

# Slicing: take 3 elements starting at subscript 1

[root@localhost shell]# array=("Allen" "Mike" "Messi" "Jerry" "Hanmeimei" "Wang")

[root@localhost shell]# echo ${array[@]:1:3}

Mike Messi Jerry

# From subscript 1 to the end

[root@localhost shell]# echo ${array[@]:1}

Mike Messi Jerry Hanmeimei Wang

# Replace the first match in each element, then replace all matches

[root@localhost shell]# echo ${array[@]}

Allen Mike Messi Jerry Hanmeimei Wang

[root@localhost shell]# echo ${array[@]/e/E}

AllEn MikE MEssi JErry HanmEimei Wang

[root@localhost shell]# echo ${array[@]//e/E}

AllEn MikE MEssi JErry HanmEimEi Wang

# Iterate over the array

[root@localhost shell]# for a in "${array[@]}";do echo "$a";done

Allen

Mike

Messi

Jerry

Hanmeimei

Wang

    Compute the row sums and the column sums

    Allen

    Mike

    Zhang

    Jerry

    Han

    Li                

    [root@localhost shell]# awk -f stu.awk student.txt

    Name                Yuwen               Math                English             Physical            total

    Allen

    Mike

    Zhang

    Jerry

    Han

    Li

    every_total

    [root@localhost shell]# cat stu.awk

    BEGIN{
        printf "%-20s%-20s%-20s%-20s%-20s%-20s\n","Name","Yuwen","Math","English","Physical","total"
    }
    {
        # Row total for the current student
        total=$2+$3+$4+$5
        # Column totals
        yuwen_sum+=$2
        math_sum+=$3
        english_sum+=$4
        physical_sum+=$5
        printf "%-20s%-20d%-20d%-20d%-20d%-20d\n",$1,$2,$3,$4,$5,total
    }
    END{
        printf "%-20s%-20d%-20d%-20d%-20d\n","every_total",yuwen_sum,math_sum,english_sum,physical_sum
    }

# Script that simulates production data

[root@localhost shell]# cat insert.sh

#!/bin/bash
#
# Print a random integer between $1 and $2 (inclusive), derived from the clock.
function create_random()
{
    min=$1
    max=$(($2-$min+1))
    num=$(date +%s%N)
    echo $(($num%$max+$min))
}

INDEX=1

# Runs until interrupted, appending one log line per user per pass.
while true
do
    for user in Allen Mike Jerry Tracy Hanmeimei Lilei
    do
        COUNT=$RANDOM
        NUM1=`create_random 1 $COUNT`
        NUM2=`expr $COUNT - $NUM1`
        # Keep each log entry on one line so that awk sees one record per batch.
        echo "`date '+%Y-%m-%d %H:%M:%S'` $INDEX Batches: user $user insert $COUNT records into datebase:product table:detail, insert $NUM1 records successfully,failed $NUM2 records" >> ./db.log.`date +%Y%m%d`
        INDEX=`expr $INDEX + 1`
    done
done
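A bounded random helper can also be built on bash's `$RANDOM` instead of the clock. The sketch below is a hypothetical variant, not the helper from the script: the name `random_between` is made up, it guards against an empty range (where the modulo would otherwise divide by zero), and since `$RANDOM` tops out at 32767 it only suits ranges no wider than that.

```shell
#!/bin/bash
# Hypothetical variant: draw from bash's $RANDOM instead of the clock.
# Prints an integer in [min, max]; falls back to min when the range is empty.
random_between() {
    local min=$1 max=$2
    if [ "$max" -lt "$min" ]; then
        echo "$min"
        return
    fi
    echo $(( RANDOM % (max - min + 1) + min ))
}

value=$(random_between 1 10)
echo "$value"
```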

The log lines have the following format (the file is named db.log.YYYYMMDD, from date +%Y%m%d):

YYYY-MM-DD HH:MM:SS INDEX Batches: user NAME insert COUNT records into datebase:product table:detail, insert SUCCESS records successfully,failed FAILED records

...

With awk's default whitespace separator, NAME is field $6, COUNT is $8, SUCCESS is $14 and FAILED is $17.

1. Count how many records each user inserted into the database

    Expected output format:

    Name    totalrecords

    allen

    mike    

    [root@localhost shell]# awk -f exam1.awk  db.log.

    User                Total records

    Jerry

    Mike

    Lilei

    Hanmeimei

    Tracy

    Allen

    [root@localhost shell]# cat exam1.awk

    BEGIN{

        printf "%-20s%-20s\n","User","Total records"

    }

    {

        # $6 is the user name, $8 the number of records in this batch
        USER[$6]+=$8

    }

    END{

        for(u in USER)

            printf "%-20s%-20d\n",u,USER[u]

    }
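The per-user accumulation can be tried on a few inline lines before running it against a real log. In the sketch below the three log lines are hypothetical, written in the log format described earlier (user name in field 6, batch size in field 8); the output is sorted because `for (u in USER)` has no defined order.

```shell
#!/bin/bash
# Hypothetical log lines in the db.log format; the numbers are made up.
log='2024-01-01 10:00:00 1 Batches: user Allen insert 100 records into datebase:product table:detail, insert 90 records successfully,failed 10 records
2024-01-01 10:00:00 2 Batches: user Mike insert 50 records into datebase:product table:detail, insert 50 records successfully,failed 0 records
2024-01-01 10:00:01 3 Batches: user Allen insert 30 records into datebase:product table:detail, insert 20 records successfully,failed 10 records'

# Sum field 8 per user name (field 6).
totals=$(printf '%s\n' "$log" |
    awk '{ USER[$6] += $8 } END { for (u in USER) print u, USER[u] }' |
    sort)
echo "$totals"
```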

2. Count how many records each user inserted successfully and how many failed

    Expected output format:

    User    Success_record    failed_record

    jerry            

    success count: field $14

    failed count: field $17

    [root@localhost shell]# cat exam2.awk

    BEGIN{

        printf "%-30s%-30s%-30s\n","User","Success records","Failed records"

    }

    {

        SUCCESS[$6]+=$14

        FAILED[$6]+=$17

    }

    END{

        for(u in SUCCESS)

            printf "%-30s%-30d%-30d\n",u,SUCCESS[u],FAILED[u]

    }

    [root@localhost shell]# awk -f exam2.awk db.log.

    User                          Success records               Failed records

    Jerry

    Mike

    Lilei

    Hanmeimei

    Tracy

    Allen                                                

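3. Combine examples 1 and 2 into one report: for each user print the number of records inserted, succeeded and failed, formatted in columns with a header line

    Expected output format: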

    User    Total        success        failed

    tracy

    allen                

    Code:

    [root@localhost shell]# cat exam3.awk

    BEGIN{

        printf "%-30s%-30s%-30s%-30s\n","Name","total records","success records","failed records"

    }

    {

        TOTAL_RECORDS[$6]+=$8

        SUCCESS[$6]+=$14

        FAILED[$6]+=$17

    }

    END{

        for(u in TOTAL_RECORDS)

            printf "%-30s%-30d%-30d%-30d\n",u,TOTAL_RECORDS[u],SUCCESS[u],FAILED[u]

    }

    [root@localhost shell]# awk -f exam3.awk db.log.

    Name                          total records                 success records               failed records

    Jerry

    Mike

    Lilei

    Hanmeimei

    Tracy

    Allen                                                                      

4. Building on example 3, add a footer line with the overall number of inserted, successful and failed records

    Expected output format:

    User    Total        success        failed

    tracy

    allen                

    Method 1:

    [root@localhost shell]# cat exam4_b.awk

    BEGIN{

        printf "%-30s%-30s%-30s%-30s\n","Name","total records","success records","failed records"

    }

    {

        TOTAL_RECORDS[$6]+=$8

        SUCCESS[$6]+=$14

        FAILED[$6]+=$17

    }

    END{

        for(u in TOTAL_RECORDS)

        {

            # Accumulate the grand totals from the per-user result arrays

            records_sum+=TOTAL_RECORDS[u]

            success_sum+=SUCCESS[u]

            failed_sum+=FAILED[u]

            printf "%-30s%-30d%-30d%-30d\n",u,TOTAL_RECORDS[u],SUCCESS[u],FAILED[u]

        }

        printf "%-30s%-30d%-30d%-30d\n","",records_sum,success_sum,failed_sum

    }

    [root@localhost shell]# awk -f exam4_b.awk db.log.

    Name                          total records                 success records               failed records

    Jerry

    Mike

    Lilei

    Hanmeimei

    Tracy

    Allen                                                                                           

    Method 2:

    [root@localhost shell]# cat exam4.awk

    BEGIN{

        printf "%-30s%-30s%-30s%-30s\n","Name","total records","success records","failed records"

    }

    {

        RECORDS[$6]+=$8

        SUCCESS[$6]+=$14

        FAILED[$6]+=$17

        # Accumulate the grand totals directly from the raw log lines

        records_sum+=$8

        success_sum+=$14

        failed_sum+=$17

    }

    END{

        for(u in RECORDS)

            printf "%-30s%-30d%-30d%-30d\n",u,RECORDS[u],SUCCESS[u],FAILED[u]

        printf "%-30s%-30d%-30d%-30d\n","total",records_sum,success_sum,failed_sum

    }

    [root@localhost shell]# awk -f exam4.awk db.log.

    Name                          total records                 success records               failed records

    Jerry

    Mike

    Lilei

    Hanmeimei

    Tracy

    Allen

    total                                                                    
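Since `for (u in RECORDS)` visits keys in an unspecified order, the rows of the report can come out in a different order from one awk to another. One way to get a stable order, sketched below under the same field-position assumptions ($6, $8, $14, $17), is to record each user in a numerically indexed array the first time it appears. The log lines and the helper array `order` are made up for illustration.

```shell
#!/bin/bash
# Hypothetical log lines in the db.log format; the numbers are made up.
log='2024-01-01 10:00:00 1 Batches: user Allen insert 100 records into datebase:product table:detail, insert 90 records successfully,failed 10 records
2024-01-01 10:00:00 2 Batches: user Mike insert 50 records into datebase:product table:detail, insert 50 records successfully,failed 0 records
2024-01-01 10:00:01 3 Batches: user Allen insert 30 records into datebase:product table:detail, insert 20 records successfully,failed 10 records'

summary=$(printf '%s\n' "$log" | awk '
{
    if (!($6 in TOTAL)) order[++n] = $6   # remember first-seen order of users
    TOTAL[$6] += $8; SUCCESS[$6] += $14; FAILED[$6] += $17
    records_sum += $8; success_sum += $14; failed_sum += $17
}
END {
    for (i = 1; i <= n; i++) {
        u = order[i]
        print u, TOTAL[u], SUCCESS[u], FAILED[u]
    }
    print "total", records_sum, success_sum, failed_sum
}')
echo "$summary"
```

This combines a string-keyed array for the sums with a 1...n array for the ordering, which is the two kinds of subscript side by side.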

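5. Find lost data, that is, lines where the successful plus failed records do not add up to the total inserted; print the line number and the full log line for each

    Output and code: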

        [root@localhost shell]# awk '{if($8!=$14+$17) print NR,$0}' db.log.

         -- ::  Batches: user Hanmeimei insert  records into datebase:product table:detail, insert  records successfully,failed  records

         -- ::  Batches: user Mike insert  records into datebase:product table:detail, insert  records successfully,failed  records

    # The same check written as a script file

    [root@localhost shell]# awk -f exam5.awk db.log.

     -- ::  Batches: user Hanmeimei insert  records into datebase:product table:detail, insert  records successfully,failed  records

     -- ::  Batches: user Mike insert  records into datebase:product table:detail, insert  records successfully,failed  records

    [root@localhost shell]# cat exam5.awk

    BEGIN{

    }

    {

        # Total ($8) should equal success ($14) plus failed ($17)
        if($8!=$14+$17)
            print NR,$0

    }
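The consistency check can be exercised on a small inline sample in which one line is deliberately inconsistent. The log lines below are hypothetical; the second one reports 50 inserted but only 40 + 5 accounted for. The condition is written as an awk pattern, which is equivalent to the `if` form.

```shell
#!/bin/bash
# Hypothetical log lines; line 2 is deliberately inconsistent (50 != 40 + 5).
log='2024-01-01 10:00:00 1 Batches: user Allen insert 100 records into datebase:product table:detail, insert 90 records successfully,failed 10 records
2024-01-01 10:00:00 2 Batches: user Mike insert 50 records into datebase:product table:detail, insert 40 records successfully,failed 5 records
2024-01-01 10:00:01 3 Batches: user Jerry insert 30 records into datebase:product table:detail, insert 20 records successfully,failed 10 records'

# Print the line number and user of every record whose counts do not add up.
bad_lines=$(printf '%s\n' "$log" | awk '$8 != $14 + $17 { print NR, $6 }')
echo "$bad_lines"
```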
