gawk Usage

I. Overview of awk
II. How gawk Works
III. The gawk Command

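I. Overview of awk

The name AWK is formed from the initials of its three creators: Aho, Weinberger, and Kernighan.

awk is a powerful text-processing tool. It treats a text file like a database: each line (record) is split into fields, individual fields or whole lines can be processed, and the results can be printed in a chosen format. In that sense awk is a text report generator that formats text. To process only certain lines or fields, select them with a pattern (PATTERN).

awk is also a complete programming language in its own right. It offers the usual features of one: variables, arrays, conditionals, loops, and built-in functions. Even so, as noted above, its main use is generating reports from text.

gawk is the GNU Project's open-source implementation of the awk interpreter, while nawk (new awk) is the revised version developed in the 1980s. Both gawk and nawk extend the original oawk (old awk) with additional features. Because awk and gawk are what most people use in practice, this article covers gawk.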
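
II. How gawk Works

awk first reads one line from the input text and splits it into fields. Each field is available through a field reference ($1, $2, $3, ...), and the whole current line is available as $0. awk then tests the line, or individual fields of it, against the pattern (PATTERN) supplied by the user, and applies the user's action statements (Action) to whatever matched. Results go to standard output by default. awk then reads the next line and repeats the cycle until the input is exhausted.

This cycle is what gives awk its text-processing power. The pattern selects which lines or fields to handle; the action says how to transform the data and in what format to print it. Inside an action, conditional statements can narrow the selection further, and loop statements can walk over the fields of a line or the elements of an array. Note that awk has a built-in implicit loop: it visits every line of the input automatically.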
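
The read, split, match, act cycle described above can be seen in a minimal sketch. The two input lines and the shell variable `out` are made up for illustration; the program has no pattern, so the action runs once per line without any explicit loop.

```shell
# Two sample records fed on standard input (illustrative data, not a real file).
out=$(printf 'alpha beta gamma\ndelta epsilon\n' |
  awk '{ print "line " NR ": $0=[" $0 "] $1=[" $1 "] fields=" NF }')
echo "$out"
# line 1: $0=[alpha beta gamma] $1=[alpha] fields=3
# line 2: $0=[delta epsilon] $1=[delta] fields=2
```

Inside the awk string literals, `$0` and `$1` are plain text; only the unquoted `$0`, `$1`, `NR`, and `NF` are evaluated.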
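
III. The gawk Command

Synopsis:
gawk - pattern scanning and processing language
gawk is a text-processing tool built around pattern scanning; it is also a programming language.

Syntax:
gawk [options] 'program' FILE ...

Common options:
-F: specifies the input field separator;
-v var=value: defines a variable before the program runs;

program:
program: PATTERN{ACTION STATEMENTS}
①PATTERN: the pattern
②ACTION STATEMENTS: the action statements

PATTERN:
(1) empty: the empty pattern, which matches every line;
(2) /regular expression/: processes only the lines matched by the regular expression;
(3) relational expression: evaluates to true or false; the action runs only when the result is true;
(4) line ranges: selects a range of lines;
(5) BEGIN/END patterns:
①BEGIN{}: runs once, before any input is read;
②END{}: runs once, after all input has been processed and before the command exits;
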
The following data, extracted from the output of the netstat command, serves as the sample input:
<code class="language-bash">
[root@localhost ~]# cat netstat.txt
Proto Recv-Q Send-Q Local-Address Foreign-Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 10.10.10.140:49808 10.10.10.140:80 TIME_WAIT
tcp 0 52 10.10.10.140:22 10.10.10.1:52641 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:51926 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:52640 ESTABLISHED
tcp 0 0 10.10.10.140:49806 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49804 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49810 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49812 10.10.10.140:80 TIME_WAIT
</code>

Key points:
1. $1, $2, ... $n refer to the 1st, 2nd, ... nth field of the current line.
2. In a print statement, separate the items with commas; on output they are joined with a single space by default.

(1) empty: the empty pattern, which matches every line
Example:
Print columns 1 and 4:
<code class="language-bash">
[root@localhost ~]# awk '{print $1,$4}' netstat.txt
Proto Local-Address
tcp 0.0.0.0:22
tcp 127.0.0.1:25
tcp 10.10.10.140:49808
tcp 10.10.10.140:22
tcp 10.10.10.140:22
tcp 10.10.10.140:22
tcp 10.10.10.140:49806
tcp 10.10.10.140:49804
tcp 10.10.10.140:49810
tcp 10.10.10.140:49812
</code>
Because the pattern is empty, every line is matched.

(2) /regular expression/: processes only the lines matched by the regular expression
Example:
Print the lines of netstat.txt that begin with tcp (in gawk, \> matches the end of a word, so a line starting with tcp6 would not match):
<code class="language-bash">
[root@localhost ~]# awk '/^tcp\>/' netstat.txt
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 10.10.10.140:49808 10.10.10.140:80 TIME_WAIT
tcp 0 52 10.10.10.140:22 10.10.10.1:52641 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:51926 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:52640 ESTABLISHED
tcp 0 0 10.10.10.140:49806 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49804 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49810 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49812 10.10.10.140:80 TIME_WAIT
</code>

Print the first field of the lines in /etc/fstab that begin with 'UUID':
<code class="language-bash">
[root@localhost ~]# awk '/^UUID/{print $1}' /etc/fstab
UUID=60eb0d1c-9834-4348-9a79-2f91983a8ede
UUID=3995456f-a3e9-4b69-a0de-0fd48068da39
UUID=01d24b59-eee4-472e-9985-ef44fd5e059c
[root@localhost ~]#
</code>

Print the first field of the lines in /etc/fstab that do not begin with 'UUID':
<code class="language-bash">
[root@localhost ~]# awk '!/^UUID/{print $1}' /etc/fstab
#
#
#
#
#
#
#
tmpfs
devpts
sysfs
proc
/dev/sr0
#/dev/md0
[root@localhost ~]#
</code>
Note: to negate a pattern, put '!' in front of it.

(3) relational expression: evaluates to true or false; the action runs only when the result is true

What counts as true? Any nonzero number or non-empty string.

Example:
Print the user name, UID, and shell of every user in /etc/passwd whose last field is '/bin/bash':
<code class="language-bash">
Method 1:
[root@localhost ~]# awk -F: '$NF=="/bin/bash"{print $1,$3,$NF}' /etc/passwd
root 0 /bin/bash
logstash 500 /bin/bash
centos 501 /bin/bash
Method 2:
[root@localhost ~]# awk -F: '$NF~/\/bin\/bash$/{print $1,$3,$NF}' /etc/passwd
root 0 /bin/bash
logstash 500 /bin/bash
centos 501 /bin/bash
</code>
(4) line ranges: selects a range of lines
Format:
startline,endline: /pat1/,/pat2/

Note: bare numbers are not treated as line numbers. Each end of the range must be a pattern or an expression, so to select by line number, test NR explicitly.

Example:
Print the first field of lines 2 through 10 of /etc/passwd:
<code class="language-bash">
[root@localhost ~]# awk -F: '(NR>=2&&NR<=10){print $1}' /etc/passwd
bin
daemon
adm
lp
sync
shutdown
halt
mail
uucp
[root@localhost ~]#
</code>
Incorrect example:
<code class="language-bash">
[root@localhost ~]# awk -F: '2,10{print $1}' /etc/passwd
</code>
This does not select lines 2 through 10. awk evaluates 2 and 10 as expressions, and both are nonzero and therefore always true, so the range starts and ends on every line and every line is printed. To write a range in terms of line numbers, use 'NR==2,NR==10'.
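
To make the behaviour of range patterns concrete, here is a small sketch on six made-up input lines. The names `data`, `by_nr`, `by_re`, and `bare` are hypothetical shell variables used only for this illustration.

```shell
# Six sample lines (hypothetical input).
data=$(printf 'l1\nl2 START\nl3\nl4 END\nl5\nl6\n')

# Range by line number: compare NR explicitly at both ends.
by_nr=$(printf '%s\n' "$data" | awk 'NR==2,NR==4 { print $1 }')

# Range by regular expressions: from a START line through the next END line.
by_re=$(printf '%s\n' "$data" | awk '/START/,/END/ { print $1 }')

# Bare numbers are constant-true expressions, so every line falls in the range.
bare=$(printf '%s\n' "$data" | awk '2,4 { n++ } END { print n }')

echo "$by_nr"
echo "$by_re"
echo "lines selected by 2,4: $bare"
```

Both explicit ranges select l2, l3, and l4, while the bare-number form counts all six lines.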

(5) BEGIN/END patterns:
①BEGIN{}: runs once, before any input is read;
②END{}: runs once, after all input has been processed and before the command exits;

Example:
Set the field separator to ':':
<code class="language-bash">
[root@localhost ~]# awk 'BEGIN{FS=":"}{print $1,$3,$6}' /etc/passwd
root 0 /root
bin 1 /bin
daemon 2 /sbin
adm 3 /var/adm
lp 4 /var/spool/lpd
sync 5 /sbin
shutdown 6 /sbin
halt 7 /sbin
mail 8 /var/spool/mail
</code>

Count the TCP connections on the current host whose state is "LISTEN":
<code class="language-bash">
[root@localhost ~]# netstat -tan
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 52 10.10.10.140:22 10.10.10.1:51926 ESTABLISHED
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 ::1:25 :::* LISTEN
[root@localhost ~]#
[root@localhost ~]# netstat -tan | awk '$NF=="LISTEN"{i++}END{print i}'
5
[root@localhost ~]#
</code>
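Since built-in variables such as NF have come up, here are the ones awk provides:

| Variable | Meaning |
| --- | --- |
| FS | input field separator; the default is whitespace |
| OFS | output field separator; the default is a single space |
| RS | input record separator; the default is \n |
| ORS | output record separator; the default is \n |
| NF | number of fields in the current line |
| NR | number of records read so far, i.e. the number of the line being processed |
| FNR | record number within the current file; counted separately for each file |
| FILENAME | name of the current input file |
| ARGC | number of command-line arguments |
| ARGV | array holding the command-line arguments |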
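
The four separator variables in the table can be exercised together. The sketch below uses made-up input in which records are separated by ';' and fields by ','; the shell variable `out` is hypothetical.

```shell
# Records end at ';' and fields at ',' in this sample input (illustrative data).
out=$(printf 'a,1;b,2;c,3' |
  awk 'BEGIN { FS = ","; RS = ";"; OFS = "="; ORS = " | " }
       { print $1, $2 }        # OFS joins the items, ORS ends each record
       END { printf "\n" }')   # finish with a real newline
echo "$out"
# a=1 | b=2 | c=3 |
```

The input deliberately has no trailing newline, since with RS set to ';' a trailing newline would become part of the last field.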

Example:
Print the first field of each line of /etc/passwd:
<code class="language-bash">
[root@localhost ~]# awk -v FS=":" '{print $1}' /etc/passwd
root
bin
daemon
adm
lp
sync
shutdown
halt
</code>
The result is the same as giving ':' as the separator with the -F option.

Print fields 1, 3, and 7 of each line of /etc/passwd, joined with colons:
<code class="language-bash">
[root@localhost ~]# awk -v FS=":" -v OFS=":" '{print $1,$3,$7}' /etc/passwd
root:0:/bin/bash
bin:1:/sbin/nologin
daemon:2:/sbin/nologin
adm:3:/sbin/nologin
lp:4:/sbin/nologin
sync:5:/bin/sync
shutdown:6:/sbin/shutdown
halt:7:/sbin/halt
</code>

Print the number of fields in each line of /etc/fstab:
<code class="language-bash">
[root@localhost ~]# awk '{print NF}' /etc/fstab
0
1
2
10
1
9
12
1
6
6
6
6
</code>

Print the last field of each line of netstat.txt:
<code class="language-bash">
[root@localhost ~]# awk '{print $NF}' netstat.txt
State
LISTEN
LISTEN
TIME_WAIT
ESTABLISHED
ESTABLISHED
ESTABLISHED
TIME_WAIT
TIME_WAIT
TIME_WAIT
TIME_WAIT
</code>
Note the difference between NF and $NF. In awk, variables are referenced without a leading "$"; $1, $2, ... are not variables but awk's notation for the fields of the current line. NF is the number of fields in the current line, whereas $NF uses the value of NF as the field index, so it refers to the last field.

Print the line number of each line of netstat.txt:
<code class="language-bash">
[root@localhost ~]# awk '{print NR}' netstat.txt
1
2
3
4
5
6
7
8
9
10
11
</code>

Printed for every line, NR gives the current line number; printed once in the END block, after the whole file has been processed, it gives the total number of lines:
<code class="language-bash">
[root@localhost ~]# awk 'END{print NR}' netstat.txt
11
</code>

Print line numbers across /etc/fstab and /etc/inittab:
<code class="language-bash">
[root@localhost ~]# awk '{print NR}' /etc/fstab /etc/inittab
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
</code>
The two files are numbered as if they were one. To count each file separately, use the built-in variable FNR:
<code class="language-bash">
[root@localhost ~]# awk '{print FNR}' /etc/fstab /etc/inittab
1
2
3
4
5
6
7
8
9
10
11
12
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
</code>
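
The difference between NR and FNR is easiest to see when both are printed next to FILENAME. This sketch creates two throwaway files in a temporary directory; the file names `one.txt` and `two.txt` and their contents are invented for the illustration.

```shell
# Create two small sample files in a temporary directory (hypothetical content).
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/one.txt"
printf 'c\nd\ne\n' > "$tmp/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR }' one.txt two.txt)
echo "$out"
# one.txt 1 1
# one.txt 2 2
# two.txt 3 1
# two.txt 4 2
# two.txt 5 3

rm -rf "$tmp"
```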
Print the number of command-line arguments:
<code class="language-bash">
[root@localhost ~]# awk 'BEGIN{print ARGC}' /etc/fstab /etc/inittab
3
</code>

The command-line arguments are stored in awk's built-in array ARGV and can be referenced as ARGV[#]:
<code class="language-bash">
[root@localhost ~]# awk 'BEGIN{print ARGV[0]}' /etc/fstab /etc/inittab
awk
[root@localhost ~]# awk 'BEGIN{print ARGV[1]}' /etc/fstab /etc/inittab
/etc/fstab
[root@localhost ~]# awk 'BEGIN{print ARGV[2]}' /etc/fstab /etc/inittab
/etc/inittab
</code>
Note: ARGV[0] is the command name awk. The 'program' text is not stored in ARGV, which is why ARGC is 3 here.

Besides the built-in variables, awk supports user-defined variables. There are two ways to define one:
(1) with the option -v var=value;
(2) directly inside the program;

Demonstration:
Defining a variable with the -v option:
<code class="language-bash">
[root@localhost ~]# awk -v test='hello awk' '{print test}' netstat.txt
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
hello awk
</code>
'hello awk' is printed once for every line of the file. Note again that a variable is referenced without '$'.

To print it only once instead of once per line:
<code class="language-bash">
[root@localhost ~]# awk -v test='hello awk' 'BEGIN{print test}' netstat.txt
hello awk
</code>

Defining a variable inside the program:
<code class="language-bash">
[root@localhost ~]# awk 'BEGIN{test="hello"; print test}'
hello
</code>
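
A common reason to use -v is to hand a shell value to the awk program. The following sketch assumes a shell variable named `min_uid` and a few made-up name:uid records; neither comes from the original examples.

```shell
# A shell variable (hypothetical threshold) passed into awk through -v.
min_uid=1000
out=$(printf 'root:0\ntab:1000\ncentos:1001\n' |
  awk -F: -v min="$min_uid" '$2 >= min { print $1 }')
echo "$out"
# tab
# centos
```

Because the awk program sits inside single quotes, the shell does not expand anything in it; `min` is an ordinary awk variable and is referenced without '$'.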
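
Next, the Actions most commonly used in awk:

(1) The print statement:
Format:
print item1, item2, ...

Key points:
①Items are separated by commas in the command; on output they are joined with a single space by default;
②Each item can be a string, a number, a field of the current record, a variable, an array element, or any awk expression;
③A numeric item is converted to a string for output, but remains a number in calculations;
④If no item is given, the effect is the same as 'print $0'.
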
Example:
Print fields 2 and 4 of each of the last 4 lines of /etc/fstab:
<code class="language-bash">
[root@localhost ~]# tail -4 /etc/fstab | awk '{print $2,$4}'
/ defaults
/boot defaults
/home defaults
swap defaults
</code>
Note: without the comma, awk treats the two fields as a single expression and concatenates them:
<code class="language-bash">
[root@localhost ~]# tail -4 /etc/fstab | awk '{print $2 $4}'
/defaults
/bootdefaults
/homedefaults
swapdefaults
</code>

Insert a string before the printed fields:
<code class="language-bash">
[root@localhost ~]# tail -4 /etc/fstab | awk '{print "hello:",$2,$4}'
hello: / defaults
hello: /boot defaults
hello: /home defaults
hello: swap defaults
</code>

Note that fields and variables are evaluated only outside double quotes. Inside a quoted string, $2 is just literal text, so do not quote a field or variable whose value you want:
<code class="language-bash">
[root@localhost ~]# tail -4 /etc/fstab | awk '{print "hello:",$2}'
hello: /
hello: /boot
hello: /home
hello: swap
[root@localhost ~]# tail -4 /etc/fstab | awk '{print "hello:,$2"}'
hello:,$2
hello:,$2
hello:,$2
hello:,$2
</code>

Print each line as it is:
<code class="language-bash">
[root@localhost ~]# awk '{print}' netstat.txt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 10.10.10.140:49808 10.10.10.140:80 TIME_WAIT
tcp 0 52 10.10.10.140:22 10.10.10.1:52641 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:51926 ESTABLISHED
tcp 0 0 10.10.10.140:22 10.10.10.1:52640 ESTABLISHED
tcp 0 0 10.10.10.140:49806 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49804 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49810 10.10.10.140:80 TIME_WAIT
tcp 0 0 10.10.10.140:49812 10.10.10.140:80 TIME_WAIT
[root@localhost ~]#
</code>
A bare print is equivalent to 'print $0'.

<code class="language-bash">
[root@localhost ~]# awk '{print ""}' netstat.txt
[root@localhost ~]#
</code>
awk still visits every line of the file, so it prints one empty line for each line of input.

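(2) The printf statement:
Syntax:
printf "FORMAT", item1, item2, ...

Purpose:
Formatted output: each item is substituted, in order, into the corresponding position of "FORMAT".

Key points:
①"FORMAT" is mandatory;
②No newline is added automatically; write the newline escape "\n" explicitly;
③"FORMAT" must contain one format specifier for each item that follows.

Format specifiers:
%c: prints a single character (a numeric argument is interpreted as a character code);
%d, %i: prints a decimal integer;
%e, %E: prints in scientific notation;
%f: prints a floating-point number;
%g, %G: prints in scientific or floating-point notation, whichever is shorter;
%s: prints a string;
%u: prints an unsigned integer;
%%: prints a literal %.

Modifiers:
Each format specifier can take modifiers, written between the % and the specifier letter, that control how the value is displayed.
Common modifiers:
#[.#]: the first number # is the display width, the second # is the precision after the decimal point;
-: left-aligns the value;
+: always prints the sign of a numeric value.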
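
The modifiers are easier to compare side by side. The sketch below prints each value between square brackets so that the padding is visible; the values and the shell variable `out` are chosen only for illustration.

```shell
out=$(awk 'BEGIN {
  printf "[%8s]\n",   "awk"      # width 8, right-aligned by default
  printf "[%-8s]\n",  "awk"      # "-" left-aligns within the width
  printf "[%8.2f]\n", 3.14159    # width 8, 2 digits after the decimal point
  printf "[%+d]\n",   42         # "+" prints the sign even for positive numbers
}')
echo "$out"
# [     awk]
# [awk     ]
# [    3.14]
# [+42]
```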

Example:
Format the first column of /etc/passwd:
<code class="language-bash">
[root@localhost ~]# awk -F: '{printf "Username: %s\n",$1}' /etc/passwd
Username: root
Username: bin
Username: daemon
Username: adm
Username: lp
Username: sync
Username: shutdown
Username: halt
</code>

Format the first and third columns of /etc/passwd:
<code class="language-bash">
[root@localhost ~]# awk -F: '{printf "Username: %-18s UID: %d\n",$1,$3}' /etc/passwd
Username: root               UID: 0
Username: bin                UID: 1
Username: daemon             UID: 2
Username: adm                UID: 3
Username: lp                 UID: 4
Username: sync               UID: 5
Username: shutdown           UID: 6
Username: halt               UID: 7
</code>
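(3) Operators
Operators can appear in a PATTERN or in an Action. The main groups are:

①Arithmetic operators
Binary:
x+y, x-y, x*y, x/y, x%y, x^y

Unary:
-x: negates the value;
+x: converts a string to a number;

②String operators
String slicing: use gawk's built-in functions;
No operator symbol: writing two values side by side concatenates them;

③Comparison operators
>, >=
<, <=
==, !=

④Assignment operators
=, +=, -=, *=, /=, %=, ^=
++, -- (increment and decrement)

⑤Pattern-matching operators
~: true if the string on the left is matched by the PATTERN on the right;
!~: true if the string on the left is not matched by the PATTERN on the right;

⑥Logical operators
&&
||
!

⑦Function calls
Without arguments:
function_name()

With arguments:
function_name(argu1, argu2, ...)

⑧Conditional expression
selector?if-true-expression:if-false-expression

Explanation:
"selector" is a condition. If it is true, "if-true-expression" is evaluated; otherwise "if-false-expression" is evaluated.
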
Example:
Print every user name on the system together with its user type:
<code class="language-bash">
[root@localhost ~]# awk -F: '{$3>=1000?usertype="Common User":usertype="Sysadmin or SysUser";printf "Username: %-18s Usertype: %s\n",$1,usertype}' /etc/passwd
Username: root               Usertype: Sysadmin or SysUser
Username: bin                Usertype: Sysadmin or SysUser
Username: daemon             Usertype: Sysadmin or SysUser
Username: adm                Usertype: Sysadmin or SysUser
Username: lp                 Usertype: Sysadmin or SysUser
Username: sync               Usertype: Sysadmin or SysUser
Username: shutdown           Usertype: Sysadmin or SysUser
Username: halt               Usertype: Sysadmin or SysUser
Username: mail               Usertype: Sysadmin or SysUser
Username: operator           Usertype: Sysadmin or SysUser
Username: games              Usertype: Sysadmin or SysUser
Username: ftp                Usertype: Sysadmin or SysUser
Username: nobody             Usertype: Sysadmin or SysUser
Username: systemd-bus-proxy  Usertype: Sysadmin or SysUser
Username: systemd-network    Usertype: Sysadmin or SysUser
Username: dbus               Usertype: Sysadmin or SysUser
Username: polkitd            Usertype: Sysadmin or SysUser
Username: tss                Usertype: Sysadmin or SysUser
Username: postfix            Usertype: Sysadmin or SysUser
Username: sshd               Usertype: Sysadmin or SysUser
Username: tab                Usertype: Common User
Username: geoclue            Usertype: Sysadmin or SysUser
Username: apache             Usertype: Sysadmin or SysUser
Username: centos             Usertype: Common User
</code>

Operators can be used in a PATTERN as well. For example, print the user name and UID of every user whose UID is greater than or equal to 1000, using a relational expression:
<code class="language-bash">
[root@localhost ~]# awk -F: '$3>=1000{printf "Username: %-10s UID: %s\n",$1,$3}' /etc/passwd
Username: tab        UID: 1000
Username: centos     UID: 1001
</code>
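
Several of the operator groups listed in this section can be combined in one short program. This is only a sketch on two made-up name:uid records; the variable names `kind`, `label`, and `even` are hypothetical, and the conditional expression is used here to produce a value that is then assigned.

```shell
out=$(printf 'tab:1000\nsshd:74\n' | awk -F: '{
  kind  = ($2 >= 1000) ? "common" : "system"   # conditional expression
  label = $1 "(" $2 ")"                        # juxtaposition concatenates strings
  even  = ($2 % 2 == 0) ? "even" : "odd"       # arithmetic plus comparison
  if ($1 ~ /^ss/) label = label "*"            # ~ tests a regular-expression match
  print label, kind, even
}')
echo "$out"
# tab(1000) common even
# sshd(74)* system even
```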
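(4) if-else conditionals
Syntax:
if(condition) {statements}
if(condition) {statements} else {statements}

Note:
If "{statements}" contains a single statement, the braces '{}' may be omitted; if it contains several statements, the braces are required. The same rule applies to the other statements below.

When to use:
When a condition has to be tested on the whole line or on a particular field that awk has read.
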
Usage examples:
As in the earlier example, label each user as a Common User when the UID is greater than or equal to 1000 and as Sysadmin or SysUser otherwise, this time with an if-else statement:
<code class="language-bash">
[root@localhost ~]# awk -F: '{if($3>=1000) {printf "Username: %-18s Usertype: Common User\n",$1} else {printf "Username: %-18s Usertype: Sysadmin or SysUser\n",$1}}' /etc/passwd
Username: root               Usertype: Sysadmin or SysUser
Username: bin                Usertype: Sysadmin or SysUser
Username: daemon             Usertype: Sysadmin or SysUser
Username: adm                Usertype: Sysadmin or SysUser
Username: lp                 Usertype: Sysadmin or SysUser
Username: sync               Usertype: Sysadmin or SysUser
Username: shutdown           Usertype: Sysadmin or SysUser
Username: halt               Usertype: Sysadmin or SysUser
Username: mail               Usertype: Sysadmin or SysUser
Username: operator           Usertype: Sysadmin or SysUser
Username: games              Usertype: Sysadmin or SysUser
Username: ftp                Usertype: Sysadmin or SysUser
Username: nobody             Usertype: Sysadmin or SysUser
Username: systemd-bus-proxy  Usertype: Sysadmin or SysUser
Username: systemd-network    Usertype: Sysadmin or SysUser
Username: dbus               Usertype: Sysadmin or SysUser
Username: polkitd            Usertype: Sysadmin or SysUser
Username: tss                Usertype: Sysadmin or SysUser
Username: postfix            Usertype: Sysadmin or SysUser
Username: sshd               Usertype: Sysadmin or SysUser
Username: tab                Usertype: Common User
Username: geoclue            Usertype: Sysadmin or SysUser
Username: apache             Usertype: Sysadmin or SysUser
Username: centos             Usertype: Common User
</code>

Print the names of the users whose default shell is "/bin/bash":
<code class="language-bash">
[root@localhost ~]# awk -F: '{if($NF~/\/bin\/bash$/) print $1}' /etc/passwd
root
tab
centos
Or:
[root@localhost ~]# awk -F: '{if($NF=="/bin/bash") print $1}' /etc/passwd
root
tab
centos
</code>
The first form uses the pattern-matching operator, the second the comparison operator.

With whitespace as the separator, print the lines of /etc/fstab that have 10 or more fields:
<code class="language-bash">
[root@localhost ~]# awk '{if(NF>=10) print}' /etc/fstab
# Created by anaconda on Sun Feb 19 10:02:11 2017
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
Or:
[root@localhost ~]# awk 'NF>=10' /etc/fstab
# Created by anaconda on Sun Feb 19 10:02:11 2017
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
</code>

Print the mounted devices whose space usage is 20% or higher:
<code class="language-bash">
[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 50G 9.6G 41G 20% /
devtmpfs 982M 0 982M 0% /dev
tmpfs 993M 0 993M 0% /dev/shm
tmpfs 993M 8.7M 984M 1% /run
tmpfs 993M 0 993M 0% /sys/fs/cgroup
/dev/sda1 1014M 121M 894M 12% /boot
/dev/mapper/cl-home 67G 33M 67G 1% /home
/dev/sr0 7.8G 7.8G 0 100% /media/cdrom
tmpfs 199M 0 199M 0% /run/user/0
[root@localhost ~]#
[root@localhost ~]# df -h | awk -F% '{print $1}' | awk '/\/dev/{if($NF>=20) print $1}'
/dev/mapper/cl-root
/dev/sr0
</code>
(5) while loops
Syntax:
while(condition) {statements}

Loop condition:
If the condition is true, the loop body runs;
if the condition is false, the loop exits.

When to use:
①To apply the same processing to each of several fields in a line;
②To process the elements of an array one by one.

Usage examples:
Select the lines of /etc/grub2.cfg that begin with linux16 (possibly preceded by whitespace), and print each field of those lines together with its length in characters:
<code class="language-bash">
[root@localhost ~]# awk '/^[[:space:]]*linux16/{i=1; while(i<=NF) {print $i,length($i);i++}}' /etc/grub2.cfg
linux16 7
/vmlinuz-3.10.0-514.el7.x86_64 30
root=/dev/mapper/cl-root 24
ro 2
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
rhgb 4
quiet 5
LANG=en_US.UTF-8 16
linux16 7
/vmlinuz-0-rescue-c3d796ed1ad340e09e3d9024cd0350bf 50
root=/dev/mapper/cl-root 24
ro 2
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
rhgb 4
quiet 5
</code>
Note: this calls awk's built-in function length().

Going one step further, print only the fields longer than 7 characters:
<code class="language-bash">
[root@localhost ~]# awk '/^[[:space:]]*linux16/{i=1; while(i<=NF) {if(length($i)>7) {print $i,length($i)}; i++}}' /etc/grub2.cfg
/vmlinuz-3.10.0-514.el7.x86_64 30
root=/dev/mapper/cl-root 24
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
LANG=en_US.UTF-8 16
/vmlinuz-0-rescue-c3d796ed1ad340e09e3d9024cd0350bf 50
root=/dev/mapper/cl-root 24
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
</code>
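(6) do-while loops
Syntax:
do {statements} while(condition)

Explanation:
The body runs once regardless of the condition; only then is condition tested to decide whether to repeat.

When to use:
When the body must run at least once.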
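
A brief sketch of the do-while form, on made-up input. The first command walks the fields of one line backwards; the second shows that the body runs once even though its condition is false from the start. The shell variables `out` and `once` are hypothetical.

```shell
# Print the fields of one sample line in reverse order.
out=$(printf 'one two three\n' | awk '{
  i = NF
  do {
    print i, $i     # the body runs before the condition is first tested
    i--
  } while (i > 0)
}')
echo "$out"
# 3 three
# 2 two
# 1 one

# The condition n < 0 is never true, yet the body has already run once.
once=$(awk 'BEGIN { n = 0; do { n++ } while (n < 0); print n }')
echo "$once"
# 1
```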


(7) for loops
Syntax:
for(expr1;expr2;expr3) {statements}

In detail:
for(variable assignment;condition;iteration process) {for-body}

Special form:
for(var in array) {for-body}

Purpose: iterates over the indices of the elements of an array;
Note: var is a variable, array is the array name, and {for-body} is the loop body. As mentioned earlier, awk's implicit loop already visits every line of the input; a for loop is what iterates over the fields within one line.

Usage example:
The same task as before: select the lines of /etc/grub2.cfg that begin with linux16 (possibly preceded by whitespace), and print each field of those lines together with its length in characters:
<code class="language-bash">
[root@localhost ~]# awk '/^[[:space:]]*linux16/{for(i=1;i<=NF;i++) {print $i,length($i)}}' /etc/grub2.cfg
linux16 7
/vmlinuz-3.10.0-514.el7.x86_64 30
root=/dev/mapper/cl-root 24
ro 2
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
rhgb 4
quiet 5
LANG=en_US.UTF-8 16
linux16 7
/vmlinuz-0-rescue-c3d796ed1ad340e09e3d9024cd0350bf 50
root=/dev/mapper/cl-root 24
ro 2
rd.lvm.lv=cl/root 17
rd.lvm.lv=cl/swap 17
rhgb 4
quiet 5
</code>
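(8) switch statement
Syntax:
switch(expression) {case VALUE1 or /REGEXP1/: statement; case VALUE2 or /REGEXP2/: statement; case VALUE3 or /REGEXP3/: statement; default: statement}
Note: switch is a gawk extension. As in C, execution falls through into the following case unless the statement ends with break.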
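A minimal sketch of a switch with a string case, a regexp case, and a default. Because switch exists only in gawk, the snippet first checks that a `gawk` binary is available; the input values and the labels printed are made up for illustration.

```bash
# switch is gawk-only, so fall back to a message when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    kinds=$(printf 'LISTEN\nTIME_WAIT\nESTABLISHED\n' | gawk '{
        switch ($1) {
        case "LISTEN":            # exact string match
            print $1, "listening socket"
            break                 # without break, execution falls into the next case
        case /^TIME_/:            # regexp match against the switch expression
            print $1, "closing down"
            break
        default:
            print $1, "other state"
        }
    }')
else
    kinds="gawk not found"
fi
echo "$kinds"
```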
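
(9) break and continue
Purpose:
break: exits the innermost enclosing loop (awk's break takes no level argument; the "break n" form belongs to the shell, not to awk);
continue: ends the current iteration early and moves on to the next iteration of the loop;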
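The following sketch shows both statements inside a nested loop; the loop bounds and the variable name `pairs` are arbitrary choices for the illustration. Note that break ends only the inner loop, so the outer loop still runs all three times.

```bash
pairs=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++) {
        for (j = 1; j <= 5; j++) {
            if (j == 2) continue    # skip the rest of this iteration
            if (j == 4) break       # leave the inner loop; the outer loop goes on
            print i "-" j
        }
    }
}')
echo "$pairs"
```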

(10) next
Purpose:
next: stops processing the current line immediately and moves on to the next line.

Difference between next and continue:
continue acts on an explicit loop inside an action (for example a loop over the fields of a line) and skips to its next iteration, whereas next acts on awk's implicit per-line loop: it abandons the current line and starts processing the next one.

Example:
Display the user name and UID of every user whose UID is even:
<code class="language-bash">[root@localhost ~]# awk -F: '{if($3%2!=0) next; print $1,$3}' /etc/passwd
root 0
daemon 2
lp 4
shutdown 6
mail 8
games 12
ftp 14
systemd-network 192
polkitd 998
sshd 74
tab 1000
apache 48</code>

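The same filter can be tried without touching /etc/passwd. The four records below are invented passwd-style sample data, and `even_uids` is an illustrative variable name; the program is split into two rules so the effect of next is easy to see.

```bash
even_uids=$(printf 'root:x:0:0\nbin:x:1:1\ndaemon:x:2:2\nadm:x:3:4\n' |
    awk -F: '
        $3 % 2 != 0 { next }    # odd UID: abandon this record, read the next one
        { print $1, $3 }        # reached only for even UIDs
    ')
echo "$even_uids"
```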
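(11) Arrays (array)
Defining an array:
array[index-expression]

Note: array is a user-chosen array name and index-expression is the index expression. All awk arrays are associative: numeric indices can be used, but they are stored as strings, and string indices are the more common choice in practice.

Deleting:
delete array[index-expression]: deletes one element of the array;
delete array: deletes the whole array.

Key points:
① awk arrays are associative, so any string can serve as an index; string literals must be enclosed in double quotes.
② If an element does not yet exist, referencing it makes awk create it automatically with an uninitialized value, which behaves as the empty string in string context and as 0 in numeric context.
③ To test whether an element exists, do not reference it directly, because the reference itself creates the element (and once created, it exists); use the form "index in array" instead.
④ To traverse all elements of an array, use the for loop: for(var in array) {for-body}.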
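Points ② and ③ can be checked with a short sketch. The array name `seen` and the helper variables are hypothetical; the program prints three membership tests taken before a reference, after it, and after a delete.

```bash
membership=$(awk 'BEGIN {
    seen["a"] = "hello"
    before = ("b" in seen)     # 0: "b" is not in the array yet
    x = seen["b"]              # merely reading seen["b"] creates the element
    after = ("b" in seen)      # 1: the read above created it
    delete seen["b"]
    deleted = ("b" in seen)    # 0: delete removed it again
    print before, after, deleted
}')
echo "$membership"
```

This is why existence checks should use the in operator rather than a comparison such as seen["b"] == "".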

Examples:
Create an associative array whose first element has index "mon" and value "Monday" and whose second element has index "tue" and value "Tuesday", then display one chosen element:
<code class="language-bash">[root@localhost ~]# awk 'BEGIN{weekdays["mon"]="Monday";weekdays["tue"]="Tuesday";print weekdays["mon"]}'
Monday</code>
When there is no input file to process, the whole program can be placed in a BEGIN block.

Going further, traverse and display all elements of the array:
<code class="language-bash">[root@localhost ~]# awk 'BEGIN{weekdays["mon"]="Monday";weekdays["tue"]="Tuesday";for(i in weekdays) {print weekdays[i]}}'
Tuesday
Monday</code>
When the special form of for is used to traverse an array, as here, awk assigns the index of each element of weekdays to the variable i, not the element's value. The traversal order is unspecified and may differ from the order in which the elements were created.

Test whether a given element exists in an array:
<code class="language-bash">[root@localhost ~]# awk 'BEGIN{testarray["a"]="hello"; testarray["b"]="hi"; if("a" in testarray) {print "exist."} else print "not exist." }'
exist.
[root@localhost ~]#
[root@localhost ~]# awk 'BEGIN{testarray["a"]="hello"; testarray["b"]="hi"; if("c" in testarray) {print "exist."} else print "not exist." }'
not exist.</code>

As noted above, referencing an array element that does not yet exist makes awk create it, and the new element behaves as the empty string, or as 0 when it is used directly in arithmetic. In production this gives us a general and very practical idiom: to count how many times each of several strings occurs in a text, use the strings themselves as array indices and increment the element indexed by a string every time that string appears.
That sounds abstract, so let us see how to put it to use.

Use netstat to view the current network connections:
<code class="language-bash">[root@localhost ~]# netstat -tan
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp        0      0 10.10.10.140:22         10.10.10.1:52641        ESTABLISHED
tcp        0     52 10.10.10.140:22         10.10.10.1:51926        ESTABLISHED
tcp        0      0 10.10.10.140:51816      10.10.10.140:80         TIME_WAIT
tcp        0      0 10.10.10.140:51818      10.10.10.140:80         TIME_WAIT
tcp        0      0 10.10.10.140:22         10.10.10.1:52640        ESTABLISHED
tcp        0      0 10.10.10.140:51814      10.10.10.140:80         TIME_WAIT
tcp6       0      0 :::80                   :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 ::1:25                  :::*                    LISTEN
tcp6       0      0 10.10.10.140:80         10.10.10.140:49844      TIME_WAIT
tcp6       0      0 10.10.10.140:80         10.10.10.140:50304      TIME_WAIT
tcp6       0      0 10.10.10.140:80         10.10.10.140:49828      TIME_WAIT</code>

Count how many tcp connections in the netstat output are in each state:
<code class="language-bash">[root@localhost ~]# netstat -tan | awk '/^tcp\>/{state[$NF]++}END{for(i in state) print i,state[i]}'
LISTEN 2
ESTABLISHED 3
TIME_WAIT 3</code>
Explanation of the command:
The pattern is anchored so that it matches only lines beginning with the whole word tcp (the \> word boundary excludes tcp6 lines). For each matching line, the last field (the state) is used as an index into the array state. Because state[$NF] does not exist the first time a state is seen, awk creates it automatically; since the element is used in an increment, its initial value counts as 0, so every occurrence of a state adds 1 to the element indexed by that state.
Finally, a single for loop over the indices of state prints each index (that is, each state) together with its element value (the number of times that state occurred).
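The counting idiom can be reproduced without a live netstat. The lines below are invented data in the `netstat -tan` layout, and `state_counts` is an illustrative name. As a portability choice, the sketch tests `$1 == "tcp"` instead of using the gawk-specific `\>` word boundary, and pipes the result through sort because the order of a for-in traversal is unspecified.

```bash
state_counts=$(printf '%s\n' \
    'tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN' \
    'tcp 0 0 10.0.0.5:22 10.0.0.1:50000 ESTABLISHED' \
    'tcp 0 0 10.0.0.5:22 10.0.0.1:50001 ESTABLISHED' \
    'tcp6 0 0 :::80 :::* LISTEN' |
    awk '$1 == "tcp" { state[$NF]++ }     # first use creates the element as 0
         END { for (s in state) print s, state[s] }' |
    LC_ALL=C sort)
echo "$state_counts"
```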

In production, ss is usually combined with awk for this kind of count instead, because ss performs better than netstat.

Count how many times each IP address in the /var/log/httpd/access_log log file fetched resources from the local web server:
<code class="language-bash">[root@localhost ~]# awk '{ip[$1]++}END{for(i in ip) print i,ip[i]}' /var/log/httpd/access_log
10.10.10.138 1014
10.10.10.1 221
10.10.10.139 4024
10.10.10.140 4000</code>

View the contents of /etc/fstab:
<code class="language-bash">[root@localhost ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Tue Oct 4 09:06:12 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=60eb0d1c-9834-4348-9a79-2f91983a8ede /         ext4    defaults        1 1
UUID=3995456f-a3e9-4b69-a0de-0fd48068da39 /boot     ext4    defaults        1 2
UUID=01d24b59-eee4-472e-9985-ef44fd5e059c swap      swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/sr0                /media/cd               iso9660 defaults        1 2</code>
Count how many times each filesystem type occurs among the /etc/fstab entries that begin with UUID:
<code class="language-bash">[root@localhost ~]# awk '/^UUID/{fs[$3]++}END{for(i in fs) print i,fs[i]}' /etc/fstab
swap 1
ext4 2</code>

Count how many times each word occurs in a given file and list the five most frequent words with their counts. Words are split on whitespace only, so attached punctuation counts as part of a word:
<code class="language-bash">[root@localhost ~]# cat test.txt    # an English essay
I am too busy now. The life is different in the past and now. Let me tell you about my life. In the past,I didn't study or do some things. I always played with my parents. My dad often took me to go to the zoo. That was really interesting! But now as a student, I have to stay in school all day. At home, I have too much homework to do.So I have to do homework. Time is flies. I miss past. I hope it back soon.
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# awk '{for(i=1;i<=NF;i++) {count[$i]++}}END{for(i in count) {print i,count[i]}}' test.txt | sort -r -n -k2 | head -5
I 7
to 5
the 3
have 3
too 2</code>

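If only the single most frequent word is wanted, the maximum can also be found inside awk rather than with sort and head. This is a sketch on a made-up sentence; `top_word`, `max`, and `best` are illustrative names, and ties (absent in this sample) would be resolved by whatever order the for-in traversal happens to use.

```bash
top_word=$(printf 'to be or not to be that is the question to\n' |
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END {
             for (w in count)
                 if (count[w] > max) { max = count[w]; best = w }   # max starts out as 0
             print best, max
         }')
echo "$top_word"
```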
(12) Functions
awk has two kinds of functions: built-in functions and user-defined functions. This section covers the built-in ones.

Numeric:
rand(): returns a random number between 0 and 1.

Example:
<code class="language-bash">[root@localhost ~]# awk 'BEGIN{print rand()}'
0.237788</code>
Note: rand() starts from the same seed on every run, so this command prints the same value each time it is executed. Call srand() first to seed the generator if a different sequence is needed on each run.
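A small sketch of how seeding affects rand(). The seed value 42 and the variable names are arbitrary; srand() with no argument seeds from the time of day.

```bash
# Re-seeding with the same value replays the same sequence.
same_seed=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()
    print (a == b)
}')
# Seeding from the clock: the value varies between runs but stays within [0, 1).
in_range=$(awk 'BEGIN { srand(); r = rand(); print (r >= 0 && r < 1) }')
echo "same seed repeats: $same_seed, value in [0,1): $in_range"
```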
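
String:
length([s]): returns the length of the string s ($0 if s is omitted);
sub(r,s[,t]): searches the string t ($0 if t is omitted) for text matching the pattern r and replaces the first match with s;
gsub(r,s[,t]): searches the string t ($0 if t is omitted) for text matching the pattern r and replaces every match with s;
split(s,a[,r]): splits the string s using r as the separator and stores the resulting pieces in the array a.

Note: the indices of the array a start at 1; the pieces are stored at indices 1, 2, 3, ... in order.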
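The four functions can be compared side by side on one sample string. The date string and the variable names below are made up for illustration; copies of `s` are modified because sub() and gsub() change their target in place.

```bash
string_demo=$(awk 'BEGIN {
    s = "2016-10-04"
    print length(s)                             # number of characters in s
    t = s; sub(/-/, "/", t); print t            # only the first "-" is replaced
    u = s; n = gsub(/-/, "/", u); print u, n    # every "-" is replaced; n is the count
    k = split(s, part, "-")                     # pieces land in part[1], part[2], part[3]
    print k, part[1], part[3]
}')
echo "$string_demo"
```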

Example:
Count all client IP addresses and how many times each occurs:
<code class="language-bash">[root@localhost ~]# netstat -tan
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp        0     52 10.10.10.140:22         10.10.10.1:51134        ESTABLISHED
tcp        0      0 10.10.10.140:22         10.10.10.1:51531        TIME_WAIT
tcp6       0      0 :::80                   :::*                    LISTEN
tcp6       0      0 :::22                   :::*                    LISTEN
tcp6       0      0 ::1:25                  :::*                    LISTEN
[root@localhost ~]#
[root@localhost ~]# netstat -tan | awk '/^tcp\>/{split($5,ip,":"); count[ip[1]]++}END{for(i in count) {print i,count[i]}}'
10.10.10.1 1
0.0.0.0 2</code>
Explanation of the command:
awk first matches the lines that begin with tcp. For each line matched by the pattern (/^tcp\>/), split() cuts the fifth field at ":" and stores the pieces, in order, in the array ip. Because the array filled by split() is indexed from 1, the IP address to the left of ":" ends up in ip[1] and the port number to its right in ip[2]. What we want to count is the IP address, so, following the same approach as the previous examples, the value of ip[1] is used as the index into the array count.
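Splitting on ":" works for IPv4 addresses but not for IPv6 addresses, which contain colons themselves. One possible alternative, shown here as a sketch on invented sample lines, is to strip only the final ":port" with sub(). The names `client_counts` and `addr` are illustrative, and the pattern `$1 ~ /^tcp/` deliberately includes tcp6 lines, unlike the command above.

```bash
client_counts=$(printf '%s\n' \
    'tcp 0 0 10.0.0.5:22 10.0.0.1:51134 ESTABLISHED' \
    'tcp 0 0 10.0.0.5:22 10.0.0.1:51531 TIME_WAIT' \
    'tcp 0 0 10.0.0.5:22 10.0.0.2:40000 ESTABLISHED' \
    'tcp6 0 0 ::1:25 ::1:40100 ESTABLISHED' |
    awk '$1 ~ /^tcp/ {
             addr = $5
             sub(/:[^:]*$/, "", addr)    # drop everything from the last ":" onward
             count[addr]++
         }
         END { for (a in count) print a, count[a] }' |
    LC_ALL=C sort)
echo "$client_counts"
```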
Source: http://blog.51cto.com/xuweitao/1905269