2016-05-12 34 views
0

How can I use awk to collect the IP and user-agent information from an nginx access log and count the unique entries per IP address?

27.151.49.215 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1977 "http://localhost/" "Mozilla/5.0 (iPhone 6p; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/6.0 MQQBrowser/6.6.1 Mobile/12B411 Safari/8536.25" 
49.73.31.190 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1813 "http://localhost/" "Mozilla/5.0 (iPhone 5SGLOBAL; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/6.0 MQQBrowser/6.6 Mobile/13B143 Safari/8536.25" 
114.80.188.61 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 165 "-" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36" 
223.64.63.228 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 2068 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13E238 QQ/6.2.0.427 Pixel/1080 NetType/WIFI Mem/48" 
101.251.3.75 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1975 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/12F70 UCBrowser/10.9.14.779 Mobile" 
101.251.3.75 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1975 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/12F70 UCBrowser/10.9.14.779 Mobile" 
101.251.3.75 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1975 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/12F70 UCBrowser/10.9.14.779 Mobile" 
101.251.3.75 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1975 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/12F70 UCBrowser/10.9.14.779 Mobile" 
221.204.176.30 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 165 "http://localhost/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36" 
222.77.208.177 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 2621 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Mobile/11D169 rabbit%2F1.0 baiduboxapp/0_0.0.8.6_enohpi_069_046/1.7_1C2%253enohPi/1099a/82840F498905C55D0EB7EBB0CF5DDC44BAF811E8FFCCOABNILE/1" 
221.3.134.130 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 1962 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13E238 Safari/601.1" 
123.157.71.167 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 2069 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/13E238 UCBrowser/10.9.13.779 Mobile" 
39.187.201.169 - - [10/May/2016:23:59:59 +0800] "GET /m/index.php HTTP/1.1" 200 2621 "http://localhost/" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_2_1 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/13D15 UCBrowser/10.9.14.779 Mobile" 

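I want to collect every IP together with its user-agent information into one file and count how many times each identical IP address appears. How can I do this with awk?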

Desired output, for example:

4 101.251.3.75 "Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/12F70 UCBrowser/10.9.14.779 Mobile" 
1 27.151.49.215 "Mozilla/5.0 (iPhone 6p; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/6.0 MQQBrowser/6.6.1 Mobile/12B411 Safari/8536.25" 
1 49.73.31.190 "Mozilla/5.0 (iPhone 5SGLOBAL; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/6.0 MQQBrowser/6.6 Mobile/13B143 Safari/8536.25" 
1 114.80.188.61 "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36" 
1 223.64.63.228 "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Mobile/13E238 QQ/6.2.0.427 Pixel/1080 NetType/WIFI Mem/48" 
1 221.204.176.30 "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.104 Safari/537.36" 
1 222.77.208.177 "Mozilla/5.0 (iPhone; CPU iPhone OS 7_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Mobile/11D169 rabbit%2F1.0 baiduboxapp/0_0.0.8.6_enohpi_069_046/1.7_1C2%253enohPi/1099a/82840F498905C55D0EB7EBB0CF5DDC44BAF811E8FFCCOABNILE/1" 
1 221.3.134.130 "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13E238 Safari/601.1" 
1 123.157.71.167 "Mozilla/5.0 (iPhone; CPU iPhone OS 9_3_1 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/13E238 UCBrowser/10.9.13.779 Mobile" 
1 39.187.201.169 "Mozilla/5.0 (iPhone; CPU iPhone OS 9_2_1 like Mac OS X; zh-CN) AppleWebKit/537.51.1 (KHTML, like Gecko) Mobile/13D15 UCBrowser/10.9.14.779 Mobile" 

Please provide sample output... –

Create a log format in the nginx configuration file that records just that information, then run uniq on the first field of the new file. – 123

@123 But how do I regenerate my old log files in the new format? – Benc

Answers

0

With sed, sort and uniq:

sed 's/\([^ ]*\).* \("[^"]*"\)/\1 \2/' file | sort | uniq -c 
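
If the busiest IPs should come first, as in the desired output, a numeric reverse sort can be appended to the same pipeline. This is only a sketch under assumptions: the log is in the combined format shown in the question (user agent as the last quoted field), and the function name `count_ip_ua` is made up for illustration.

```shell
# count_ip_ua FILE
# Print "count IP "user agent"" lines, most frequent first.
# Assumption: the user agent is the last double-quoted field on each line.
count_ip_ua() {
    # \1 = first space-delimited token (the IP)
    # \2 = last quoted string on the line (the user agent)
    # The trailing .*$ also swallows anything after the closing quote.
    sed 's/^\([^ ]*\).* \("[^"]*"\).*$/\1 \2/' "$1" |
        sort |
        uniq -c |
        sort -rn
}
```

It would be called as `count_ip_ua file`. Note that `uniq -c` pads the count with leading spaces, so the lines will be right-aligned rather than flush left.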

This counts the identical IP addresses and solves the problem perfectly, thank you – Benc

@Benc: Have a look at my answer too! I have now added an option for sorting as well. – Inian

@Benc I updated the IP capture (it produces the same output). – SLePort

0
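awk -F'"' '{
    # With a double quote as the separator, $1 is everything before the request.
    sub(/ .*$/, "", $1)               # keep only the IP (text before the first space)
    a[$1 " \"" $(NF-1) "\""]++        # count each IP + user-agent pair
}
END {
    for (i in a) print a[i], i        # iteration order is unspecified
}' logfile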

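Because the field separator is a double quote, $1 starts with the IP (the sub() call trims it down to just the IP) and $(NF-1) holds the user-agent string. These two values together form the index of array a, and the stored value is incremented each time the same index occurs again. The END block then prints every count followed by its index.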
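
Since `for (i in a)` visits the indices in no particular order, the result can be piped through sort if the highest counts should appear first. A minimal sketch of the same counting idea, assuming the combined log format from the question; the function name `count_by_ip_and_agent` and the array names `head` and `count` are made up for illustration.

```shell
# count_by_ip_and_agent FILE
# Tally identical (IP, user agent) pairs in an awk associative array,
# then sort the tallies in descending numeric order.
count_by_ip_and_agent() {
    awk -F'"' '
    {
        split($1, head, " ")                 # head[1] is the client IP
        key = head[1] " \"" $(NF-1) "\""     # IP plus the last quoted field
        count[key]++                         # one more hit for this pair
    }
    END {
        for (key in count) print count[key], key
    }' "$1" | sort -rn
}
```

Using `split()` instead of modifying `$1` leaves the input record untouched, which can be convenient if other fields are needed later.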

0

I am a beginner with awk, but I think this prints what you need.

awk '{ printf "%s ", $1; s = ""; for (i = 12; i <= NF; i++) s = s $i " "; print s }' logfile | sort | uniq -c 

# Print the first column (the IP) followed by a space 
# Print columns 12 through NF, which make up the user-agent string 
# 'sort' the list, then use 'uniq -c' to count each distinct entry 

This works too! Thanks – Benc