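How do I split an existing Apache log file into separate files by month?
I've searched the web but couldn't find anything. Yes, I know about logrotate, cronolog, and so on, but nothing I found helps me split a file that already exists.
Is there an awk script or something similar?
Here is a snippet of the data: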
124.115.5.11 - - [30/May/2011:23:21:37 -0500] "GET / HTTP/1.0" 200 206492 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322;TencentTraveler)"
58.61.164.39 - - [31/May/2011:00:36:35 -0500] "GET / HTTP/1.0" 200 206492 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322;TencentTraveler)"
114.80.93.55 - - [31/May/2011:01:42:17 -0500] "GET / HTTP/1.0" 200 206492 "-" "Sosospider+(+http://help.soso.com/webspider.htm)"
114.80.93.73 - - [31/May/2011:02:03:44 -0500] "GET / HTTP/1.0" 200 206492 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322;TencentTraveler)"
123.125.71.98 - - [31/May/2011:12:33:30 -0500] "GET / HTTP/1.1" 103 24576 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
220.181.108.187 - - [31/May/2011:12:33:55 -0500] "GET / HTTP/1.1" 103 24576 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.71.117 - - [31/May/2011:13:27:56 -0500] "GET / HTTP/1.1" 103 24576 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.71.78 - - [31/May/2011:16:45:48 -0500] "GET /node/54 HTTP/1.1" 200 3219 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
124.115.1.8 - - [31/May/2011:19:59:58 -0500] "GET / HTTP/1.1" 200 206492 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
123.125.71.69 - - [31/May/2011:22:05:46 -0500] "GET / HTTP/1.1" 200 206492 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
Here is my solution, greatly inspired by Steve's answer below.
One way, using awk:
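awk 'BEGIN {
    # Map month abbreviations to their numbers: m["Jan"] = 1 ... m["Dec"] = 12
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", months, " ")
    for (a = 1; a <= 12; a++)
        m[months[a]] = a
}
{
    # $4 looks like "[30/May/2011:23:21:37"; split it on ":" and "/"
    split($4, array, "[:/]")
    year = array[3]
    month = sprintf("%02d", m[array[2]])
    # One output file per year and month, named after the input file
    print > (FILENAME "-" year "_" month ".txt")
}' incendiary.ws-2009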
This will output files such as:
incendiary.ws-2010-2010_04.txt
incendiary.ws-2010-2010_05.txt
incendiary.ws-2010-2010_06.txt
incendiary.ws-2010-2010_07.txt
On a 150 MB log file, chepner's accepted answer took 70 seconds on a 3.4 GHz 8-core Xeon E31270, while this method took 5 seconds.
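For a log that covers many years, the number of simultaneously open output files could become a concern in some awk implementations. Since access logs are normally written in chronological order, one possible variant closes the previous output file whenever the month changes. The following is only a sketch under stated assumptions: the file name demo_access.log is made up, the June line is hypothetical, and the user-agent fields of the sample lines are shortened to "-".

```shell
#!/bin/sh
# Sketch only: demo_access.log and its third (June) line are hypothetical.
set -eu

# Remove output from an earlier run, because the script below appends.
rm -f demo_access.log-*.txt

cat > demo_access.log <<'EOF'
124.115.5.11 - - [30/May/2011:23:21:37 -0500] "GET / HTTP/1.0" 200 206492 "-" "-"
58.61.164.39 - - [31/May/2011:00:36:35 -0500] "GET / HTTP/1.0" 200 206492 "-" "-"
58.61.164.39 - - [01/Jun/2011:00:00:01 -0500] "GET / HTTP/1.0" 200 206492 "-" "-"
EOF

awk 'BEGIN {
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", months, " ")
    for (a = 1; a <= 12; a++)
        m[months[a]] = a
}
{
    split($4, parts, "[:/]")        # parts[2] = "May", parts[3] = "2011"
    out = FILENAME "-" parts[3] "_" sprintf("%02d", m[parts[2]]) ".txt"
    if (out != current) {           # month changed: close the previous file
        if (current != "")
            close(current)
        current = out
    }
    print >> out                    # append, so a reopened file is not truncated
}' demo_access.log

# Show how many lines ended up in each monthly file.
wc -l demo_access.log-2011_05.txt demo_access.log-2011_06.txt
```

The append redirection matters here: if a few lines near a month boundary are out of order, a file that was already closed gets reopened, and a plain > would truncate it at that point. The price is that old output files have to be deleted before a rerun.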
Original inspiration: https://stackoverflow.com/a/11714105/430062
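People who know awk (or something :) won't necessarily know, or have access to, the data file you're trying to manipulate. It would help if you could provide some input/output pairs showing what you have and what you want to end up with. – Levon 2012-07-29 23:57:26
I've implemented your excellent suggestion. – 2012-07-30 00:08:01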