2012-05-23 106 views
1

Parsing lines of a text file

I have a log file containing lines like these:

2012-05-23T20:52:11+00:00 heroku[router]: GET myapp.com/practitioner_activities/10471/edit dyno=web.2 queue=0 wait=0ms service=866ms status=200 bytes=48799 
2012-05-23T20:52:46+00:00 heroku[router]: GET myapp.com/users/sign_out dyno=web.1 queue=0 wait=0ms service=20ms status=302 bytes=88 
2012-05-23T20:52:46+00:00 heroku[router]: GET myapp.com/ dyno=web.13 queue=0 wait=0ms service=18ms status=200 bytes=4680 
2012-05-23T20:53:04+00:00 heroku[router]: POST myapp.com/p/ENaCXExu7qNEqzwYYyPs dyno=web.5 queue=0 wait=0ms service=207ms status=302 bytes=119 
2012-05-23T20:53:04+00:00 heroku[router]: GET myapp.com/practitioner_activities/welcome dyno=web.3 queue=0 wait=0ms service=57ms status=200 bytes=5061 
2012-05-23T20:53:04+00:00 heroku[router]: GET myapp.com/assets/application-print-715276cc0b76d0d82db3ab5866f22a23.css dyno=web.14 queue=0 wait=0ms service=9ms status=200 bytes=76386 

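I want to parse it and dump the result to a file that I can open in Excel for analysis. I need the hour, the minute, the verb (GET or POST), the URL, and the 'service=' time.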

For example, for the first line above:

2012-05-23T20:52:11+00:00 heroku[router]: GET myapp.com/practitioner_activities/10471/edit dyno=web.2 queue=0 wait=0ms service=866ms status=200 bytes=48799 

My desired output looks like this:

"20", "52", "GET", "myapp.com/practitioner_activities/10471/edit", "866" 

How would I do this with awk or a short Ruby script?

Answers

3

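With awk, you could try something like this (substr positions are 1-based; a start of 0 is handled differently across awk implementations, so 1 is used here):

awk '{ OFS="\", \""; split ($8, array, "="); printf "\"" substr ($1 , length ($1) - 13, 2) OFS substr ($1 , length ($1) - 10, 2) OFS $3 OFS $4 OFS substr (array[2], 1, length (array[2]) -2) "\"\n" }' file.txt

Result: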

"20", "52", "GET", "myapp.com/practitioner_activities/10471/edit", "866" 
"20", "52", "GET", "myapp.com/users/sign_out", "20" 
"20", "52", "GET", "myapp.com/", "18" 
"20", "53", "POST", "myapp.com/p/ENaCXExu7qNEqzwYYyPs", "207" 
"20", "53", "GET", "myapp.com/practitioner_activities/welcome", "57" 
"20", "53", "GET", "myapp.com/assets/application-print-715276cc0b76d0d82db3ab5866f22a23.css", "9" 

HTH

Edit:

awk '{ OFS="\", \""; ORS="\"\n"; split ($8, array, "="); print "\"" substr ($1 , 12, 2), substr ($1 , 15, 2), $3, $4, array[2] + 0 }' file.txt 

Thanks, Dennis! The code is much, much nicer now :-)


That helped a lot, thanks. –


Glad I could help :-) – Steve


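You could use 'ORS="\"\n"' and 'print' with commas (instead of writing 'OFS' out explicitly, although you would still set 'OFS') rather than 'printf' (since you are not using a format string). Or you could use 'printf "\"%d" OFS "%d" OFS "%s" OFS "%s" OFS "%d\"\n", substr($1, length($1) - 13, 2), substr($1, length($1) - 10, 2), $3, $4, substr(array[2], 0, length(array[2]) - 2)' to separate presentation from data. In that last case you could use a variable name shorter than OFS. Using 'length' is not necessary, since the timestamp has a fixed length that you can rely on... –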

1

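Ruby answer

ruby -ane '
    # -a splits each line on whitespace into $F; grab "HH:MM" after the T
    hr, min = $F[0][/(?<=T)\d\d:\d\d/].split(/:/)
    # 8th field is "service=866ms"; keep the value and strip the "ms" suffix
    svc = $F[7].split(/=/)[-1]; svc[/ms/] = ""
    # %s for hour and minute: %d raises on "08" and "09" and would drop leading zeros
    puts %q{"%s", "%s", "%s", "%s", "%d"} % [hr, min, $F[2], $F[3], svc]
' logfile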
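
Since the goal is a file that Excel can open, a script version that leaves the quoting to Ruby's csv library might be easier to maintain than hand-built format strings. The following is only a sketch under assumptions: the regular expression is inferred from the sample lines in the question, the csv library is assumed to be available, and `ROUTER_LINE`, `parse_router_line` and `router_log_to_csv` are hypothetical names made up for illustration. Lines that do not match the pattern are skipped.

```ruby
require "csv"

# Layout assumed from the sample log lines in the question.
ROUTER_LINE = /
  \A\d{4}-\d\d-\d\dT(?<hour>\d\d):(?<min>\d\d)\S*\s+   # timestamp
  heroku\[router\]:\s+
  (?<verb>[A-Z]+)\s+(?<url>\S+)\s                       # verb and URL
  .*?\bservice=(?<service>\d+)ms                        # service time in ms
/x

# Hypothetical helper: returns [hour, minute, verb, url, service] as strings,
# or nil when the line does not look like a router entry.
def parse_router_line(line)
  m = ROUTER_LINE.match(line)
  m && [m[:hour], m[:min], m[:verb], m[:url], m[:service]]
end

# Hypothetical helper: reads log lines from `input` and appends one fully
# quoted CSV row per matching line to `output` (any object that supports <<).
def router_log_to_csv(input, output)
  input.each_line do |line|
    row = parse_router_line(line)
    output << CSV.generate_line(row, force_quotes: true) if row
  end
end
```

One possible way to call it from a script is `File.open("router.csv", "w") { |out| router_log_to_csv(ARGF, out) }`, where `router.csv` is a placeholder output name and the log file is passed as a command-line argument. The hour and minute are kept as strings, so leading zeros such as "08" survive.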