2010-05-24 256 views

Parsing a file in Ruby

I have a data file like this:

01 JUL something 
     something 
     something    445 
     something else 
01 JUL whatever 
     everwa3 
     lklkj     445 
     something else 
02 JUL ljkjlkj 
     ljkljlkj 
     lkjkjlk    500 
     lkjkj 
02 JUL ljlkjklj 
     lkjkjlkj 
     lkjkj     500 
     lkjlkj 

In the end, I want to find out how many occurrences there are of 01 JUL 445 and how many of 02 JUL 500.

In this case, that would be:

01 JUL 445 = 2 

02 JUL 500 = 2 

I am able to read in the lines and get the data... how would I go about counting the identical entries?

Answer

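counts = Hash.new(0)   # missing keys start at 0
date = ""
# `file` is assumed to be an already opened File object
file.readlines.each_with_index do |l, i|
  if i % 4 == 0        # first line of each 4-line record, e.g. "01 JUL something"
    date = l.split.first(2).join(" ")   # keep only "01 JUL"
  elsif i % 4 == 2     # third line of each record, e.g. "something    445"
    weird_num = l.split.last            # trailing number, whitespace stripped
    counts["#{date} #{weird_num}"] += 1
  end
end

puts counts # => {"01 JUL 445" => 2, "02 JUL 500" => 2}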
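
If the records are not always exactly four lines long, counting by line index breaks down. A possible alternative, sketched here under assumptions that go beyond the question, is to recognise the lines by their shape instead: a record is assumed to start with a two-digit day and a three-letter uppercase month, and the value to count is assumed to be the only indented line that ends in a standalone number. The helper name `count_occurrences` and the inline sample are hypothetical and only serve the illustration.

```ruby
# Hypothetical helper: count "date + trailing number" pairs without
# assuming that every record is exactly four lines long.
def count_occurrences(lines)
  counts = Hash.new(0)
  date = nil
  lines.each do |line|
    # Assumption: a record starts with a day and month such as "01 JUL".
    if (m = line.match(/\A(\d{2} [A-Z]{3})\b/))
      date = m[1]
    # Assumption: the value line ends in a whitespace-separated number.
    elsif date && (m = line.match(/\s(\d+)\s*\z/))
      counts["#{date} #{m[1]}"] += 1
    end
  end
  counts
end

# Sample data copied from the question.
sample = <<~DATA
  01 JUL something
       something
       something    445
       something else
  01 JUL whatever
       everwa3
       lklkj     445
       something else
  02 JUL ljkjlkj
       ljkljlkj
       lkjkjlk    500
       lkjkj
  02 JUL ljlkjklj
       lkjkjlkj
       lkjkj     500
       lkjlkj
DATA

result = count_occurrences(sample.lines)
result.each { |key, n| puts "#{key} = #{n}" }
```

With a real file, something like `count_occurrences(File.readlines(path))` should work the same way, where `path` stands for wherever the data lives. Note that a line such as `everwa3` is not counted, because its digit is not preceded by whitespace.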

Thanks. Although, now I'm running into a problem with UTF-8 characters. See http://stackoverflow.com/questions/2897398/broken-utf-8-string-ruby – josh 2010-05-24 13:51:56