If it weren't for the multi-word pair {"over the"=>"under the"}, then instead of rescanning the string once per replacement pair, as most of the solutions here do, I would do the following.
First, turn my array into a plain hash:
h=Hash.new
@dictionary.each {|ft| h[ft[:from]]=ft[:to]}
=> {"quick"=>"lazy", "over the"=>"under the", "jumps"=>"flies"}
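As an aside (not part of the original answer), the same from-to hash can be built in a single expression; a minimal sketch, assuming Ruby 2.1+ for `Array#to_h`:

```ruby
# The array of {:from=>..., :to=>...} pairs from the answer above.
dictionary = [{:to=>"lazy", :from=>"quick"}, {:to=>"flies", :from=>"jumps"},
              {:from=>"over the", :to=>"under the"}]

# Map each pair to a [from, to] tuple, then collapse the tuples into a hash.
h = dictionary.map { |ft| [ft[:from], ft[:to]] }.to_h
# => {"quick"=>"lazy", "jumps"=>"flies", "over the"=>"under the"}
```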
Then go through the text word by word (note that, as warned above, the multi-word "over the" pair is left untouched):
@text.split(/ /).map{|w| h[w] || w}.join(" ")
=> "lazy brown fox flies over the lazy dog"
Going word by word also doesn't suffer from the problem of chained substitutions:
h["brown"]="quick"
=> {"brown"=>"quick", "quick"=>"lazy", "over the"=>"under the", "jumps"=>"flies"}
@text.split(/ /).map{|w| h[w] || w}.join(" ")
=> "lazy quick fox flies over the lazy dog"
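For comparison, here is a sketch of a single-pass alternative not covered above: `String#gsub` accepts a hash as its replacement argument (Ruby 1.9+), and `Regexp.union` builds one pattern from all the keys. Because everything is replaced in one pass, it also avoids chained substitutions, and unlike the word-by-word approach it copes with the multi-word "over the" pair:

```ruby
h = {"quick"=>"lazy", "jumps"=>"flies", "over the"=>"under the"}
text = "quick brown fox jumps over the lazy dog"

# One pass over the string: each match is looked up in the hash,
# so the output of one replacement is never replaced again.
result = text.gsub(Regexp.union(h.keys), h)
# => "lazy brown fox flies under the lazy dog"
```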
I ran some benchmarks, and I have to say the solution above held up better against gsub! than I had previously thought, once a good number of replacement pairs were added.
require 'benchmark'
@dictionary = [{:to=>"lazy", :from=>"quick"}, {:to=>"flies", :from=>"jumps"}, {:from => "over the", :to => "under the"}]
@text = "quick brown fox jumps over the lazy dog" * 10000
Benchmark.bm do |benchmark|
benchmark.report do
h=Hash.new
@dictionary.each {|ft| h[ft[:from]]=ft[:to]}
t2=@text.split(/ /).map{|w| h[w] || w}.join(' ')
end
benchmark.report do
@dictionary.each { |pair| @text.gsub!(/#{pair[:from]}/, pair[:to]) }
end
@dictionary+=[{:to=>"black", :from=>"brown"}, {:to=>"ox", :from=>"fox"}, {:to=>"hazy", :from=>"lazy"}, {:to=>"frog", :from=>"dog"}]
@[email protected]*15
benchmark.report do
h=Hash.new
@dictionary.each {|ft| h[ft[:from]]=ft[:to]}
t2=@text.split(/ /).map{|w| h[w] || w}.join(' ')
end
benchmark.report do
@dictionary.each { |pair| @text.gsub!(/#{pair[:from]}/, pair[:to]) }
end
end
The results:
user system total real
0.890000 0.060000 0.950000 ( 0.962106)
0.200000 0.020000 0.220000 ( 0.217235)
0.980000 0.060000 1.040000 ( 1.042783)
0.980000 0.030000 1.010000 ( 1.011380)
The gsub! solution is 4.5 times faster with only three replacement pairs. With 105 replacement pairs, the split solution runs just about as fast as it did with three (in fact only about 10% slower), while gsub! becomes five times slower.
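Tangentially, the four anonymous report rows in the output above are hard to tell apart; Benchmark.bm also accepts per-report labels. A small sketch (with a shorter text and made-up labels, not the exact benchmark from the answer):

```ruby
require 'benchmark'

h = {"quick"=>"lazy", "jumps"=>"flies"}
text = "quick brown fox jumps over the lazy dog" * 1000

Benchmark.bm(7) do |x|   # 7 = column width reserved for the labels
  x.report("split:") { text.split(/ /).map { |w| h[w] || w }.join(' ') }
  x.report("gsub!:") do
    t = text.dup         # work on a copy so the runs stay independent
    h.each { |from, to| t.gsub!(/#{from}/, to) }
  end
end
```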
Would it make more sense to store the dictionary as a hash in the first place? `@dictionary = {'lazy'=>'quick', 'flies'=>'jumps', 'under the'=>'over the'}` – kejadlen 2010-02-09 18:32:20