2012-04-01 34 views
2

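Ruby CSV.parse is very picky when it encounters quotes

I find CSV parsing in Ruby 1.9.3 to be very fragile, which makes me wonder whether I am doing something wrong.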

If I run the following in IRB, I get an error:

1.9.3-p125 :011 > require 'csv' 
=> true 
1.9.3-p125 :012 > a = 'one,two,three, "four, five",six' 
=> "one,two,three, \"four, five\",six" 
1.9.3-p125 :013 > arr = CSV.parse(a) 
CSV::MalformedCSVError: Illegal quoting in line 1. 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `each' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `block in shift' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `loop' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `shift' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1791:in `each' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `to_a' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `read' 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1379:in `parse' 
    from (irb):13 
    from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>' 

I found that the problem is the extra space before the "four, five" value. If I remove that space, it works.

1.9.3-p125 :010 > a = 'one,two,three,"four, five",six' 
=> "one,two,three,\"four, five\",six" 
1.9.3-p125 :011 > arr = CSV.parse(a) 
=> [["one", "two", "three", "four, five", "six"]] 

Spaces in front of the other values do not cause a problem. The following parses just fine:

one, two, three,"four, five", six 

Is there a parsing option I am missing that would make quoted values less fragile?

+1

Possible duplicate: http://stackoverflow.com/questions/1807942/overcoming-a-basic-problem-with-csv-parsing-using-the-fastercsv-gem – WarHog 2012-04-01 16:23:59

+0

I'll buy that answer. Thanks @WarHog! – 2012-04-01 16:47:04

Answers

3

This is correct behavior. It is not fragile.

The comma after "three" ends that field, and the next field starts immediately with the space.

You cannot validly have a quote in the middle of a field (without escaping it).
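To make the rule concrete, here is a minimal sketch using only `CSV.parse` and `CSV.parse_line` from the standard library. It shows the quote being accepted when it is the first character of a field, rejected when a space comes first, and accepted inside a quoted field when it is escaped by doubling. The helper `strip_space_before_quotes` is a hypothetical name introduced for this example, not part of the CSV library, and its regex is deliberately naive: it assumes a comma followed by whitespace and a double quote never occurs inside legitimate quoted data.

```ruby
require 'csv'

strict = 'one,two,three,"four, five",six'
spaced = 'one,two,three, "four, five",six'

# The quote is the first character of the field, so the field is quoted
# and may contain the delimiter.
strict_rows = CSV.parse(strict)

# The field starts with a space, so it is an unquoted field and the
# quote that follows is rejected.
spaced_error =
  begin
    CSV.parse(spaced)
    nil
  rescue CSV::MalformedCSVError => e
    e
  end

# Inside a quoted field, a literal quote is escaped by doubling it.
escaped_row = CSV.parse_line('a,"b ""c"" d",e')

# Hypothetical helper (an assumption for illustration only): drop spaces
# or tabs that sit between a comma and an opening quote before parsing.
def strip_space_before_quotes(line)
  line.gsub(/,[ \t]+"/, ',"')
end

cleaned_rows = CSV.parse(strip_space_before_quotes(spaced))
```

If you control the producer of the data, fixing the output there is the safer choice; the preprocessing step above is only a stopgap for input you cannot change.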

+0

According to the RFC, http://tools.ietf.org/html/rfc4180#page-2, a field may contain commas if it is enclosed in double quotes. – 2013-07-08 18:58:15
