2013-05-27 28 views
16

Ruby unable to parse a CSV file: CSV::MalformedCSVError (Illegal quoting in line 1.)

Ubuntu 12.04 LTS

Ruby: ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]

Rails 3.2.9

Following is the content of the CSV file I received:

"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total" 
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80" 

However, when I tried to parse the CSV file, I received the following error:

1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' } 
=> {:col_sep=>",", :quote_char=>"\""} 

1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row } 
CSV::MalformedCSVError: Illegal quoting in line 1. 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open' 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach' 
    from (irb):22 
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>' 

I then tried simplifying the data, i.e.

"name","age","email" 
"jignesh","30","[email protected]" 

But I still got the same error:

 1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row } 
    CSV::MalformedCSVError: Illegal quoting in line 1. 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open' 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach' 
     from (irb):23 
     from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>' 

I again tried simplifying the data, like this:

name,age,email 
jignesh,30,[email protected] 

and it works. See the output below:

1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row } 
    name 
    age 
    email 
    jignesh 
    30 
    [email protected] 
    => nil 

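But the CSV files I will be receiving contain data that is already quoted, so removing the quotes is not really the solution I am looking for. I cannot figure out what is causing the error CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.

I have verified that there are no leading or trailing spaces in the CSV by enabling "Show whitespace characters" and "Show line endings" in my text editor. I also verified the encoding using the following code: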

1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding 
    => #<Encoding:UTF-8> 

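Note: I also tried CSV.read, but got the same error with that method.

Can anybody please help me resolve the issue and understand where things are going wrong?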

=====================

I just found the following post: http://www.ruby-forum.com/topic/448070 and tried the following:

file_data = file.read
file_data.gsub!('"', "'")
arr_of_arrs = CSV.parse(file_data)

arr_of_arrs.each do |arr|
  Rails.logger.debug "=======#{arr}"
end

and got the following output:

=======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"] 
    =======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"] 

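This breaks the data apart incorrectly, because the default col_sep is the comma character and the date field contains a comma that is no longer protected by double quotes. So I tried using the quote_char option like this: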

arr_of_arrs = CSV.parse(file_data, :quote_char => "'") 

but it ended with the following error:

CSV::MalformedCSVError (Illegal quoting in line 1.): 

Thanks, Jignesh

+1

Using the sample data you provided, parsing works fine for me and I don't get any 'CSV::MalformedCSVError: Illegal quoting in line 1.' error –

+0

The output in the section I added contains the following: "\xEF\xBB\xBF'date/time'". Could that be causing the problem? I don't know what it represents. Thanks. –

+3

The Unicode character at the start of the file is a BOM (byte order mark). You can try 'sub!(/^\xEF\xBB\xBF/, '')' or 'CSV.foreach("test.csv", encoding: "bom|utf-8")' –
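A minimal sketch of the first suggestion, applied to data already in memory. The helper names `bom?` and `strip_bom` are made up for illustration, and the sketch assumes the text is a UTF-8 encoded string (not a binary one):

```ruby
require "csv"

UTF8_BOM = "\uFEFF" # encoded in UTF-8 as the bytes EF BB BF

# True when the string starts with a UTF-8 byte order mark.
def bom?(text)
  text.start_with?(UTF8_BOM)
end

# Returns a copy of the text without a leading BOM.
def strip_bom(text)
  text.sub(/\A\uFEFF/, "")
end

raw = "\uFEFF\"name\",\"age\"\n\"jignesh\",\"30\"\n"
rows = CSV.parse(strip_bom(raw))
p bom?(raw) # => true
p rows      # => [["name", "age"], ["jignesh", "30"]]
```

Because the BOM sits in front of the first opening quote, the parser no longer sees that quote at the start of a field, which is consistent with the "Illegal quoting in line 1" message reported in the question.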

Answers

-3

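Try this tip:

  1. Open your CSV file in a text editor
  2. Select the whole file and copy it
  3. Open a new text file
  4. Paste the CSV data into the new file and save it
  5. Import the new CSV file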
+0

The CSV file may be megabytes in size and cannot be opened like that – user1735921

20
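# Candidate quote characters, tried in order until one parses cleanly.
quote_chars = %w(" | ~ ^ & *)
begin
  @report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
  # Re-raise once every candidate has failed; otherwise retry with the next one.
  quote_chars.empty? ? raise : retry
end

It's not perfect, but it works most of the time.

N.B. CSV.parse takes the same parameters as CSV.read, so either a file or in-memory data can be used.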
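As a rough illustration of the in-memory variant, here is the same retry idea built on CSV.parse. The method name `parse_with_fallback_quotes` and the sample data are hypothetical, not part of the answer:

```ruby
require "csv"

# Tries each candidate quote character in turn and returns the first
# successful parse; re-raises the last error if every candidate fails.
def parse_with_fallback_quotes(data, candidates = ['"', "|", "~", "^", "&", "*"])
  remaining = candidates.dup
  begin
    CSV.parse(data, headers: :first_row, quote_char: remaining.shift)
  rescue CSV::MalformedCSVError
    remaining.empty? ? raise : retry
  end
end

# A stray double quote inside an unquoted field is the kind of input
# this fallback is meant for.
table = parse_with_fallback_quotes("name,size\nbolt,5\" long\n")
puts table[0]["size"]
```

Note the trade-off: once the parser falls back to another quote character, double quotes are treated as ordinary text, so properly quoted fields keep their quotes and any commas inside them split the field.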

10

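I just ran into a similar problem and found that CSV doesn't like spaces between the col_sep and the quote character. Once I removed those, everything was fine. So I had: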

12, "N", 12, "Pacific/Majuro" 

but using

.gsub(/,\s+"/, ',"')

to strip out the spaces resulted in

12,"N", 12,"Pacific/Majuro"

and everything worked.

+1

Note, if you want to remove the spaces on both sides of the quoted strings in comma-separated values: gsub(/,\s+"/, ',"').gsub(/"\s+,/, '",') – bjm88
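The clean-up described in this answer and comment could be sketched as follows. The helper name `tighten_quoted_fields` is an assumption made for the example:

```ruby
require "csv"

# Removes whitespace between a comma and an opening quote, and between
# a closing quote and a comma, so quoted fields touch the separator.
def tighten_quoted_fields(line)
  line.gsub(/,\s+"/, ',"').gsub(/"\s+,/, '",')
end

line = '12, "N", 12, "Pacific/Majuro"'
cleaned = tighten_quoted_fields(line)
puts cleaned              # prints 12,"N", 12,"Pacific/Majuro"
p CSV.parse_line(cleaned) # => ["12", "N", " 12", "Pacific/Majuro"]
```

The unquoted value keeps its leading space, which the parser accepts. The substitution is purely textual, so it would also alter a quoted field whose content happens to contain a comma, whitespace, and a quote in sequence.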

11

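Anand, thanks for the encoding suggestion. That solved my illegal quoting problem.

Note: if you want the iterator to skip the header row, add headers: :first_row, like so: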

CSV.foreach("test.csv", encoding: "bom|utf-8", headers: :first_row) 
+0

Thanks! 'encoding: "bom|utf-8"' is what solved my problem. –
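For a file on disk, the combination of both options might look like the sketch below. The method name `each_record` and the temporary sample file are illustrative assumptions:

```ruby
require "csv"
require "tempfile"

# Yields each data row as a Hash keyed by header name. The BOM is consumed
# by the "bom|utf-8" encoding and the header row by headers: :first_row.
def each_record(path)
  CSV.foreach(path, encoding: "bom|utf-8", headers: :first_row) do |row|
    yield row.to_h
  end
end

Tempfile.create(["sample", ".csv"]) do |file|
  # Write a small quoted CSV that starts with a UTF-8 BOM.
  file.write("\uFEFF\"name\",\"age\"\n\"jignesh\",\"30\"\n")
  file.flush
  each_record(file.path) { |record| p record }
end
```

With headers enabled, each row is a CSV::Row rather than a plain array, so values can be looked up by column name.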

1

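I had this problem with a trademark character that was triggering this error.

The trademark character was being converted to \" in UTF-8, so it was treated as an opening quotation mark and the error was thrown. So I did this: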

.gsub!("\"!", "")

Then I tried creating my CSV object, and it worked fine.