2011-08-23 43 views
4

What is a smart way, in Ruby, to delete rows from a CSV file when a specific value exists in a specific row? Delete lines in a file - Ruby

Here is an example of the file:

350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 

Ideally, I would like a new file created with just these rows, given the values listed after them:

350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 

300-2580 
300-3080 
300-2080 

I know I could do this with sort filename|uniq -d, but I want to learn Ruby (somewhat painfully).

Thanks in advance, M

Answers

10

You can use this to get the unique lines of the CSV file as an array:

File.readlines("file.csv").uniq 
=> ["350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080\n", "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580\n", "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080\n"] 

To write the result to a new file, open a file in write mode and write to it like this:

File.open("new_csv", "w+") { |file| file.puts File.readlines("csv").uniq } 

To compare values, you can use the split function on "," to access each column, like this:

rows = File.readlines("csv").map(&:chomp) # equivalent to File.readlines("csv").map { |line| line.chomp } 
mapped_columns = rows.map { |r| r.split(",").map(&:strip) } 
=> [["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "767 lbs.", "300-2080"], ["350 lbs.", "Outrigger Footprint", "61\" x 53\"", "Weight", "817 lbs.", "300-2580"], .....] 
mapped_columns[0][5] 
=> "300-2080" 
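
Building on the column split above, duplicates can also be judged by a single column rather than by the whole line. The following is a minimal sketch, assuming the part number is always the last field and that no field contains a comma. The `lines` array stands in for `File.readlines("csv")`, and the variable names are illustrative only.

```ruby
# Sample data standing in for File.readlines("csv"); names are illustrative.
lines = [
  "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080 \n",
  "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580 \n",
  "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580 \n",
  "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080 \n"
]

# Naive split into stripped columns (assumes no commas inside fields).
rows = lines.map { |line| line.chomp.split(",").map(&:strip) }

# Array#uniq with a block keeps the first row seen for each distinct last column.
unique_by_part = rows.uniq { |cols| cols.last }

part_numbers = unique_by_part.map(&:last)
puts part_numbers
```

Unlike a plain `uniq` on whole lines, this treats two rows as duplicates whenever their last column matches, even if other columns differ.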

If you need more functionality, you are better off installing the FasterCSV gem.

+2

You only need FasterCSV if you are stuck on 1.8; the CSV in 1.9 is FasterCSV (with some improvements). –

+0

@ mu..yes..u r right – rubyprince

+0

I am using FasterCSV; can I still use .uniq? – MarkL
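
On the question in the last comment: the standard `csv` library (which replaced FasterCSV from Ruby 1.9 on) returns each row as an array of fields, so `Array#uniq` should work on parsed rows as well. The following is a small sketch with made-up data. It deliberately leaves out the inch marks from the sample file, because bare `"` characters inside unquoted fields may be rejected by the default parser.

```ruby
require "csv"

# Made-up sample data without stray quote characters.
data = <<~CSV
  350 lbs.,Weight,767 lbs.,300-2080
  350 lbs.,Weight,817 lbs.,300-2580
  350 lbs.,Weight,817 lbs.,300-2580
CSV

# CSV.parse returns an array of rows, each row an array of strings.
rows = CSV.parse(data)

# Arrays with equal contents are treated as duplicates by Array#uniq.
unique_rows = rows.uniq

# Array#to_csv (added by the csv library) turns each row back into a CSV line.
output = unique_rows.map(&:to_csv).join
puts output
```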

0

Hmm, I don't think this example will get you the answer you are looking for... but this will work...

tmp.txt =>

350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 

File.readlines('tmp.txt').uniq will return this:

350 lbs., Outrigger Footprint, 61" x 53", Weight, 767 lbs., 300-2080 
350 lbs., Outrigger Footprint, 61" x 53", Weight, 817 lbs., 300-2580 
350 lbs., Outrigger Footprint, 69" x 61", Weight, 867 lbs., 300-3080 

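So you can also easily use the Array functions to sort. Google "ruby array", and I'm sure you can learn how to select an entry based on a comparison with the desired string.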
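
As a rough illustration of the selection step mentioned above, `Array#reject` can drop every line whose last column matches an unwanted value. This is only a sketch: the `unwanted` list and the sample `lines` are made up for the example, and it assumes the value of interest is the last comma-separated field.

```ruby
# Sample data standing in for File.readlines('tmp.txt').uniq.
lines = [
  "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 767 lbs., 300-2080 \n",
  "350 lbs., Outrigger Footprint, 61\" x 53\", Weight, 817 lbs., 300-2580 \n",
  "350 lbs., Outrigger Footprint, 69\" x 61\", Weight, 867 lbs., 300-3080 \n"
]

# Hypothetical list of values whose rows should be deleted.
unwanted = ["300-2580"]

kept = lines.reject do |line|
  # Take the last comma-separated field; to_s guards against empty lines.
  last_column = line.chomp.split(",").last.to_s.strip
  unwanted.include?(last_column)
end

puts kept
```

Swapping `reject` for `select` would keep only the matching rows instead.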

0

You could also build a hash, since a hash does not allow duplicate keys. For example, the following code should help:

require 'optparse' 

options = { input_file: '' } 

OptionParser.new do |opts| 
  opts.banner = "Usage: remove_extras.rb [options] file1 ..." 

  opts.on('-i', '--input_file FILENAME', 'File to have extra rows removed') do |file| 
    options[:input_file] = file 
  end 
end.parse! 

# File.exists? was removed in Ruby 3.2; File.exist? works on all versions. 
if File.exist?(options[:input_file]) 
  puts "Parsing: #{options[:input_file]}" 

  # Hash keys are unique, so using each line as a key drops duplicates 
  # while keeping first-seen order (Ruby 1.9+). 
  uniq_lines = {} 
  File.foreach(options[:input_file]) { |row| uniq_lines[row] = true } 

  puts "Please enter the output filename:" 
  # Read from $stdin explicitly: a bare gets would read from any files left in ARGV. 
  # The block form closes the file; "a+" appends if the file already exists. 
  File.open($stdin.gets.chomp, "a+") do |out| 
    uniq_lines.each_key { |line| out.write(line) } 
  end 
end