2016-01-25 61 views
1

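Dump rows of a CSV file that contain a sequence of blank fields

I am trying to write a Python program to clean survey data from a CSV file. I want to drop rows that contain a run of blank fields, such as the first and third rows in the following example.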

"1","a","b","c",,,,, 
"2","a","b","c","d","e","f",,"h" 
"3","a","b","c",,,,, 
"4","a","z","u","d","i","f","x","h" 
"5","d","c","c",,"c","f","g","z" 

Here is my attempt so far, which does not drop those rows:

import csv 

fname = raw_input("Enter input file name: ") 
if len(fname) < 1 : fname = "survey.csv" 

foutput = raw_input("Enter output file name: ") 
if len(foutput) < 1 : foutput = "output_"+fname 


input = open(fname, 'rb') 
output = open(foutput, 'wb') 


searchFor = 5*[''] 

writer = csv.writer(output) 

for row in csv.reader(input): 
    if searchFor not in row : 
     writer.writerow(row) 

input.close() 
output.close() 
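A note on why the test in the code above keeps every row: `in` on a list checks whether any single element equals the left-hand value, not whether a sub-sequence occurs. The sketch below illustrates the difference in Python 3; the helper name `contains_run` is hypothetical and not part of the original post.

```python
row = ["1", "a", "b", "c", "", "", "", "", ""]
search_for = 5 * [""]

# `in` compares search_for against each element of row, one at a time.
# No element of row is itself a list, so the test is always False
# and `search_for not in row` is always True.
print(search_for in row)  # False


def contains_run(row, run):
    """Return True if `run` occurs as a contiguous slice of `row`."""
    n = len(run)
    return any(row[i:i + n] == run for i in range(len(row) - n + 1))


print(contains_run(row, search_for))  # True
```

Comparing slices of the row against the target list is one simple way to look for a contiguous run; it is quadratic in the worst case, which should not matter for rows of survey width.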

Answers

0

How about

# with the csv module's default settings, the reader returns an
# empty field as "" (an empty string), never None
blank_item = ""

for row in csv.reader(input): 
    # filter out all blank elements 
    blanks = [x for x in row if x == blank_item] 
    if len(blanks) < 5: 
     writer.writerow(row) 

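This counts the number of blank fields in each row and lets you drop rows as needed. Note that it counts every blank in the row, not only consecutive ones.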
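To see what the counting approach does with the sample data from the question, here is a small self-contained sketch in Python 3. It assumes blank fields come back from `csv.reader` as `""` and reads the sample from an in-memory string instead of a file; the `kept` list is only there for illustration.

```python
import csv
import io

# Sample rows from the question, without trailing spaces
sample = '''"1","a","b","c",,,,,
"2","a","b","c","d","e","f",,"h"
"3","a","b","c",,,,,
"4","a","z","u","d","i","f","x","h"
"5","d","c","c",,"c","f","g","z"
'''

kept = []
for row in csv.reader(io.StringIO(sample)):
    # collect every blank field in the row, wherever it occurs
    blanks = [x for x in row if x == ""]
    if len(blanks) < 5:
        kept.append(row[0])

print(kept)  # ['2', '4', '5']
```

Rows 1 and 3 each have five blank fields and are dropped. A row with five blanks scattered between answers would be dropped as well, since only the total is counted.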

+0

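I had to convert the row to a string first to get this to work. Even so, this approach still does not identify the blank fields: 'print blanks' returns an empty string. – Anjo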

+0

blanks will look empty: it is a list of all the blank items in the row. Those items are then counted for the filter. – timlyo

1

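Use Counter to check whether one list is a subset of another, as follows. If you want to remove the empty elements, use filter with None, bool, or len to filter out the blanks and discard them: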

import csv 
from itertools import repeat 
from collections import Counter 
input = open(fname, 'rb') 
output = open(foutput, 'wb') 

writer = csv.writer(output) 
# Helper function: True if every item in list1 occurs in list2
# at least as many times as it does in list1
def counterSubset(list1, list2):
    c1, c2 = Counter(list1), Counter(list2)
    for k, n in c1.items():
        if n > c2[k]:
            return False
    return True

for row in csv.reader(input):
    # 5 is the number of '' fields to look for; change it as needed
    if not counterSubset(list(repeat('', 5)), row):
        # use filter(None, row), filter(bool, row) or filter(len, row) to remove empty elements
        writer.writerow(row)
input.close() 
output.close() 

Output:

1,a,b,c,, 
2,a,b,c,d,e,f,g,h 
4,a,,z,u,d,i,f,x,h 
5,d,c,c,d,c,f,g,z 
+0

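This filters out the blanks, but not the rows that contain them. I am looking for a solution where, in this example, only rows 2, 4, and 5 are written. – Anjo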

+0

OK, edited the question. – SIslam

+0

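I am looking for a solution that works on a sequence of blanks, because in survey data a valid row can also contain some blank fields. I edited my example, but its formatting is not clear enough. – Anjo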
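One way to target runs of consecutive blanks rather than the total count, as the last comment asks for, might be to group adjacent fields with `itertools.groupby`. This is only a sketch in Python 3 and is not taken from any of the answers above. The names `longest_blank_run` and `rows_without_long_runs` are hypothetical, and row 6 is an invented row added to show a case with five scattered blanks.

```python
import csv
import io
from itertools import groupby


def longest_blank_run(row):
    """Return the length of the longest run of consecutive "" fields."""
    run_lengths = (
        sum(1 for _ in group)
        for is_blank, group in groupby(row, key=lambda field: field == "")
        if is_blank
    )
    return max(run_lengths, default=0)


def rows_without_long_runs(rows, max_run=5):
    """Yield only rows whose longest blank run is shorter than max_run."""
    for row in rows:
        if longest_blank_run(row) < max_run:
            yield row


# Rows 1-5 come from the question; row 6 is hypothetical
sample = '''"1","a","b","c",,,,,
"2","a","b","c","d","e","f",,"h"
"3","a","b","c",,,,,
"4","a","z","u","d","i","f","x","h"
"5","d","c","c",,"c","f","g","z"
"6",,"b",,"d",,"f",,
'''

kept_ids = [row[0] for row in rows_without_long_runs(csv.reader(io.StringIO(sample)))]
print(kept_ids)  # ['2', '4', '5', '6']
```

Row 6 has five blank fields in total but its longest run is two, so it is kept, while rows 1 and 3 end in a run of five blanks and are dropped. To write the result to a file, each yielded row can be passed to `csv.writer(...).writerow` in the same way as in the question's code.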
