2014-06-19 78 views

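Print rows of a CSV file that contain specified keywords

I am new to Python, but I want to do some data analysis on a few CSV files. I want to print only the rows of a CSV file that contain certain keywords. I use the first block to print all valid rows. From those rows, I want to print the ones that include a keyword. Thanks for your help.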

import csv
import sys

csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for f in ['1.csv']:
    reader = csv.reader(open(f, 'rU'), delimiter='|', quotechar='\\')
    for row in reader:
        try:
            print row[2]
            valids += 1
        except:
            invalids += 1
print 'parsed %s records. ignored %s' % (valids, invalids)

With the keywords:

for w in ['ford', 'hyundai','honda', 'jeep', 'maserati','audi','jaguar', 'volkswagen','chevrolet','chrysler']: 

I think I need an if statement to filter my earlier code, but I have been struggling with this for hours and can't seem to get it to work.

Which column do you want to search for the keywords?

The file is a single-column CSV (so the first column). Thanks – user133474

So you don't need the 'csv' module at all.
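
The point made in the comment above can be sketched without the csv module: if every line holds a single column, each line can be tested directly. This is only a minimal sketch in Python 3, not code from the thread. The function name lines_with_keywords and the sample lines are hypothetical, and it assumes that a case-insensitive substring match is what is wanted.

```python
KEYWORDS = ('ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi',
            'jaguar', 'volkswagen', 'chevrolet', 'chrysler')


def lines_with_keywords(lines, keywords=KEYWORDS):
    """Yield each line that contains at least one keyword as a substring.

    Hypothetical helper: the match is case-insensitive and works on any
    iterable of strings, including an open text file.
    """
    for line in lines:
        text = line.rstrip('\r\n')
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            yield text


# Hypothetical sample lines standing in for the contents of 1.csv.
sample = ['2012 Ford Focus for sale\n',
          'Used bicycle\n',
          'honda civic, low mileage\n']
matches = list(lines_with_keywords(sample))
for line in matches:
    print(line)
```

To run this against the file from the question, an open file object can be passed in place of the sample list, for example inside "with open('1.csv') as f:". Note that substring matching is loose: 'audi' would also match a line containing 'Saudi', so a word-based comparison may be preferable for real data.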

Answer

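Your guess is correct. All you need to do is filter the rows with an if statement that checks whether each field matches a keyword. Here is how you can do it (I also made some improvements to your code and explained them in the comments):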

import csv
import sys

# First, create a set of the keywords. Membership tests are faster on a set
# than on a list. The curly brackets create a set.
keywords = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi', 'jaguar',
            'volkswagen', 'chevrolet', 'chrysler'}
csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for filename in ['1.csv']:
    # The with statement makes sure that your file is closed automatically,
    # even when an error occurs. This is a common idiom.
    # In addition, in Python 2 CSV files should be opened in 'rb' mode.
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter='|', quotechar='\\')
        for row in reader:
            try:
                print row[2]
                valids += 1
            # Don't use bare except clauses. They catch
            # exceptions you don't want or intend to catch.
            except IndexError:
                invalids += 1
            # The filtering is done here.
            for field in row:
                if field in keywords:
                    print row
                    break
# Prefer the str.format() method over the old style string formatting.
print 'parsed {0} records. ignored {1}'.format(valids, invalids)
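
The code above is written for Python 2. For readers on Python 3, the same filtering idea could look roughly like the sketch below. It is not the answerer's code: the function name rows_with_keywords and the in-memory sample are hypothetical, and stripping and lower-casing each field before the comparison is an added assumption. In Python 3 the csv module expects a text-mode file opened with newline='' instead of 'rb'.

```python
import csv
import io

KEYWORDS = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi',
            'jaguar', 'volkswagen', 'chevrolet', 'chrysler'}


def rows_with_keywords(csv_file, keywords=KEYWORDS):
    """Yield each parsed row in which at least one field equals a keyword.

    Hypothetical helper: fields are stripped and lower-cased before the
    set lookup, which is an assumption not made in the code above.
    """
    reader = csv.reader(csv_file, delimiter='|', quotechar='\\')
    for row in reader:
        if any(field.strip().lower() in keywords for field in row):
            yield row


# Hypothetical in-memory sample standing in for 1.csv.
sample = io.StringIO('1|2014|ford\n2|2013|bicycle\n3|2012|Honda\n', newline='')
matches = list(rows_with_keywords(sample))
for row in matches:
    print(row)
```

With a real file, the sample would be replaced by a file object, for example one created with "with open('1.csv', newline='') as f:". The set lookup only matches a field that is exactly a keyword; a field such as 'ford focus' would need a substring or word-based test.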