2014-02-20 27 views
0

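Get a value from a file if a match is found among rows with the same ID in Python

I have a file containing rows of data. Each row starts with an id, followed by a fixed set of attributes separated by commas.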

123,2,kent,..., 
123,2,bob,..., 
123,2,sarah,..., 
123,8,may,..., 

154,4,sheila,..., 
154,4,jeff,..., 

175,3,bob,..., 

249,2,jack,..., 
249,5,bob,..., 
249,3,rose,..., 

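I want to get an attribute if a condition is met. The condition: if 'bob' appears within an id, get the value of the second attribute from the rows that follow it in the same id.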

For example: 

id: 123 
values returned: 2, 8 

id: 249 
values returned: 3 

In Java I could use a double loop, but I would like to try this in Python. Any suggestions would be great.

+0

Why is the value returned for ID 249 '3' instead of '2, 5, 3'? – aIKid

+0

Ah wait, I see.. – aIKid

Answers

1

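I came up with a (perhaps) more Pythonic solution using groupby and dropwhile. It produces the same result as the approach further down, but I think it is prettier. :) Flags, "curr_id" and similar things are not very Pythonic and should be avoided where possible!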

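import csv
from itertools import groupby, dropwhile

goal = 'bob'
ids = {}

with open('my_data.csv') as ifile:
    reader = csv.reader(ifile)
    # groupby merges consecutive rows that share the same id (column 0)
    for key, rows in groupby(reader, key=lambda r: r[0]):
        # drop rows until the goal name appears; the matching row comes first
        matched_rows = list(dropwhile(lambda r: r[2] != goal, rows))
        if len(matched_rows) > 1:
            # skip the matching row, keep the second column of the rest
            ids[key] = [row[1] for row in matched_rows[1:]]

print(ids)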

(My first solution is below.)

from collections import defaultdict
import csv

curr_id = None
found = False
goal = 'bob'
ids = defaultdict(list)

with open('my_data.csv') as ifile:
    for row in csv.reader(ifile):
        # a new id starts: reset the flag
        if row[0] != curr_id:
            found = False
            curr_id = row[0]
        if found:
            # the goal was already seen for this id, so collect the value
            ids[curr_id].append(row[1])
        elif row[2] == goal:
            found = True

print(dict(ids))

Output:

{'123': ['2', '8'], '249': ['3']} 
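For anyone who wants to try the groupby/dropwhile idea without creating a file, here is a minimal sketch that wraps it in a function and feeds it the sample rows from the question. The function name `values_after` and the `SAMPLE` constant are made up for illustration. The sketch assumes the blank separator lines may really be present in the data (they are skipped) and that rows with the same id are adjacent, since groupby only merges consecutive rows.

```python
import csv
import io
from itertools import dropwhile, groupby

# Sample rows copied from the question, including the blank separator lines.
SAMPLE = """\
123,2,kent,...,
123,2,bob,...,
123,2,sarah,...,
123,8,may,...,

154,4,sheila,...,
154,4,jeff,...,

175,3,bob,...,

249,2,jack,...,
249,5,bob,...,
249,3,rose,...,
"""


def values_after(lines, goal):
    """Map each id to the second-column values of the rows after `goal`."""
    # csv.reader yields [] for blank lines; skip them so r[0] is safe.
    rows = (r for r in csv.reader(lines) if r)
    result = {}
    # groupby only merges adjacent rows, so equal ids must be contiguous.
    for key, group in groupby(rows, key=lambda r: r[0]):
        matched = list(dropwhile(lambda r: r[2] != goal, group))
        if len(matched) > 1:  # the goal row plus at least one follower
            result[key] = [r[1] for r in matched[1:]]
    return result


print(values_after(io.StringIO(SAMPLE), "bob"))
```

With the sample data this prints {'123': ['2', '8'], '249': ['3']}. If the file is not sorted or grouped by id, the rows would need to be sorted by id first, or the flag-based approach used instead.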
+0

+1 Ah, this is better than my answer, which has an empty list for 175 – bernie

+0

@bernie Thanks man, but I'm not satisfied with this solution - I don't like using flags, curr_id and so on... :) –

+0

@bernie I found another solution, in case you're interested :) –

0

Just set a flag or something similar as you iterate:

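name = 'bob'
target_id = '123'  # renamed from `id`, which shadows a builtin
found = False

# 'my_data.csv' is an assumed file name
with open('my_data.csv') as f:
    for line in f:
        l = line.strip().split(',')
        if len(l) > 2 and l[0] == target_id:
            if found:
                # only rows after the matching one are printed
                print(l[1])
            elif l[2] == name:
                found = True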
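The flag idea can also be wrapped in a small function so it is easy to try on in-memory lines. This is only a sketch: `values_after_name` and `sample` are hypothetical names, and it assumes that only the rows after the matching name should be collected, as in the question's expected output. Testing `found` before comparing the name is what keeps the matching row itself out of the result.

```python
def values_after_name(lines, target_id, name):
    """Return second-column values for rows of target_id that follow `name`."""
    found = False
    values = []
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        # Ignore blank or short lines and rows that belong to other ids.
        if len(fields) < 3 or fields[0] != target_id:
            continue
        if found:
            values.append(fields[1])
        elif fields[2] == name:
            # Set the flag only; the matching row is not collected.
            found = True
    return values


# Rows taken from the question's sample data.
sample = [
    "123,2,kent,...,",
    "123,2,bob,...,",
    "123,2,sarah,...,",
    "123,8,may,...,",
    "175,3,bob,...,",
    "249,2,jack,...,",
    "249,5,bob,...,",
    "249,3,rose,...,",
]

print(values_after_name(sample, "123", "bob"))
```

For id 123 this gives ['2', '8'], matching the values listed in the question.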
0
import csv, collections as co, io  # io.StringIO replaces Python 2's cStringIO

s = '''123,2,kent,...,
123,2,bob,...,
123,2,sarah,...,
123,8,may,...,
154,4,sheila,...,
154,4,jeff,...,
175,3,bob,...,
249,2,jack,...,
249,5,bob,...,
249,3,rose,...,'''

filelikeobject = io.StringIO(s)
dd = co.defaultdict(list)
cr = csv.reader(filelikeobject)
for line in cr:
    if line[2] == 'bob':
        # touching the key creates an empty list for this id
        dd[line[0]]
        continue
    if line[0] in dd:
        # 'bob' was already seen for this id, so collect the value
        dd[line[0]].append(line[1])

Result:

>>> dd
defaultdict(<class 'list'>, {'123': ['2', '8'], '175': [], '249': ['3']})
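If ids such as 175, where nothing follows 'bob', should be left out as in the question's expected output, the empty lists can be filtered out afterwards. A minimal sketch, assuming `dd` holds the contents shown in the result above (`non_empty` is just an illustrative name):

```python
from collections import defaultdict

# Stand-in for the `dd` built above, with the same contents.
dd = defaultdict(list, {'123': ['2', '8'], '175': [], '249': ['3']})

# Keep only the ids that collected at least one value.
non_empty = {key: values for key, values in dd.items() if values}

print(non_empty)
```

This leaves {'123': ['2', '8'], '249': ['3']}.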