2015-09-03 46 views
0

Combining multiple values in Python

I have data in the following format (a CSV file):

id, review 
1, the service was great! 
1, staff was friendly. 
2, nice location 
2, but the place was not clean 
2, the motel was okay 
3, i wouldn't stay there next time 
3, do not stay there 

I want to change the data into the following format:

1, the service was great! staff was friendly. 
2, nice location but the place was not clean the motel was okay 
3, i wouldn't stay there next time do not stay there 

Any help would be greatly appreciated.

+0

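What have you done so far? Do you have code that reads the file, assuming it is a real CSV file with `,` as the delimiter? And what is the matching criterion, given that the last line does not start with '1' but was still appended to the line before it? – albert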

+0

Take a look at `itertools.groupby`. – Kevin

+0

@albert I corrected the output. – kevin

Answer

1

You can use itertools.groupby to group consecutive entries that have the same id.

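import csv
import itertools
import operator

with open("test.csv", newline="") as f:
    # skipinitialspace drops the blank that follows each comma
    reader = csv.reader(f, delimiter=",", skipinitialspace=True)
    next(reader)  # skip the header line
    # groupby only merges adjacent rows, so equal ids must be consecutive
    for key, group in itertools.groupby(reader, key=operator.itemgetter(0)):
        print(key, " ".join(row[1].strip() for row in group))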

Output:

1 the service was great! staff was friendly. 
2 nice location but the place was not clean the motel was okay 
3 i wouldn't stay there next time do not stay there 

Note:

id, review 
1, the service was great! 
... 
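
Since itertools.groupby only merges rows that sit next to each other, here is a rough sketch of an alternative for files where the same id may show up in non-adjacent rows. It is not part of the answer above: the function name `combine_reviews` and the sample data are made up for illustration, and it assumes the file starts with a header line and that the id is always the first field.

```python
import csv
import io
from collections import defaultdict


def combine_reviews(lines):
    """Return {id: combined review text}, merging all rows that share an id."""
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader)  # skip the "id, review" header line
    combined = defaultdict(list)
    for row in reader:
        if not row:  # ignore blank lines
            continue
        # Re-join with ", " in case the review itself contained commas
        # (assumption: the exact spacing after such commas does not matter).
        combined[row[0]].append(", ".join(row[1:]).strip())
    return {key: " ".join(parts) for key, parts in combined.items()}


# Hypothetical sample in which the rows for id 1 are not adjacent.
sample = io.StringIO(
    "id, review\n"
    "1, the service was great!\n"
    "2, nice location\n"
    "1, staff was friendly.\n"
)
result = combine_reviews(sample)
for key, text in result.items():
    print(f"{key}, {text}")
```

For a real file you would pass the open file object instead of the `io.StringIO` sample. The ids come out in the order in which they first appear, because dicts keep insertion order.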
+0

This is exactly what I was looking for. – kevin