Use a list comprehension with groupby:
import pandas as pd
from itertools import groupby

df = pd.DataFrame({'a': [['Andhra Pradesh-133', 'Meetai-1358', 'Meetai-2146', 'Meetai-2277'],
                         ['Andhra Pradesh-20', 'Rajasthan-60', 'Rajasthan-70']]})

data = []
for x in df['a']:
    # split each 'name-number' string into [name, number]
    b = [a.split('-') for a in x]
    # sum the numbers within each run of consecutive equal names
    L = [t for k, g in groupby(b, key=lambda x: x[0])
         for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(L)

print(data)
[['Andhra Pradesh-133', 'Meetai-5781'], ['Andhra Pradesh-20', 'Rajasthan-130']]
df['b'] = data
print (df)
                                                   a  \
0  [Andhra Pradesh-133, Meetai-1358, Meetai-2146,...
1    [Andhra Pradesh-20, Rajasthan-60, Rajasthan-70]

                                    b
0   [Andhra Pradesh-133, Meetai-5781]
1  [Andhra Pradesh-20, Rajasthan-130]
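
To see what groupby does inside the comprehension, here is a small trace for the first row only. It is a sketch that uses just the standard library. The names row, pairs and totals are mine, not from the answer above.

```python
from itertools import groupby

row = ['Andhra Pradesh-133', 'Meetai-1358', 'Meetai-2146', 'Meetai-2277']

# Split each 'name-number' string into a [name, number] pair.
pairs = [item.split('-') for item in row]

totals = []
# groupby yields one (key, group) per run of adjacent pairs that share a name.
for name, group in groupby(pairs, key=lambda pair: pair[0]):
    total = sum(int(number) for _, number in group)
    totals.append(name + '-' + str(total))

print(totals)  # ['Andhra Pradesh-133', 'Meetai-5781']
```

The three adjacent Meetai entries form one run, so their numbers are summed into a single item.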
Edit:
data = []
for line in open('file.csv'):
    # strip new-line characters, split by [ and take the second part
    items = line.strip('\r\n" ]').split('[')[1]
    # split into items, remove quotes and whitespace
    items = [item.strip("' ") for item in items.split(',')]
    # split each item into a [name, number] sublist
    items = [a.split('-') for a in items]
    # sum the numbers within each run of consecutive equal names
    items = [t for k, g in groupby(items, key=lambda x: x[0])
             for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(items)

print(data)
[['Andhra Pradesh-133', 'Meetai-5781'], ['Andhra Pradesh-20', 'Rajasthan-130']]
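
The contents of file.csv are not shown in this thread, so as an illustration here is the same chain of string operations applied to one in-memory line. The line format below (an index, a comma, then a quoted Python-style list) is my assumption, chosen because it is what the strip and split calls appear to expect.

```python
from itertools import groupby

# Hypothetical sample line; the real file.csv layout is an assumption.
line = "0,\"['Andhra Pradesh-20', 'Rajasthan-60', 'Rajasthan-70']\"\n"

# Drop the trailing newline, quote and closing bracket, then keep what follows '['.
body = line.strip('\r\n" ]').split('[')[1]

# Split on commas and strip the surrounding quotes and spaces from each item.
names = [item.strip("' ") for item in body.split(',')]

# Split into [name, number] pairs and sum each run of equal names.
pairs = [name.split('-') for name in names]
result = [key + '-' + str(sum(int(number) for _, number in group))
          for key, group in groupby(pairs, key=lambda pair: pair[0])]

print(result)  # ['Andhra Pradesh-20', 'Rajasthan-130']
```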
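Edit: if the input file contains a doubled opening bracket ('[['), the solution is to split on the first occurrence of [ only, and then strip [ and ] as well: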
data = []
for line in open('file.csv'):
    # strip new-line characters, split on the first [ only and take the rest
    items = line.strip('\r\n" ]').split('[', 1)[1]
    # split into items, remove quotes, brackets and whitespace
    items = [item.strip("'[] ") for item in items.split(',')]
    # split each item into a [name, number] sublist
    items = [a.split('-') for a in items]
    print(items)
    # sum the numbers within each run of consecutive equal names
    items = [t for k, g in groupby(items, key=lambda x: x[0])
             for t in [k + '-' + str(sum((int(j) for i, j in g)))]]
    data.append(items)
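
A short comparison may help show why the maxsplit argument matters. The sample line is hypothetical (the actual file is not shown) and assumes the list is wrapped in a second pair of brackets.

```python
# Hypothetical line with a doubled bracket; the real file layout is an assumption.
line = "0,\"[['Andhra Pradesh-20', 'Rajasthan-60']]\"\n"
stripped = line.strip('\r\n" ]')

# Without a limit, the text between the two '[' characters is an empty string.
without_limit = stripped.split('[')[1]

# With maxsplit=1, everything after the first '[' is kept in one piece.
with_limit = stripped.split('[', 1)[1]

# Stripping '[' and ']' from each item removes the leftover inner bracket.
items = [item.strip("'[] ") for item in with_limit.split(',')]

print(repr(without_limit))  # ''
print(items)                # ['Andhra Pradesh-20', 'Rajasthan-60']
```

With the empty string, the later split on '-' produces a one-element list, and unpacking it as a (name, number) pair raises a ValueError.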
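I have a small doubt here. If I take x = ['Panjim-20', 'Uttar Pradesh-23185', " Gujurat-1013', 'Uttar Pradesh-51'], groupby does not seem to work. With b = [a.split('-') for a in x] and for k, g in groupby(b, key=lambda x: x[0]):, the result is not grouped by 'Uttar Pradesh', and the two 'Uttar Pradesh' entries are not treated as the same. Can you help me understand what I am missing? –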
I think the problem is the double '[['. I edited the answer. – jezrael
Apologies for the typo in the list I am trying to process: x = ['panjim-20','Uttar Pradesh-23185','Gujurat-1013','Uttar Pradesh-51']. –
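
A note on the list in the comments: itertools.groupby only merges consecutive items with equal keys, so the two 'Uttar Pradesh' entries, which are separated by 'Gujurat', end up in different groups. One option is to sort the pairs by name before calling groupby. Another, sketched below, is to accumulate the totals in a dict, which keeps the order of first appearance. This is an alternative I am suggesting, not part of the answer above, and the names pairs, totals and result are mine.

```python
from itertools import groupby

x = ['panjim-20', 'Uttar Pradesh-23185', 'Gujurat-1013', 'Uttar Pradesh-51']
pairs = [item.split('-') for item in x]

# groupby on unsorted input: 'Uttar Pradesh' shows up as two separate groups.
unsorted_keys = [name for name, _ in groupby(pairs, key=lambda pair: pair[0])]
print(unsorted_keys)  # ['panjim', 'Uttar Pradesh', 'Gujurat', 'Uttar Pradesh']

# Accumulate totals per name; dicts keep insertion order in Python 3.7+.
totals = {}
for name, number in pairs:
    totals[name] = totals.get(name, 0) + int(number)

result = [name + '-' + str(total) for name, total in totals.items()]
print(result)  # ['panjim-20', 'Uttar Pradesh-23236', 'Gujurat-1013']
```

Sorting first, for example groupby(sorted(pairs, key=lambda pair: pair[0]), key=lambda pair: pair[0]), gives the same totals but returns the names in alphabetical order instead of input order.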