2017-05-09 35 views
2

A pandas DataFrame has a column containing lists. How do I repeat the row for each value in the list?

I have a pandas DataFrame like this:

   title  author              year  type
0  t1     a1                  1980  article
1  t2     ['a2', 'a3', 'a4']  1983  article
2  t3     a5                  1982  article
3  t4     a6                  1977  article
4  t5     ['a7','a8']         2011  book

This is a short example; the original DataFrame is much larger.

I need a DataFrame like this:

   title  author  year  type
0  t1     a1      1980  article
1  t2     a2      1983  article
2  t2     a3      1983  article
3  t2     a4      1983  article
4  t3     a5      1982  article
5  t4     a6      1977  article
6  t5     a7      2011  book
7  t5     a8      2011  book

Note that the lists have different numbers of elements.

Possible duplicate of http://stackoverflow.com/questions/27263805/pandas-when-cell-contents-are-lists-create-a-row-for-each-element-in-the-list – bigbounty

Answers

1
import pandas as pd

# Expand each list of authors into separate rows and build an authors DataFrame
df_author = df.author.apply(pd.Series).stack().rename('author').reset_index()

# Join the authors DataFrame back to the original on the original row index
pd.merge(df_author, df, left_on='level_0', right_index=True, suffixes=('', '_old'))[df.columns]

Out[184]:
   title  author  year  type
0  t1     a1      1980  article
1  t2     a2      1983  article
2  t2     a3      1983  article
3  t2     a4      1983  article
4  t3     a5      1982  article
5  t4     a6      1977  article
6  t5     a7      2011  book
7  t5     a8      2011  book
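
On pandas 0.25 or later, `DataFrame.explode` may be a shorter route to the same result. The sketch below rebuilds the question's example by hand and assumes the `author` column holds real Python lists (not their string representation); the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the example from the question; list cells are real Python lists here.
df = pd.DataFrame({
    "title": ["t1", "t2", "t3", "t4", "t5"],
    "author": ["a1", ["a2", "a3", "a4"], "a5", "a6", ["a7", "a8"]],
    "year": [1980, 1983, 1982, 1977, 2011],
    "type": ["article", "article", "article", "article", "book"],
})

# explode() repeats a row once per list element; scalar cells are left unchanged.
# reset_index(drop=True) replaces the repeated original index with 0..n-1.
result = df.explode("author").reset_index(drop=True)
print(result)
```

If the cells are strings that merely look like lists, `explode` leaves them untouched, so they would need to be parsed first.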

Doesn't work. The result is the same as the first DF (with the lists). – IvanMarkus

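I think that when the DataFrame is created, the list elements in the author column are not interpreted as lists. The DataFrame is built from a CSV with df = pd.read_csv('./file.csv', names=['title','author','year','type'], header=0, sep=';', low_memory=False). That is why your solution doesn't work. What can I do? – IvanMarkus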
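
One possible way around this, sketched under assumptions: `read_csv` loads a cell such as `['a2', 'a3', 'a4']` as a plain string, so it can be converted to a real list with `ast.literal_eval` before expanding. The inline CSV text and the `parse_authors` helper below are hypothetical stand-ins for the real file, and `explode` requires pandas 0.25 or later.

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for ./file.csv, using the same ';' separator.
csv_text = """title;author;year;type
t1;a1;1980;article
t2;['a2', 'a3', 'a4'];1983;article
t5;['a7','a8'];2011;book
"""

df = pd.read_csv(io.StringIO(csv_text), sep=";")


def parse_authors(cell):
    """Turn "['a2', 'a3']" into a real list; leave plain names such as "a1" as they are."""
    text = str(cell).strip()
    if text.startswith("[") and text.endswith("]"):
        return ast.literal_eval(text)
    return cell


# After parsing, list cells are real lists and can be expanded into rows.
df["author"] = df["author"].apply(parse_authors)
result = df.explode("author").reset_index(drop=True)
print(result)
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for text read from a file.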
