2017-03-17 37 views

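Create new columns by extracting elements in Python

I have a DataFrame in which some cells hold comma-separated lists, and I want to extract each list element into its own column.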

import pandas as pd

df = pd.DataFrame({"Q007_A00": ["Y","Y","Y","Y","Y"],
                   "Q007_B00": ["N","N","N","N","N"],
                   "Q007_C01": [1,4,5,2,"8,3"],
                   "Q007_C02": ["Text 1","Text 2","Text 3,Text 4,Text 5","Text 4","Text 5,Text 6"]})

  Q007_A00 Q007_B00 Q007_C01              Q007_C02
0        Y        N        1                Text 1
1        Y        N        4                Text 2
2        Y        N        5  Text 3,Text 4,Text 5
3        Y        N        2                Text 4
4        Y        N      8,3         Text 5,Text 6

The desired output is:

Q007_A00 Q007_B00 Q007_C01 Q007_C01_1 Q007_C02 Q007_C02_1 Q007_C02_2
Y        N        1        0          Text 1   0          0
Y        N        4        0          Text 2   0          0
Y        N        5        0          Text 3   Text 4     Text 5
Y        N        2        0          Text 4   0          0
Y        N        8        3          Text 5   Text 6     0

Each new column name gets a numeric suffix that increases by 1 (_1, _2, and so on).

Answer

You can use concat with a list comprehension and str.split:

df = pd.concat([df[x].astype(str).str.split(',', expand=True) for x in df], 
       axis=1, 
       keys=df.columns).fillna(0) 
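
To see what the two building blocks do on their own, here is a minimal sketch on a single made-up column. The Series `s` and the names `parts`, `wide`, and `column_tuples` are illustrative assumptions, not part of the original data.

```python
import pandas as pd

# Hypothetical one-column example, not taken from the question's data
s = pd.Series(["1", "4", "8,3"], name="C01")

# expand=True returns a DataFrame with one column per split position;
# rows with fewer pieces are padded with missing values
parts = s.str.split(",", expand=True)

# keys= places the given label as an outer column level above the
# integer positions, which is what produces the MultiIndex columns
wide = pd.concat([parts], axis=1, keys=["C01"])
column_tuples = list(wide.columns)  # pairs of (source name, position)
```

Here the column labels come out as the pairs ("C01", 0) and ("C01", 1), so every piece still carries the name of the column it was split from.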

The resulting MultiIndex columns can then be flattened with a list comprehension:

df.columns = ['{}_{}'.format(col[0], col[1]) for col in df.columns] 
print(df)
  Q007_A00_0 Q007_B00_0 Q007_C01_0 Q007_C01_1 Q007_C02_0 Q007_C02_1 Q007_C02_2
0          Y          N          1          0     Text 1          0          0
1          Y          N          4          0     Text 2          0          0
2          Y          N          5          0     Text 3     Text 4     Text 5
3          Y          N          2          0     Text 4          0          0
4          Y          N          8          3     Text 5     Text 6          0

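If you need the names without the _0 suffix, apply the following renaming instead of the previous one, while the columns are still a MultiIndex: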

df.columns = ['{}{}'.format(col[0], '' if col[1] == 0 else '_' + str(col[1])) 
                     for col in df.columns] 
print(df)
  Q007_A00 Q007_B00 Q007_C01 Q007_C01_1 Q007_C02 Q007_C02_1 Q007_C02_2
0        Y        N        1          0   Text 1          0          0
1        Y        N        4          0   Text 2          0          0
2        Y        N        5          0   Text 3     Text 4     Text 5
3        Y        N        2          0   Text 4          0          0
4        Y        N        8          3   Text 5     Text 6          0
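
For reuse, the steps of this answer could be wrapped in one function that leaves the input frame untouched. This is only a sketch: `split_columns` and its `sep` and `fill` parameters are hypothetical names, and the cast to object dtype is an added assumption so that an integer fill value can sit next to string cells.

```python
import pandas as pd


def split_columns(df, sep=",", fill=0):
    """Split every column on `sep` and spread the pieces over numbered columns.

    Hypothetical helper; the name and parameters are illustrative only.
    """
    # One expanded frame per source column, combined under the source names
    pieces = [df[col].astype(str).str.split(sep, expand=True) for col in df]
    wide = pd.concat(pieces, axis=1, keys=df.columns)
    # Assumption: object dtype lets string cells and a non-string fill coexist
    wide = wide.astype(object).fillna(fill)
    # Position 0 keeps the original name; later positions get _1, _2, ...
    wide.columns = [
        name if pos == 0 else "{}_{}".format(name, pos)
        for name, pos in wide.columns
    ]
    return wide


df = pd.DataFrame({
    "Q007_A00": ["Y", "Y", "Y", "Y", "Y"],
    "Q007_B00": ["N", "N", "N", "N", "N"],
    "Q007_C01": [1, 4, 5, 2, "8,3"],
    "Q007_C02": ["Text 1", "Text 2", "Text 3,Text 4,Text 5", "Text 4", "Text 5,Text 6"],
})
result = split_columns(df)
```

Note that astype(str) turns every original value into a string, so the 1 in Q007_C01 becomes "1", while the padded cells hold the integer 0 unless a different `fill` is passed.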