Fast way to split rows

I have the following DataFrame:

import pandas as pd 
df = pd.DataFrame({'Probes':["1415693_at","1415693_at"], 
        'Genes':["Canx","LOC101056688 /// Wars "], 
        'cv_filter':[ 0.134,0.290], 
        'Organ' :["LN","LV"]} )  
df = df[["Probes","Genes","cv_filter","Organ"]]

It looks like this:

In [16]: df 
Out[16]: 
     Probes     Genes cv_filter Organ 
0 1415693_at     Canx  0.134 LN 
1 1415693_at LOC101056688 /// Wars  0.290 LV

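What I want to do is split each row whose entry in the Genes column contains several names separated by '///', producing one row per name.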

The result I hope to get is:

 Probes     Genes cv_filter Organ 
0 1415693_at     Canx  0.134 LN 
1 1415693_at   LOC101056688  0.290 LV 
2 1415693_at     Wars  0.290 LV

I have about 150,000 rows in total to process. Is there a fast way to do this?


2016-03-03 neversaint

You can try first applying str.split to the Genes column, building a new Series from the pieces, and then joining it back to the original df:

import pandas as pd 
df = pd.DataFrame({'Probes':["1415693_at","1415693_at"], 
        'Genes':["Canx","LOC101056688 /// Wars "], 
        'cv_filter':[ 0.134,0.290], 
        'Organ' :["LN","LV"]} )  
df = df[["Probes","Genes","cv_filter","Organ"]] 
print(df)
     Probes     Genes cv_filter Organ 
0 1415693_at     Canx  0.134 LN 
1 1415693_at LOC101056688 /// Wars  0.290 LV 

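# split each Genes entry on '///' and strip the whitespace left around each name
s = pd.DataFrame([[g.strip() for g in x.split('///')] for x in df['Genes'].tolist()], index=df.index).stack()
# or you can use the approach from the comment
# s = df['Genes'].str.split('///', expand=True).stack().str.strip()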

s.index = s.index.droplevel(-1) 
s.name = 'Genes' 
print(s)
0    Canx 
1 LOC101056688 
1   Wars 
Name: Genes, dtype: object 

# drop the original Genes column before joining; otherwise join raises:
# ValueError: columns overlap but no suffix specified: Index([u'Genes'], dtype='object')
df = df.drop('Genes', axis=1)

df = df.join(s).reset_index(drop=True) 
print(df[["Probes","Genes","cv_filter","Organ"]])
     Probes   Genes cv_filter Organ 
0 1415693_at   Canx  0.134 LN 
1 1415693_at LOC101056688  0.290 LV 
2 1415693_at   Wars  0.290 LV
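
For readers on a newer pandas, here is a minimal sketch of the same row split using `DataFrame.explode`. This is not the method from the answer above: `explode` was added in pandas 0.25, after this answer was written, so the sketch assumes that version or later. The variable name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Probes': ["1415693_at", "1415693_at"],
    'Genes': ["Canx", "LOC101056688 /// Wars "],
    'cv_filter': [0.134, 0.290],
    'Organ': ["LN", "LV"],
})

# Turn each Genes string into a list of names, then emit one row per list element.
# The other columns are repeated for every name that came from the same row.
result = (
    df.assign(Genes=df['Genes'].str.split('///'))
      .explode('Genes')
      .reset_index(drop=True)
)

# Strip the whitespace that the ' /// ' separator leaves around each name.
result['Genes'] = result['Genes'].str.strip()

print(result[["Probes", "Genes", "cv_filter", "Organ"]])
```

Because the split and the row expansion happen in one chain, there is no need to drop the Genes column and join it back afterwards.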


2016-03-03 10:14:38 jezrael

Why not `df['Genes'].str.split('///', expand=True).stack()` instead of `df['Genes'].str.split('///').apply(pd.Series, 1).stack()`? It is twice as fast. –

@AntonProtopopov - Thanks. I added it to my answer as an alternative solution (it is only slightly slower than the DataFrame constructor). – jezrael
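
The speed comparison in these comments can be checked with a rough timing sketch like the one below. The synthetic data (2,000 rows built by repeating the two sample entries) and the helper names `via_constructor` and `via_expand` are assumptions for illustration. Actual timings depend on the pandas version and the data, so no numbers are claimed here.

```python
import timeit
import pandas as pd

# Synthetic data (an assumption): the two sample entries repeated 1,000 times.
genes = pd.Series(["Canx", "LOC101056688 /// Wars "] * 1000)

def via_constructor():
    # Plain Python split, then the DataFrame constructor, as in the answer.
    pieces = [x.split('///') for x in genes.tolist()]
    return pd.DataFrame(pieces, index=genes.index).stack().dropna()

def via_expand():
    # Vectorised string split with expand=True, as suggested in the comment.
    return genes.str.split('///', expand=True).stack().dropna()

runs = 5
for fn in (via_constructor, via_expand):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs:.4f} s per run")
```

The trailing `.dropna()` only makes sure the padding for rows with a single name is discarded regardless of how `stack` treats missing values.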

For that solution, your `s` is a DataFrame without a MultiIndex. –
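
To make the index point concrete, here is a small sketch of what each step returns. `str.split(..., expand=True)` alone gives a DataFrame with an ordinary index, and it is `.stack()` that produces a Series with a two-level MultiIndex. That is why the answer calls `droplevel(-1)` before joining. The names `wide` and `stacked` are illustrative only.

```python
import pandas as pd

genes = pd.Series(["Canx", "LOC101056688 /// Wars "], name="Genes")

# One column per piece; rows with fewer pieces are padded with missing values.
wide = genes.str.split('///', expand=True)
print(type(wide).__name__, wide.shape, wide.index.nlevels)

# Stacking moves the columns into a second index level: (row label, piece number).
stacked = wide.stack().dropna()
print(type(stacked).__name__, stacked.index.nlevels)

# Dropping the inner level leaves only the original row label, repeated per piece,
# which is what allows the later join back onto df.
stacked.index = stacked.index.droplevel(-1)
print(stacked.index.tolist())
```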
