2016-05-18 38 views
2

How do I convert a Series index into two columns of a DataFrame? I have the following pandas Series:

import io

import pandas as pd

test = u"""probegenes,sample1
1415777_at Pnliprp1,20
1415884_at Cela3b,47
1415805_at Clps,17
1115805_at Ckkk,77
"""
# Use the combined "probe gene" column as the index.
df_test = pd.read_csv(io.StringIO(test), index_col='probegenes')
my_series = df_test['sample1']
my_series

It looks like this:

In [62]: my_series
Out[62]:
probegenes
1415777_at Pnliprp1    20
1415884_at Cela3b      47
1415805_at Clps        17
1115805_at Ckkk        77
Name: sample1, dtype: int64

What I want to do is split the 'probegenes' index so that I get a new DataFrame:

        Probe     Genes  Score
0  1415777_at  Pnliprp1     20
1  1415884_at    Cela3b     47
2  1415805_at      Clps     17
3  1115805_at      Ckkk     77

How can I do this?

Answers

3

You can call .str.split(expand=True) on the index after converting it to a Series, and then use pd.concat() to join the result with the original column:

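# Split the index on whitespace into two columns and join them to the scores.
df = pd.concat([my_series, my_series.index.to_series().str.split(expand=True)], axis=1).reset_index(drop=True)
# rename() returns a new DataFrame, so assign the result back.
df = df.rename(columns={'sample1': 'Score', 0: 'Probe', 1: 'Genes'})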

This yields:

   Score       Probe     Genes
0     20  1415777_at  Pnliprp1
1     47  1415884_at    Cela3b
2     17  1415805_at      Clps
3     77  1115805_at      Ckkk
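
The result above puts Score first, whereas the question asked for Probe, Genes, Score. One possible variant is to build the split columns first and attach the scores afterwards. This is only a minimal sketch, assuming every index label contains exactly two whitespace-separated parts; the names `parts` and `result` are illustrative and do not come from the original answer.

```python
import io

import pandas as pd

test = u"""probegenes,sample1
1415777_at Pnliprp1,20
1415884_at Cela3b,47
1415805_at Clps,17
1115805_at Ckkk,77
"""
my_series = pd.read_csv(io.StringIO(test), index_col='probegenes')['sample1']

# Split each index label on whitespace into two columns (labelled 0 and 1).
parts = my_series.index.to_series().str.split(expand=True)
parts.columns = ['Probe', 'Genes']

# Attach the scores (aligned on the shared index), then drop the old index.
result = parts.assign(Score=my_series).reset_index(drop=True)
print(result)
```

Because `assign` aligns on the index, this relies on the index labels being unique, as they are in the sample data.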
2
df = pd.DataFrame([i.split(" ") for i in my_series.index], columns=['Probe', 'Genes']) 
df['Score'] = my_series.values 

>>> df 
        Probe     Genes  Score
0  1415777_at  Pnliprp1     20
1  1415884_at    Cela3b     47
2  1415805_at      Clps     17
3  1115805_at      Ckkk     77
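
The list comprehension above assumes each index label contains exactly one space. If a label could contain more, passing rows of different lengths with only two column names would likely raise an error. Limiting the split to the first space is one way around that. The sketch below uses made-up data (the label "Cela3b variant" is hypothetical and not from the question) purely to illustrate the idea.

```python
import pandas as pd

# Hypothetical data: the second label has an extra space in the gene part.
s = pd.Series(
    [20, 47],
    index=['1415777_at Pnliprp1', '1415884_at Cela3b variant'],
    name='sample1',
)

# A maxsplit of 1 splits only on the first space, so every label yields two pieces.
df = pd.DataFrame(
    [label.split(" ", 1) for label in s.index],
    columns=['Probe', 'Genes'],
)
df['Score'] = s.values
print(df)
```

With the sample data from the question this behaves the same as the original version, since each label there has a single space.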