How to apply a SciPy function on a pandas DataFrame

I have the following DataFrame:

import pandas as pd
import io
from scipy import stats

temp = u"""probegenes,sample1,sample2,sample3
1415777_at Pnliprp1,20,0.00,11
1415805_at Clps,17,0.00,55
1415884_at Cela3b,47,0.00,100"""
df = pd.read_csv(io.StringIO(temp), index_col='probegenes')
df

It looks like this:

                     sample1  sample2  sample3
probegenes
1415777_at Pnliprp1       20        0       11
1415805_at Clps           17        0       55
1415884_at Cela3b         47        0      100

What I want to do is perform a row-wise z-score calculation using SciPy. Using this code I get:

In [98]: stats.zscore(df, axis=1)
Out[98]:
array([[ 1.18195176, -1.26346568,  0.08151391],
       [-0.30444376, -1.04380717,  1.34825093],
       [-0.04896043, -1.19953047,  1.2484909 ]])

How can I conveniently attach the column and index names back to that result?

In the end, it should look like this:

                         sample1      sample2     sample3
probegenes
1415777_at Pnliprp1   1.18195176  -1.26346568  0.08151391
1415805_at Clps      -0.30444376  -1.04380717  1.34825093
1415884_at Cela3b    -0.04896043  -1.19953047  1.2484909

Source

2016-03-10 neversaint

Can't you just do 's = pd.DataFrame(stats.zscore(df, axis=1), index=df.index, columns=df.columns)'? – EdChum

The documentation for pd.DataFrame has:

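data : numpy ndarray (structured or homogeneous), dict, or DataFrame. Dict can contain Series, arrays, constants, or list-like objects.
index : Index or array-like. Index to use for the resulting frame. Will default to np.arange(n) if no indexing information is part of the input data and no index is provided.
columns : Index or array-like. Column labels to use for the resulting frame. Will default to np.arange(n) if no column labels are provided.

So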

pd.DataFrame(
    stats.zscore(df,axis=1), 
    index=df.index, 
    columns=df.columns)

should do the job.
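For completeness, here is a self-contained sketch of that approach using the data from the question. Passing `df.to_numpy()` rather than `df` to `stats.zscore` is my own choice, so the example does not depend on how a given SciPy version handles DataFrame input. The names `csv_text`, `z`, and `zdf` are only illustrative.

```python
import io

import pandas as pd
from scipy import stats

# Same sample data as in the question.
csv_text = """probegenes,sample1,sample2,sample3
1415777_at Pnliprp1,20,0.00,11
1415805_at Clps,17,0.00,55
1415884_at Cela3b,47,0.00,100"""
df = pd.read_csv(io.StringIO(csv_text), index_col="probegenes")

# Row-wise z-scores on the raw values; the result is an unlabeled ndarray.
z = stats.zscore(df.to_numpy(), axis=1)

# Reattach the original row and column labels.
zdf = pd.DataFrame(z, index=df.index, columns=df.columns)
print(zdf)
```

The result has the same index and columns as `df`, so it can be selected from by label like the original frame.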

Source

2016-03-10 09:41:51

You don't need SciPy. You can do this with a lambda function:

>>> df.apply(lambda row: (row - row.mean())/row.std(ddof=0), axis=1)
                      sample1   sample2   sample3
probegenes
1415777_at Pnliprp1  1.181952 -1.263466  0.081514
1415805_at Clps     -0.304444 -1.043807  1.348251
1415884_at Cela3b   -0.048960 -1.199530  1.248491
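Note that `ddof=0` matters here: pandas' `std` defaults to `ddof=1`, while `stats.zscore` defaults to `ddof=0`, so the explicit argument is what makes the numbers match the SciPy output. As a possible variation (not part of the answer above), the same calculation can be written without `apply` by broadcasting along the rows with `sub` and `div`. A minimal sketch, with `row_mean`, `row_std`, `z_vec`, and `z_apply` as illustrative names:

```python
import io

import numpy as np
import pandas as pd

# Same sample data as in the question.
csv_text = """probegenes,sample1,sample2,sample3
1415777_at Pnliprp1,20,0.00,11
1415805_at Clps,17,0.00,55
1415884_at Cela3b,47,0.00,100"""
df = pd.read_csv(io.StringIO(csv_text), index_col="probegenes")

# Per-row mean and population standard deviation (ddof=0, as in stats.zscore).
row_mean = df.mean(axis=1)
row_std = df.std(axis=1, ddof=0)

# axis=0 aligns the per-row Series with the DataFrame's index.
z_vec = df.sub(row_mean, axis=0).div(row_std, axis=0)

# The lambda version from the answer, for comparison.
z_apply = df.apply(lambda row: (row - row.mean()) / row.std(ddof=0), axis=1)

print(np.allclose(z_vec, z_apply))
```

Both versions keep the index and column labels, since everything stays inside pandas.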

Source

2016-03-10 09:58:17 Alexander
