2013-09-26 47 views
7
Loading high-dimensional R datasets into Pandas

Some R datasets can be loaded into a Pandas DataFrame or Panel quite easily:

import pandas.rpy.common as com 
infert = com.load_data('infert') 
print(infert.head()) 

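This appears to work as long as the dimension of the R dataset is <= 3. Higher-dimensional datasets print an error message:

In [67]: com.load_data('Titanic') 
Cannot handle dim=4 

This error message originates in the _convert_array function in rpy/common.py.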

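Sure, Pandas cannot directly fit a 4-dimensional array into a DataFrame or Panel, but is there a workaround for loading datasets like Titanic into a DataFrame (possibly with a hierarchical index)?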

+1

First 'melt' it in R, and then load it...? – joran

+0

@joran: Thanks, I think that works! – unutbu

Answers

1

With Pandas version 0.13.0 or newer, pandas.rpy.common.load_data can load higher-dimensional datasets such as Titanic:

import pandas.rpy.common as com 
df = com.load_data('Titanic') 
print(df.head()) 

yields

  Survived    Age     Sex Class  value
0       No  Child    Male   1st    0.0
1       No  Child    Male   2nd    0.0
2       No  Child    Male   3rd   35.0
3       No  Child    Male  Crew    0.0
4       No  Child  Female   1st    0.0
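
Since the question asked about a hierarchical index, the long-format frame above can be re-indexed by its four factor columns. This is a minimal sketch, assuming only the five rows printed above (rebuilt by hand so the snippet runs without R); the names df and s are purely illustrative.

```python
import pandas as pd

# Rebuild the five rows printed above by hand (no R needed for this sketch).
df = pd.DataFrame({
    'Survived': ['No'] * 5,
    'Age': ['Child'] * 5,
    'Sex': ['Male', 'Male', 'Male', 'Male', 'Female'],
    'Class': ['1st', '2nd', '3rd', 'Crew', '1st'],
    'value': [0.0, 0.0, 35.0, 0.0, 0.0],
})

# Move the four factor columns into a hierarchical (MultiIndex) index,
# leaving a Series of counts keyed by (Survived, Age, Sex, Class).
s = df.set_index(['Survived', 'Age', 'Sex', 'Class'])['value']

# Look up a single cell of the original 4-dimensional table.
print(s.loc[('No', 'Child', 'Male', '3rd')])
```

With the full dataset loaded, the same set_index call should apply unchanged, since it only depends on the column names shown in the output.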
7

Using @joran's very helpful suggestion, and after installing the reshape package with

% sudo R 
R> install.packages('reshape') 

I managed to load the Titanic dataset into a Pandas DataFrame with:

import pandas as pd 
import pandas.rpy.common as com 
import rpy2.robjects as ro 

r = ro.r 
r('library(reshape)') 
df = com.convert_robj(r('melt(Titanic)')) 
print(df.head()) 

which prints

  Class     Sex    Age Survived  value
1   1st    Male  Child       No      0
2   2nd    Male  Child       No      0
3   3rd    Male  Child       No     35
4  Crew    Male  Child       No      0
5   1st  Female  Child       No      0
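
For readers without R at hand, the effect of melt on a multi-dimensional table can be imitated with NumPy and pandas. The following is only a sketch on a made-up 2x2x2x2 array with hypothetical dimension labels. It is not the Titanic data, and it is not how reshape is implemented. It emits one row per cell, with the first dimension varying fastest, which matches the row order in the output above.

```python
import numpy as np
import pandas as pd

# Made-up 4-dimensional table of counts; values and labels are illustrative only.
arr = np.arange(16).reshape(2, 2, 2, 2)
dims = {
    'Class': ['1st', '2nd'],
    'Sex': ['Male', 'Female'],
    'Age': ['Child', 'Adult'],
    'Survived': ['No', 'Yes'],
}
names = list(dims)

# MultiIndex.from_product varies the LAST level fastest, so build the index
# with the dimensions reversed and flatten the transposed array in C order.
# The result walks the cells with the first dimension (Class) varying fastest,
# as in R's column-major layout.
index = pd.MultiIndex.from_product(
    [dims[n] for n in reversed(names)], names=list(reversed(names)))
melted = pd.Series(arr.T.ravel(), index=index, name='value').reset_index()

# Restore the original column order: Class, Sex, Age, Survived, value.
melted = melted[names + ['value']]
print(melted.head())
```

The reversed-then-transposed construction is one design choice among several; flattening with order='F' and building the labels by hand would give the same rows.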
+2

Glad that worked. FYI, **reshape** is the older version. It's probably worth using **reshape2** instead. – joran