2016-04-15 218 views
2

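Convert a Pandas DataFrame column to an R factor

I am trying to convert a column of a pandas DataFrame to a factor, because the function I am trying to call in R expects a factor.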

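from pandas import read_csv
from rpy2 import robjects as ro
from rpy2.robjects import pandas2ri
pandas2ri.activate()
# second column of labels has to be converted to factors
# (path_to_csv, another_df and package.function are placeholders)
labels = read_csv(path_to_csv)
as_factor = ro.r['as.factor']
output = package.function(another_df, as_factor(labels['column_name']))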

This is the error I get:

rpy2.rinterface.RRuntimeError: Error in sort.list(y) : 'x' must be atomic for 'sort.list' 
Have you called 'sort' on a list? 

What should I do?

A reproducible example follows:

import pandas as pd 

df = pd.DataFrame({'Col': [10, 20], 
        'x': ['Control', 'Low_Cav02']}) 

from rpy2 import robjects as ro 

from rpy2.robjects import pandas2ri 
pandas2ri.activate() 

as_factor = ro.r['as.factor'] 

labels = as_factor(df['Col']) 
print labels 

labels = as_factor(df['x']) 
print labels 

Output:

[1] 10 20 
Levels: 10 20 

/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: Error in sort.list(y) : 'x' must be atomic for 'sort.list' 
Have you called 'sort' on a list? 

    res = super(Function, self).__call__(*new_args, **new_kwargs) 
Traceback (most recent call last): 
    File "damet/analysis.py", line 26, in <module> 
    labels = as_factor(df['x']) 
    File "/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 178, in __call__ 
    return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs) 
    File "/Users/swetabh/Envs/damet/lib/python2.7/site-packages/rpy2/robjects/functions.py", line 106, in __call__ 
    res = super(Function, self).__call__(*new_args, **new_kwargs) 
rpy2.rinterface.RRuntimeError: Error in sort.list(y) : 'x' must be atomic for 'sort.list' 
Have you called 'sort' on a list? 
+0

Could you show a reproducible example that we can run, so that we can help you? –

+1

I don't know whether it will solve your problem, but the pandas equivalent of an R factor is the category dtype: `df["some_column"].astype("category")` – ayhan
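The pandas side of that suggestion can be sketched as below, using the DataFrame from the question. This only shows what the category dtype holds (distinct levels plus integer codes, much like an R factor). Whether the rpy2 converter then turns such a column into an R factor depends on the rpy2 version and is not verified here. The variable names `x_cat`, `levels` and `codes` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Col': [10, 20],
                   'x': ['Control', 'Low_Cav02']})

# Convert the string column to pandas' categorical dtype.
x_cat = df['x'].astype('category')

# Distinct values, comparable to levels() in R.
levels = x_cat.cat.categories.tolist()
# Integer codes into `levels`; 0-based here, whereas R factors are 1-based.
codes = x_cat.cat.codes.tolist()

print(levels)
print(codes)
```

The original column is left untouched; `astype` returns a new Series, so it can be assigned back with `df['x'] = x_cat` if the categorical version should replace it.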

+0

@MathieuB Done. I hope this helps. – Swetabh

Answers

1

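This works fine on my end. Which version of rpy2 are you using?

Edit: the original answer follows - I misread the question

When building an R DataFrame, the default converter in rpy2 turns Python lists into R lists. If you want an R vector, use the vector constructors.

For your example, this could look like: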

df = ro.DataFrame({'Col': ro.vectors.IntVector([10, 20]), 
        'x': ro.vectors.StrVector(['Control', 'Low_Cav02'])}) 
+0

I get the following error when I do this: ValueError: If using all scalar values, you must pass an index – Swetabh

+0

Yes. I somehow managed to misread the question and wrote non-working code in the answer. I am editing the answer. – lgautier
