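pandas: convert a string column to an ordered Category?

I'm working with pandas for the first time. I have a column of survey responses that can take the values "Strongly agree", "Agree", "Neither", "Disagree" and "Strongly disagree".

This is the output of describe() and value_counts() for the column: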

count       4996
unique         5
top        Agree
freq        1745
dtype: object

Agree                1745
Strongly agree        926
Strongly disagree     918
Disagree              793
Neither               614
dtype: int64

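What I want to do is run a linear regression of this question against the overall score. However, I have a feeling that I should first convert the column to a Category variable, since it is inherently ordered. Is that correct? If so, how do I do it?

I have tried this: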

# The original post used pd.Categorical.from_array, which later pandas versions removed
df['EasyToUseQuestionFactor'] = pd.Categorical(df.EasyToUseQuestion)
print(df['EasyToUseQuestionFactor'])

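This produces output that looks roughly right, but the categories appear to be in the wrong order. Is there a way to specify the ordering? Do I even need to specify it?

This is the rest of my code at the moment: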

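import pandas as pd
from statsmodels.formula.api import ols  # assumption: ols comes from statsmodels' formula API
df = pd.read_csv('./data/responses.csv')
lm1 = ols('OverallScore ~ EasyToUseQuestion', df).fit()  # the post passed `data`, which is never defined
print(lm1.rsquared)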

Asked 2014-09-19 by Richard

See here for the upcoming full categorical support (it will be in 0.15.0, which is not yet released): http://pandas-docs.github.io/pandas-docs-travis/categorical.html – Jeff 2014-09-19 17:05:40
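As a minimal sketch of what the categorical support mentioned in the comment above allows, assuming pandas 0.15.0 or later: the order can be stated explicitly through `categories` together with `ordered=True`. The small `responses` Series is made-up illustrative data, not the asker's file, and the name `likert_order` is hypothetical.

```python
import pandas as pd

# Hypothetical sample; the real values would come from df.EasyToUseQuestion.
responses = pd.Series(
    ["Agree", "Strongly agree", "Neither", "Disagree", "Strongly disagree", "Agree"]
)

# Explicit order, from most negative to most positive.
likert_order = ["Strongly disagree", "Disagree", "Neither", "Agree", "Strongly agree"]

ordered = pd.Series(pd.Categorical(responses, categories=likert_order, ordered=True))

# The integer codes follow the declared order:
# 0 = "Strongly disagree", ..., 4 = "Strongly agree".
codes = ordered.cat.codes
print(ordered.cat.categories.tolist())
print(codes.tolist())
```

Without an explicit `categories` list, pandas sorts the labels lexically, which would explain categories that look out of order.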

Yes, you should convert it to categorical data. Mapping each response to a numeric score should do the trick:

# Keys are capitalised to match the labels shown in the value_counts() output
likert_scale = {'Strongly agree': 2, 'Agree': 1, 'Neither': 0, 'Disagree': -1, 'Strongly disagree': -2}
df['categorical_data'] = df.EasyToUseQuestion.apply(lambda x: likert_scale[x])

Answered 2014-09-19 17:42:01

Thanks! I had to use '.map' instead of '.apply', but otherwise this works. – Richard 2014-09-22 14:06:34
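The `.map` variant mentioned in this comment can be sketched as follows. The capitalised dictionary keys are an assumption based on the labels in the `value_counts()` output, and the sample Series is hypothetical. The comment does not say why `.apply` failed; one difference worth knowing is that `.map` with a dictionary yields NaN for values that are not keys, whereas indexing the dictionary inside a lambda raises `KeyError`.

```python
import pandas as pd

# Assumed label spelling, taken from the value_counts() output in the question.
likert_scale = {
    "Strongly agree": 2,
    "Agree": 1,
    "Neither": 0,
    "Disagree": -1,
    "Strongly disagree": -2,
}

# Hypothetical sample, including one missing response.
responses = pd.Series(["Agree", "Strongly disagree", None, "Neither"])

# Look each response up in the dictionary; unmatched values become NaN.
scores = responses.map(likert_scale)
print(scores)
```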
