2017-03-12 48 views

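Problem generating factors in R

I am trying to convert a categorical variable to a factor in R after reshaping my data from wide to long format with melt(). However, when I call the factor function and supply levels and labels, every value comes back as NA: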

Does anyone know why this happens?

law <- read.csv("lawyers_class_new.csv") 


library(reshape2) 
law <- melt(law, id.vars = c("Subj"), measure.vars = c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur")) 
law <- law[order(law$Subj),] 
law <- within(law, 
       Subj <- factor(Subj), 
       variable <- factor(variable) 
      ) 
law$variable<- ordered(law$variable,levels=c(1,2,3,4,5),labels=c("lawyer","neutral", 
    "engineer","neutral_urb","neutral_rur")) 


Output: 

law$variable 
    [1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>  <NA> <NA> <NA> <NA> 
[18] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[35] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[52] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[69] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[86] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[103] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[120] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 
[137] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 

Melted data frame:

**Subj Cond variable value** 
1   2  lawyer  3 
1   3  neutral  1 
1   1  engineer  3.5 
1   5  neutral_urb 3 
1   4  neutral_rur 3.5 
2   2  lawyer  1 
2   3  neutral  3.5 
2   1  engineer  4.5 
2   5  neutral_urb 2 
2   4  neutral_rur 3.5 

Original data frame:

Subj lawyer neutral engineer neutral_urb neutral_rur 
1   3  1  3.5   3   3.5 
2   1  3.5  4.5   2   3.5 

Please make a reproducible example. We don't have access to lawyers_class_new.csv. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example


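By the time of the second conversion, to an ordered factor, the levels are apparently not `1:5`. The levels argument should be *what the factor levels currently appear as*; labels are optional and only needed when you want to change the levels to something else. – Gregor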
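To illustrate the point in this comment, here is a minimal sketch. The vector `x` is a hypothetical stand-in for `law$variable` after `melt()`; it is not taken from the asker's file.

```r
# Hypothetical stand-in for law$variable after melt(): the values are names, not 1..5
x <- c("lawyer", "neutral", "engineer", "neutral_urb", "neutral_rur")

# levels = 1:5 matches none of the strings, so every element becomes NA
bad <- factor(x, levels = 1:5, labels = x)
print(bad)

# levels must list the values that actually occur in the data
good <- factor(x, levels = c("lawyer", "neutral", "engineer",
                             "neutral_urb", "neutral_rur"))
print(good)
```

The `levels = 1:5, labels = ...` form would only be appropriate if the column held the integer codes 1 to 5 rather than the names.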


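Also, I don't know your goal, but many people mistakenly believe that an ordered factor is necessary to get things in a particular order (for plotting, say). That is not the case. The only reason to use an `ordered` factor is the contrasts used when modeling. – Gregor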
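A small sketch of the distinction made in this comment, assuming R's default `options("contrasts")`. The three-level vector is made up for illustration.

```r
lv <- c("lawyer", "neutral", "engineer")
vals <- c("neutral", "engineer", "lawyer")

f <- factor(vals, levels = lv)    # plain factor with an explicit level order
o <- ordered(vals, levels = lv)   # ordered factor with the same level order

# Both keep the order given in `levels`, which is what plots and tables follow
print(levels(f))
print(levels(o))

# The default contrasts used by modelling functions differ
print(contrasts(f))   # treatment (dummy) contrasts
print(contrasts(o))   # polynomial contrasts (.L, .Q)
```

If the aim is only to control display order, setting `levels` on a plain factor should be enough.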

Answer


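To minimize errors, I would also avoid importing character columns as factors, and it seems that within() does not create a proper factor for law$variable. I would therefore specify the factors like this to guarantee the correct order.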

law <- read.table(text="Subj Cond variable value 
1   2  lawyer  3 
1   3  neutral  1 
1   1  engineer  3.5 
1   5  neutral_urb 3 
1   4  neutral_rur 3.5 
2   2  lawyer  1 
2   3  neutral  3.5 
2   1  engineer  4.5 
2   5  neutral_urb 2 
2   4  neutral_rur 3.5", header=TRUE, stringsAsFactors=FALSE) 

law <- law[order(law$Subj),] 

law$Subj <- as.factor(law$Subj) 
law$variable <- factor(law$variable,levels =c("lawyer","neutral", 
    "engineer","neutral_urb","neutral_rur")) 

str(law) 
'data.frame': 10 obs. of 4 variables: 
$ Subj : Factor w/ 2 levels "1","2": 1 1 1 1 1 2 2 2 2 2 
$ Cond : int 2 3 1 5 4 2 3 1 5 4 
$ variable: Factor w/ 5 levels "lawyer","neutral",..: 1 2 3 4 5 1 2 3 4 5 
$ value : num 3 1 3.5 3 3.5 1 3.5 4.5 2 3.5
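If you prefer to keep `within()`, note that it takes a single expression, so several assignments need to be wrapped in braces rather than separated by commas. This is a minimal sketch under that assumption; the four-row data frame is a cut-down stand-in built from values in the question, not the full data set.

```r
# Cut-down stand-in for the melted data (values copied from the question)
law <- data.frame(
  Subj = c(1, 1, 2, 2),
  variable = c("lawyer", "neutral", "lawyer", "neutral"),
  value = c(3, 1, 1, 3.5),
  stringsAsFactors = FALSE
)

# All assignments go inside one braced expression
law <- within(law, {
  Subj <- factor(Subj)
  variable <- factor(variable, levels = c("lawyer", "neutral"))
})

str(law)
```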