2013-12-21

How do I convert factors containing decimal points into numeric values?

I have a data set that looks like this:

> str(gdp) 
'data.frame': 64 obs. of 31 variables: 
$ 1 : Factor w/ 62 levels "","1,145.31",..: 1 1 1 53 16 20 22 24 30 32 ... 
$ 2 : Factor w/ 64 levels "1,121.93","1,264.63",..: 42 59 10 13 18 16 17 23 25 35 ... 
$ 3 : Factor w/ 62 levels "","1,072.07",..: 1 1 1 35 36 39 41 42 45 51 ... 
$ 4 : Factor w/ 62 levels "","1,076.03",..: 1 1 1 15 16 21 23 26 27 36 ... 
$ 5 : Factor w/ 62 levels "","1,023.09",..: 1 1 1 11 15 19 17 23 21 27 ... 
$ 6 : Factor w/ 62 levels "","1,003.81",..: 1 1 1 40 45 46 47 52 56 7 ... 
$ 7 : Factor w/ 62 levels "","1,137.23",..: 1 1 1 13 15 19 21 23 24 28 ... 
$ 8 : Factor w/ 62 levels "","1,198.30",..: 1 1 1 26 31 34 35 39 40 47 ... 
$ 9 : Factor w/ 64 levels "1,114.32","1,519.23",..: 27 30 36 41 49 51 50 54 56 64 ... 
$ 10: Factor w/ 62 levels "","1,208.85",..: 1 1 1 35 39 40 42 45 46 53 ... 
$ 11: Factor w/ 64 levels "","1,089.33",..: 1 11 17 20 23 24 26 29 31 37 ... 
$ 12: Factor w/ 62 levels "","1,037.14",..: 1 1 1 22 23 25 31 30 36 41 ... 
$ 13: Factor w/ 63 levels "","1,114.20",..: 1 63 1 8 11 12 14 20 22 27 ... 
$ 14: Factor w/ 64 levels "1,169.73","1,409.74",..: 63 12 14 16 17 22 24 25 28 30 ... 
$ 15: Factor w/ 62 levels "","1,117.66",..: 1 1 1 33 35 39 40 44 43 53 ... 
$ 16: Factor w/ 63 levels "","1,045.73",..: 21 1 1 30 35 38 41 42 47 50 ... 
$ 17: Factor w/ 62 levels "","1,088.39",..: 1 1 1 24 32 26 34 38 40 48 ... 
$ 18: Factor w/ 62 levels "","1,244.71",..: 1 1 1 24 30 31 33 34 38 44 ... 
$ 19: Factor w/ 62 levels "","1,155.37",..: 1 1 1 25 34 37 38 41 44 48 ... 
$ 20: Factor w/ 64 levels "","1,198.29",..: 1 63 8 11 15 17 18 20 26 30 ... 
$ 21: Factor w/ 36 levels "","1,065.67",..: 1 1 1 1 1 1 1 1 1 1 ... 
$ 22: Factor w/ 64 levels "1,123.06","1,315.12",..: 12 14 15 17 22 23 24 26 27 40 ... 
$ 23: Factor w/ 62 levels "","1,016.31",..: 1 1 1 22 25 31 33 38 43 49 ... 
$ 24: Factor w/ 64 levels "1,029.92","1,133.27",..: 52 53 57 60 6 8 9 12 13 22 ... 
$ 25: Factor w/ 64 levels "1,222.15","1,517.69",..: 60 62 7 8 12 14 15 21 22 25 ... 
$ 26: num NA NA 1.29 1.32 1.36 1.39 1.43 1.62 1.56 1.72 ... 
$ 27: Factor w/ 62 levels "","1,036.85",..: 1 1 1 12 16 21 22 27 25 33 ... 
$ 28: Factor w/ 61 levels "","1,052.88",..: 1 1 1 12 13 17 18 24 23 26 ... 
$ 29: Factor w/ 64 levels "1,018.62","1,081.27",..: 6 7 8 9 10 26 27 34 35 43 ... 
$ 30: Factor w/ 62 levels "","1,203.92",..: 1 1 1 6 5 21 22 23 24 32 ... 
$ 31: Factor w/ 62 levels "","1,039.85",..: 1 1 1 57 59 9 11 13 14 16 ... 

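I want to keep all the information (the decimal values) held in the factor vectors and convert every vector to numeric. So far I have tried converting the vectors to character and then to numeric, as suggested elsewhere on SO, but I get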

> gdp<-data.frame(lapply(gdp,as.character)) 
> gdp<-data.frame(lapply(gdp,as.numeric)) 
> str(gdp) 
'data.frame': 64 obs. of 31 variables: 
$ X1 : num 1 1 1 53 16 20 22 24 30 32 ... 
$ X2 : num 42 59 10 13 18 16 17 23 25 35 ... 
$ X3 : num 1 1 1 35 36 39 41 42 45 51 ... 
$ X4 : num 1 1 1 15 16 21 23 26 27 36 ... 
$ X5 : num 1 1 1 11 15 19 17 23 21 27 ... 
$ X6 : num 1 1 1 40 45 46 47 52 56 7 ... 
$ X7 : num 1 1 1 13 15 19 21 23 24 28 ... 
$ X8 : num 1 1 1 26 31 34 35 39 40 47 ... 
$ X9 : num 27 30 36 41 49 51 50 54 56 64 ... 
$ X10: num 1 1 1 35 39 40 42 45 46 53 ... 
$ X11: num 1 11 17 20 23 24 26 29 31 37 ... 
$ X12: num 1 1 1 22 23 25 31 30 36 41 ... 
$ X13: num 1 63 1 8 11 12 14 20 22 27 ... 
$ X14: num 63 12 14 16 17 22 24 25 28 30 ... 
$ X15: num 1 1 1 33 35 39 40 44 43 53 ... 
$ X16: num 21 1 1 30 35 38 41 42 47 50 ... 
$ X17: num 1 1 1 24 32 26 34 38 40 48 ... 
$ X18: num 1 1 1 24 30 31 33 34 38 44 ... 
$ X19: num 1 1 1 25 34 37 38 41 44 48 ... 
$ X20: num 1 63 8 11 15 17 18 20 26 30 ... 
$ X21: num 1 1 1 1 1 1 1 1 1 1 ... 
$ X22: num 12 14 15 17 22 23 24 26 27 40 ... 
$ X23: num 1 1 1 22 25 31 33 38 43 49 ... 
$ X24: num 52 53 57 60 6 8 9 12 13 22 ... 
$ X25: num 60 62 7 8 12 14 15 21 22 25 ... 
$ X26: num NA NA 1 2 3 4 5 7 6 8 ... 
$ X27: num 1 1 1 12 16 21 22 27 25 33 ... 
$ X28: num 1 1 1 12 13 17 18 24 23 26 ... 
$ X29: num 6 7 8 9 10 26 27 34 35 43 ... 
$ X30: num 1 1 1 6 5 21 22 23 24 32 ... 
$ X31: num 1 1 1 57 59 9 11 13 14 16 ... 

This does not keep the decimal values, and the blank entries are not filled in with NA. I also tried

> gdp<-as.numeric(levels(gdp))[gdp] 
Error in as.numeric(levels(gdp))[gdp] : invalid subscript type 'list' 

Is there a way to convert these vectors to numeric?


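Thank you very much, I did not know the problem was caused by the comma separators. However, when I try as.numeric, I get the error invalid subscript type 'list'. – song0089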

Answer


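Let's break this down.

First, because gdp is a data frame, levels(gdp) returns NULL. What you probably want is the output of levels for each column of gdp, and for that you can use something like lapply.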

levels(gdp) 
# NULL 
lapply(gdp, levels) 
# this output will make sense 
as.numeric(levels(gdp))[gdp] 
# this will make no sense 

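The error is telling you that you cannot use a list (gdp) to subscript a vector.

To iterate over the columns of gdp, you need something like lapply to process each component in turn.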
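As a rough illustration of the difference, here is a small sketch. The factor `f` and data frame `df` are made-up stand-ins for one column of the real data, not part of the original question.

```r
# Hypothetical toy data standing in for a single column of gdp
f  <- factor(c("", "1,145.31", "987.60", "1,145.31"))
df <- data.frame(a = f)

levels(df)    # NULL: the data frame itself carries no levels
levels(df$a)  # the levels of the one factor column

# Subscripting the converted levels by a single factor works:
# strip the commas from the levels, convert, then map back via the factor codes
vals <- as.numeric(gsub(",", "", levels(f), fixed = TRUE))[f]

# Subscripting by the whole data frame (a list) raises the error from the question;
# tryCatch captures the message instead of stopping the script
res <- tryCatch(as.numeric(levels(df))[df],
                error = function(e) conditionMessage(e))
```

Here the empty string level becomes NA in `vals`, while `res` holds the error message text.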

gdp <- data.frame(lapply(gdp, function(x) { 
    if(!is.factor(x)) x 
    else as.numeric(gsub(",","",levels(x),fixed=TRUE))[x] 
})) 
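To see what this does, the same function can be tried on a tiny made-up data frame. The name `toy` and its values are assumptions for illustration only, shaped like the str(gdp) output in the question.

```r
# Hypothetical miniature of gdp: two comma-formatted factor columns and one numeric column
toy <- data.frame(
  a = factor(c("", "1,145.31", "987.60")),
  b = factor(c("1,121.93", "2,264.63", "512.00")),
  c = c(NA, 1.29, 1.32)
)

toy_num <- data.frame(lapply(toy, function(x) {
  if (!is.factor(x)) return(x)  # leave columns that are already numeric untouched
  # remove the thousands separators from the levels, convert them to numeric,
  # then index by the factor so each row gets its own value back
  as.numeric(gsub(",", "", levels(x), fixed = TRUE))[x]
}))

str(toy_num)
```

Every column of `toy_num` should now be numeric, with the blank entry in `a` turned into NA and the decimals preserved.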

Your data set would probably work better as a matrix, since every column appears to be numeric. In that case:

gdp <- as.matrix(gdp) 
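
A brief sketch of that last step on made-up data (the name `toy_num` and its values are hypothetical). Note that as.matrix only yields a numeric matrix when every column is already numeric; if any factor or character column remains, the result is a character matrix, so the conversion above should come first.

```r
# Hypothetical all-numeric data frame, as produced by the conversion step
toy_num <- data.frame(a = c(NA, 1145.31, 987.60),
                      c = c(NA, 1.29, 1.32))

m <- as.matrix(toy_num)

is.numeric(m)              # check the matrix type before doing arithmetic on it
dim(m)                     # rows and columns are preserved
colMeans(m, na.rm = TRUE)  # matrix functions can now be applied directly
```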

That worked. Thank you very much! – song0089