2011-09-04 23 views
6

How do I convert a factor column containing decimal numbers to numeric?

I have a CSV file, and when I read it with this command

SOLK<-read.table('Book1.csv',header=TRUE,sep=';') 

I get this output:

> SOLK 
      Time Close Volume 
1 10:27:03,6 0,99 1000 
2 10:32:58,4 0,98 100 
3 10:34:16,9 0,98 600 
4 10:35:46,0 0,97 500 
5 10:35:50,6 0,96  50 
6 10:35:50,6 0,96 1000 
7 10:36:10,3 0,95  40 
8 10:36:10,3 0,95 100 
9 10:36:10,4 0,95 500 
10 10:36:10,4 0,95 100 
.  .   .  . 
.  .   .  . 
.  .   .  . 
285 17:09:44,0 0,96 404 

str(SOLK) gives this:

'data.frame': 285 obs. of 3 variables: 
$ Time : Factor w/ 174 levels "10:27:03,6","10:32:58,4",..: 1 2 3 4 5 5 6 6 7 7 ... 
$ Close : Factor w/ 8 levels "0,92","0,93",..: 8 7 7 6 5 5 4 4 4 4 ... 
$ Volume: int 1000 100 600 500 50 1000 40 100 500 100 ... 

and here is dput(SOLK[1:10,]):

structure(list(Time = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 
6L, 7L, 7L), .Label = c("10:27:03,6", "10:32:58,4", "10:34:16,9", 
"10:35:46,0", "10:35:50,6", "10:36:10,3", "10:36:10,4", "10:36:30,8", 
"10:37:23,3", "10:37:38,2", "10:37:39,3", "10:37:45,9", "10:39:07,5", 
"10:39:07,6", "10:39:46,6", "10:41:21,8", "10:43:20,6", "10:43:36,4", 
"10:43:48,8", "10:43:48,9", "10:43:54,6", "10:44:01,5", "10:44:08,4", 
"10:45:47,2", "10:46:16,7", "10:47:03,6", "10:47:48,6", "10:47:55,0", 
"10:48:09,9", "10:48:30,6", "10:49:20,6", "10:50:31,9", "10:50:34,6", 
"10:50:38,1", "10:51:02,8", "10:51:11,5", "10:55:57,7", "10:57:57,2", 
"10:59:06,9", "10:59:33,5", "11:00:31,0", "11:00:31,1", "11:04:46,4", 
"11:04:53,4", "11:04:54,6", "11:04:56,1", "11:04:58,9", "11:05:02,0", 
"11:05:02,6", "11:05:24,7", "11:05:56,7", "11:06:15,8", "11:13:24,1", 
"11:13:24,2", "11:13:32,1", "11:13:36,2", "11:13:37,2", "11:13:44,5", 
"11:13:46,8", "11:14:12,7", "11:14:19,4", "11:14:19,8", "11:14:21,2", 
"11:14:38,7", "11:14:44,0", "11:14:44,5", "11:15:10,5", "11:15:10,6", 
"11:15:12,9", "11:15:16,6", "11:15:23,3", "11:15:31,4", "11:15:36,4", 
"11:15:37,4", "11:15:49,5", "11:16:01,4", "11:16:06,0", "11:17:56,2", 
"11:19:08,1", "11:20:17,2", "11:26:39,4", "11:26:53,2", "11:27:39,5", 
"11:28:33,0", "11:30:42,3", "11:31:00,7", "11:33:44,2", "11:39:56,1", 
"11:40:07,3", "11:41:02,1", "11:41:30,1", "11:45:07,0", "11:45:26,6", 
"11:49:50,8", "11:59:58,1", "12:03:49,9", "12:04:12,6", "12:06:05,8", 
"12:06:49,2", "12:07:56,0", "12:09:37,7", "12:14:25,5", "12:14:32,1", 
"12:15:42,1", "12:15:55,2", "12:16:36,9", "12:16:44,2", "12:18:00,3", 
"12:18:12,8", "12:28:17,8", "12:28:17,9", "12:28:23,7", "12:28:51,1", 
"12:36:33,2", "12:37:45,0", "12:39:22,2", "12:40:19,5", "12:42:22,1", 
"12:58:46,3", "13:06:05,8", "13:06:05,9", "13:07:17,6", "13:07:17,7", 
"13:09:01,3", "13:09:01,4", "13:09:11,3", "13:09:31,0", "13:10:07,8", 
"13:35:43,8", "13:38:27,7", "14:11:16,0", "14:17:31,5", "14:26:13,9", 
"14:36:11,8", "14:38:43,7", "14:38:47,8", "14:38:51,8", "14:48:26,7", 
"14:52:07,4", "14:52:13,8", "15:09:24,7", "15:10:25,8", "15:29:12,1", 
"15:31:55,9", "15:34:04,1", "15:44:10,8", "15:45:07,1", "15:57:04,9", 
"15:57:13,9", "16:16:27,9", "16:21:41,7", "16:36:01,5", "16:36:13,2", 
"16:46:10,5", "16:46:10,6", "16:47:37,3", "16:50:52,4", "16:50:52,5", 
"16:51:44,5", "16:55:11,5", "16:56:21,8", "16:56:37,5", "16:57:37,9", 
"16:58:18,6", "16:58:44,5", "17:00:39,1", "17:01:50,7", "17:03:13,2", 
"17:03:28,3", "17:03:46,7", "17:03:47,0", "17:04:30,4", "17:08:41,8", 
"17:09:44,0"), class = "factor"), Close = structure(c(8L, 7L, 
7L, 6L, 5L, 5L, 4L, 4L, 4L, 4L), .Label = c("0,92", "0,93", "0,94", 
"0,95", "0,96", "0,97", "0,98", "0,99"), class = "factor"), Volume = c(1000L, 
100L, 600L, 500L, 50L, 1000L, 40L, 100L, 500L, 100L)), .Names = c("Time", 
"Close", "Volume"), row.names = c(NA, 10L), class = "data.frame") 

How can I convert the factor column SOLK$Close to numeric?

+0

Possible duplicate of [How can I convert a dataframe with a factor column to a xts object?](http://stackoverflow.com/questions/7288045/how-can-i-convert-a-dataframe-with-a-factor-column-to-a-xts-object)

Answers

7
as.numeric(as.character(SOLK$Close)) 

This is in the R FAQ, 7.10.
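The detour through as.character is needed because a factor stores integer level codes, and as.numeric applied directly returns those codes rather than the numbers the labels spell out. A minimal sketch, using a made-up factor `f` with dot decimals (an assumption for illustration, not the asker's data) so that the conversion succeeds:

```r
# Hypothetical factor for illustration; not the asker's data.
f <- factor(c("0.99", "0.98", "0.98", "0.97"))

codes  <- as.numeric(f)                 # internal level codes, not the values
values <- as.numeric(as.character(f))   # the numbers the labels spell out

# Equivalent form that converts each level only once:
values2 <- as.numeric(levels(f))[f]

print(codes)
print(values)
```

With comma decimals, as in the question, both forms return NA until the comma is replaced by a period.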

+0

Here is the output of 'as.numeric(as.character(SOLK$Close))': – G0dAreS

+1

'[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ...... [NA] NA NA NA NA NA NA NA NA NA Warning message: NAs introduced by coercion' – G0dAreS

+4

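See Karl B's answer below: you need to change the "," to "." unless your R setup is in a European locale, which your error message suggests is not the case.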

8

I think your numbers use a comma rather than a period as the decimal mark, so you can call read.table with dec=",".
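A minimal sketch of that call. To keep it self-contained it reads an inline string, `csv_text`, which is a stand-in for 'Book1.csv' and assumes the file is laid out as the question's output suggests (semicolon-separated fields, comma as decimal mark):

```r
# `csv_text` is a hypothetical stand-in for the contents of 'Book1.csv'.
csv_text <- "Time;Close;Volume
10:27:03,6;0,99;1000
10:32:58,4;0,98;100"

# dec = "," tells read.table to treat the comma as the decimal mark.
SOLK_demo <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",")

str(SOLK_demo)  # Close should now be numeric rather than a factor
```

For a real file, replace `text = csv_text` with the file name. read.csv2 uses sep=";" and dec="," by default, so it may also fit this layout. The Time column is not a plain number and still needs separate handling.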

+0

Thanks for the observation about the decimal separator – G0dAreS

13
as.numeric(as.character(sub("," , ".", SOLK$Close))) 

I notice (after you posted a better example) that you may need to do some conversion of the "Time" values as well:

> SOLK$Close.n <- as.numeric(sub("," , ".", SOLK$Close)) 
> head(SOLK) 
     Time Close Volume Close.n 
1 10:27:03,6 0,99 1000 0.99 
2 10:32:58,4 0,98 100 0.98 
3 10:34:16,9 0,98 600 0.98 
4 10:35:46,0 0,97 500 0.97 
5 10:35:50,6 0,96  50 0.96 
6 10:35:50,6 0,96 1000 0.96 

Since Time is also a factor, you would gain some generality by converting it too. Perhaps:

# %OS (rather than %S) keeps the fractional seconds when parsing
SOLK$Time.n <- as.POSIXct(sub("," , ".", SOLK$Time), format="%H:%M:%OS") 
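A small sketch of the time conversion on made-up strings in the question's format. The vector `tm` and the choice of tz = "UTC" are assumptions for illustration. On input, %OS reads seconds including the fractional part, whereas %S reads whole seconds only:

```r
# Hypothetical time strings in the question's format.
tm <- c("10:27:03,6", "10:27:05,1")

# No date is given, so as.POSIXct fills in the current date.
parsed_tm <- as.POSIXct(sub(",", ".", tm, fixed = TRUE),
                        format = "%H:%M:%OS", tz = "UTC")

# Elapsed time between the two stamps, in seconds (fraction preserved).
gap <- as.numeric(difftime(parsed_tm[2], parsed_tm[1], units = "secs"))
print(gap)
```

Printing a POSIXct value hides the fraction by default; setting options(digits.secs = 1) makes it visible.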
+1

Is the 'as.character' around 'sub' needed? 'sub' already returns character – Marek

+0

@Marek: Quite so. Trimmed my code.

+0

This solution does not handle potential thousands separators, where '"."' and '","' swap roles – altabq
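One way to handle that case, sketched on made-up strings: the vector `x` is hypothetical, and the sketch assumes "." is used only for grouping thousands and "," only as the decimal mark. Remove every period first, then turn the comma into a decimal point:

```r
# Hypothetical European-style numbers: "." groups thousands, "," marks decimals.
x <- c("1.234,56", "0,99", "12.345.678,9")

# fixed = TRUE makes "." a literal period instead of a regex wildcard.
no_groups <- gsub(".", "", x, fixed = TRUE)                       # drop grouping
parsed    <- as.numeric(sub(",", ".", no_groups, fixed = TRUE))   # comma -> point

print(parsed)
```

The order matters: replacing the comma first would leave the new decimal point indistinguishable from the thousands separators.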

0
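SOLK<-read.table('Book1.csv',header=TRUE,sep=';', colClasses = "character") 
# position: index or name of the column to convert, e.g. "Close"
# the decimal comma must become a period, or as.numeric returns NA
SOLK[, position] <- as.numeric(sub(",", ".", SOLK[, position], fixed = TRUE)) 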