2015-08-31 258 views
2

Reading a CSV file in R

I am new to R and want to read a CSV file, but I get an error when I try. My CSV file looks like this:

,Zbot,Sirefef,Fareit,Winwebsec,FakeSysdef,Winwebsec,Winwebsec,Winwebsec,Fareit,Fareit,Sirefef,Winwebsec,Winwebsec,Winwebsec,Winwebsec 
Zbot,0,134,45,651,182,245,986,64,63,34,134,166,52,337,225 
Sirefef,142,0,124,679,200,273,1018,156,125,122,164,198,120,371,257 
Fareit,48,124,0,644,166,234,978,82,64,51,135,167,49,338,224 
Winwebsec,651,499,470,0,575,556,1087,525,490,485,501,511,483,600,582 
FakeSysdef,178,172,143,535,0,311,1052,196,163,152,204,234,154,405,285 
Winwebsec,245,199,168,478,229,0,997,217,186,183,199,209,183,348,272 
Winwebsec,986,752,719,821,784,727,0,774,739,734,750,760,734,851,829 
Winwebsec,80,160,85,506,179,204,757,0,100,85,173,205,89,376,264 
Fareit,65,95,32,468,141,164,715,78,0,57,135,165,59,336,226 
Fareit,52,122,51,468,143,166,717,68,40,0,135,163,49,336,224 
Sirefef,136,118,85,449,150,147,696,123,83,83,0,146,100,317,207 
Winwebsec,164,138,105,449,170,145,696,143,103,103,80,0,118,315,215 
Winwebsec,52,116,51,466,143,166,717,66,42,32,83,103,0,336,226 
Winwebsec,335,267,234,496,301,246,745,272,232,234,213,213,234,0,346 
Winwebsec,225,204,171,519,228,217,774,207,169,169,150,160,169,291,0 

When I run this command in RStudio, I get an error. Command:

> tb = read.csv("/home/hossein/Documents/LiClipse Workspace/test.csv", row.names = 1); 

Error:

Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed

I also tried to get rid of the error by using this command instead:

> tb = read.csv("/home/hossein/Documents/LiClipse Workspace/test.csv", row.names = NULL); 

However, when I look at the output, it no longer has the structure of a square matrix. Can you help me? What should I do?

+1

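First, the first line of the dataset starts with a comma. Second, why not try RStudio's GUI feature for loading datasets? If you do, you may have a better chance of spotting why your data is not loading correctly. – Abdou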

Answers

0

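The first line begins with an empty field, so the skip = 1 argument might come in handy, perhaps because read.csv cannot interpret the input as a rectangular array.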
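To see how read.csv treats the empty leading header field, here is a minimal sketch on a made-up 3x3 excerpt passed through the text argument. The excerpt and the variable names (csv_text, d, d_skip) are illustrative assumptions, not the asker's file. It suggests that the empty field is simply named X, and that skipping the header line leaves the labels in the first column either way.

```r
# Hypothetical 3x3 excerpt of a labelled matrix (not the asker's file).
csv_text <- ",Zbot,Sirefef,Fareit
Zbot,0,134,45
Sirefef,142,0,124
Fareit,48,124,0"

# Default call: the empty header field becomes a column called "X".
d <- read.csv(text = csv_text)
names(d)  # "X" "Zbot" "Sirefef" "Fareit"
dim(d)    # 3 rows, 4 columns (labels + 3 numeric columns)

# Skipping the header line: columns get default names V1..V4,
# and the row labels are still sitting in the first column.
d_skip <- read.csv(text = csv_text, skip = 1, header = FALSE)
names(d_skip)  # "V1" "V2" "V3" "V4"
```

In both cases the label column still has to be moved into the row names before the numeric part is square.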

1

You can do this in a few simple steps, as follows:

d = read.csv('path/to/test.csv') # import the data 
row.names(d) = make.unique(as.character(d[, 1])) # create the row names from the first column 
d = d[, -1] # remove the first column now that you don't need it anymore 

This keeps your square matrix:

dim(d) # still a 15x15 matrix 
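The same steps can be tried without any file on disk. The following is a minimal sketch on a made-up 3x3 excerpt with a duplicated label; csv_text and labels are hypothetical names introduced only for this illustration.

```r
# Hypothetical excerpt with the label "Winwebsec" appearing twice.
csv_text <- ",Zbot,Winwebsec,Winwebsec
Zbot,0,651,245
Winwebsec,651,0,556
Winwebsec,245,478,0"

d <- read.csv(text = csv_text)

# make.unique() appends .1, .2, ... to repeated labels.
labels <- make.unique(as.character(d[, 1]))
labels  # "Zbot" "Winwebsec" "Winwebsec.1"

row.names(d) <- labels  # now unique, so no error
d <- d[, -1]            # drop the label column

dim(d)  # 3 3
```

Because read.csv also de-duplicates the header by default (check.names = TRUE), the row and column names end up matching in this example.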
+0

I did exactly what you said, but I get different dimensions: [1] 15 16 – Alex

+1

I just checked it, and I still get 15x15 with the csv file copied from above... – CephBirk
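The thread does not say why the two results differ. One possible cause, offered here only as an assumption, is an extra trailing separator on each line of the local file, which read.csv turns into an additional all-NA column. A minimal sketch for checking this with count.fields, using two made-up temporary files (ok_file and bad_file are hypothetical names):

```r
# Two hypothetical files: one clean, one with a trailing comma per line.
ok_file  <- tempfile(fileext = ".csv")
bad_file <- tempfile(fileext = ".csv")
writeLines(c(",A,B",  "A,0,1",  "B,1,0"),  ok_file)
writeLines(c(",A,B,", "A,0,1,", "B,1,0,"), bad_file)

# count.fields() reports the number of fields found on each line.
fields_ok  <- count.fields(ok_file,  sep = ",")
fields_bad <- count.fields(bad_file, sep = ",")

d_ok  <- read.csv(ok_file)
d_bad <- read.csv(bad_file)
ncol(d_bad) - ncol(d_ok)  # one extra column in the trailing-comma file

# Dropping columns that contain only NA restores the expected width.
d_fixed <- d_bad[, colSums(!is.na(d_bad)) > 0, drop = FALSE]

unlink(c(ok_file, bad_file))
```

If every line of the real file reports one field more than expected, the extra column is a likely explanation for a 15 x 16 result.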

1

R does not support duplicate row names; see ?row.names:

All data frames have a row names attribute, a character vector of length the number of rows with no duplicates nor missing values.

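Given this, you can import your data and then make the row names unique with the function make.names. It is a bit ugly, but I think it solves your problem. See the example below, which uses the data you provided.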

table <- read.csv("data.csv", row.names = NULL) 

### Save row names in a separate object:
rows <- table$X 

### Remove the column with row.names:
table <- table[,-1] 

### Create unique Row.names with the function `make.names` 
rownames(table) <- make.names(rows, unique=TRUE) 

### Check results: 
dim(table) 
#> [1] 15 15 
head(table) 
#>             Zbot Sirefef Fareit Winwebsec FakeSysdef Winwebsec.1
#> Zbot           0     134     45       651        182         245
#> Sirefef      142       0    124       679        200         273
#> Fareit        48     124      0       644        166         234
#> Winwebsec    651     499    470         0        575         556
#> FakeSysdef   178     172    143       535          0         311
#> Winwebsec.1  245     199    168       478        229           0
#>             Winwebsec.2 Winwebsec.3 Fareit.1 Fareit.2 Sirefef.1
#> Zbot                986          64       63       34       134
#> Sirefef            1018         156      125      122       164
#> Fareit              978          82       64       51       135
#> Winwebsec          1087         525      490      485       501
#> FakeSysdef         1052         196      163      152       204
#> Winwebsec.1         997         217      186      183       199
#>             Winwebsec.4 Winwebsec.5 Winwebsec.6 Winwebsec.7
#> Zbot                166          52         337         225
#> Sirefef             198         120         371         257
#> Fareit              167          49         338         224
#> Winwebsec           511         483         600         582
#> FakeSysdef          234         154         405         285
#> Winwebsec.1         209         183         348         272
as.dist(table) 
#>             Zbot Sirefef Fareit Winwebsec FakeSysdef Winwebsec.1
#> Sirefef      142
#> Fareit        48     124
#> Winwebsec    651     499    470
#> FakeSysdef   178     172    143       535
#> Winwebsec.1  245     199    168       478        229
#> Winwebsec.2  986     752    719       821        784         727
#> Winwebsec.3   80     160     85       506        179         204
#> Fareit.1      65      95     32       468        141         164
#> Fareit.2      52     122     51       468        143         166
#> Sirefef.1    136     118     85       449        150         147
#> Winwebsec.4  164     138    105       449        170         145
#> Winwebsec.5   52     116     51       466        143         166
#> Winwebsec.6  335     267    234       496        301         246
#> Winwebsec.7  225     204    171       519        228         217
#>             Winwebsec.2 Winwebsec.3 Fareit.1 Fareit.2 Sirefef.1
#> Sirefef
#> Fareit
#> Winwebsec
#> FakeSysdef
#> Winwebsec.1
#> Winwebsec.2
#> Winwebsec.3         757
#> Fareit.1            715          78
#> Fareit.2            717          68       40
#> Sirefef.1           696         123       83       83
#> Winwebsec.4         696         143      103      103        80
#> Winwebsec.5         717          66       42       32        83
#> Winwebsec.6         745         272      232      234       213
#> Winwebsec.7         774         207      169      169       150
#>             Winwebsec.4 Winwebsec.5 Winwebsec.6
#> Sirefef
#> Fareit
#> Winwebsec
#> FakeSysdef
#> Winwebsec.1
#> Winwebsec.2
#> Winwebsec.3
#> Fareit.1
#> Fareit.2
#> Sirefef.1
#> Winwebsec.4
#> Winwebsec.5         103
#> Winwebsec.6         213         234
#> Winwebsec.7         160         169         291
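If the repeated labels need to stay exactly as they are (for example, as class names), a possible alternative is to hold the numbers in a plain matrix, since the uniqueness requirement quoted above applies to data frames. A minimal sketch on a made-up 3x3 excerpt; csv_text, raw, labels, m and dd are hypothetical names used only here.

```r
# Hypothetical excerpt with the label "Winwebsec" appearing twice.
csv_text <- ",Zbot,Winwebsec,Winwebsec
Zbot,0,651,245
Winwebsec,651,0,556
Winwebsec,245,478,0"

raw <- read.csv(text = csv_text, row.names = NULL)

labels <- as.character(raw[[1]])     # original, possibly repeated labels
m <- as.matrix(raw[, -1])            # numeric part only
dimnames(m) <- list(labels, labels)  # matrices accept repeated dimnames

dd <- as.dist(m)    # uses only the lower triangle of m
attr(dd, "Labels")  # "Zbot" "Winwebsec" "Winwebsec"
```

Note that as.dist reads only the lower triangle, so for a non-symmetric input like the one in the question the values above the diagonal are ignored.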
+0

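Thanks for your answer, but the problem is that with as.dist(tb) I get this error: Warning messages: 1: In as.dist.default(tb) : NAs introduced by coercion 2: In as.dist.default(tb) : non-square matrix – Alex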
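Both warnings are consistent with passing as.dist a data frame that still contains the label column, as happens after read.csv(..., row.names = NULL): the labels cannot be converted to numbers, and the extra column makes the table one column wider than it is tall. A minimal sketch that reproduces this on a made-up excerpt (csv_text, tb and msgs are hypothetical names):

```r
csv_text <- ",Zbot,Sirefef,Fareit
Zbot,0,134,45
Sirefef,142,0,124
Fareit,48,124,0"

# Labels stay in the first column, so the table is 3 x 4, not 3 x 3.
tb <- read.csv(text = csv_text, row.names = NULL)
dim(tb)

# Collect the warnings raised by as.dist() instead of printing them.
msgs <- character(0)
invisible(withCallingHandlers(
  as.dist(tb),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
))
msgs

# Dropping the label column first gives a square, numeric input.
dd <- as.dist(tb[, -1])
```

Moving the labels out of the data (into row names or dimnames) before calling as.dist avoids both warnings.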

+0

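The problem is that duplicate row names are not allowed. See '?read.csv'. But you can "work around" this with 'make.names'. I will update the answer; let me know whether it works for you. Cheers –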

+0

Updated the answer... please see whether it helps. –