2013-04-02 396 views

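Why does R prepend "X." to the names of an imported data set?

I can't figure out why the header names get an "X." prefix when I import with quote = "". Here is the code: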

xhead = read.csv("~/Desktop/dbdump/users.txt", na.strings = "\\N", quote="", nrows = 1000) 

This gives me:

names(xhead) 
[1] "X.userId."    "X.fullName."   "X.email."    "X.password."   
[5] "X.activated."   "X.registrationDate." "X.locale."    ... 

Whereas:

yhead = read.csv("~/Desktop/dbdump/users.txt", na.strings = "\\N", nrows = 1000) 
names(yhead) 
[1] "userId"    "fullName"   "email"    "password"   
[5] "activated"   "registrationDate" "locale"   ... 

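The reason I'm using quote = "" is that my records were getting truncated, presumably because there is a stray quote character buried somewhere in my 15,000 records.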

Here is what my data file looks like:

"userId", "fullName","email","password","activated","registrationDate","locale","notifyOnUpdates","lastSyncTime","plan_id","plan_period_months","plan_price","plan_exp_date","plan_is_trial","plan_is_trial_used","q_hear","q_occupation","pp_subid","pp_payments","pp_since","pp_cancelled","apikey" 
"2","Adam Smith","[email protected]","*****","1","2004-07-23 14:19:32","en_US","1","2011-04-07 07:29:17","3",\N,\N,\N,"0","1",\N,\N,\N,\N,\N,\N,"d7734dce-4ae2-102a-8951-0040ca38ff83" 

Doesn't read.csv already imply header = TRUE when reading? – pitosalas


You're right. I've deleted my comment. –

Answers


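The column names are run through make.names before being returned. With quote = "", the quotation marks are no longer treated as quoting characters, so they stay in the header strings, and quotation marks are not valid characters in a column name. You can see the difference by running: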

make.names(c('"userId"', "fullName")) 
[1] "X.userId." "fullName" 

From the make.names help:

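A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. ... The character "X" is prepended if necessary. All invalid characters are translated to ".".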
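The effect can be reproduced on a small temporary file. The sketch below is only an illustration and not part of the original answer: the two-line file and the variable names are made up, and check.names = FALSE is shown as one possible way to inspect the raw header strings before make.names touches them.

```r
# Hypothetical two-line file mimicking the quoted header from the question
tmp <- tempfile(fileext = ".txt")
writeLines(c('"userId","fullName"',
             '"2","Adam Smith"'), tmp)

# Default quote: the quotation marks are stripped before make.names runs
default_names <- names(read.csv(tmp))

# quote = "": the quotation marks stay in the header strings, so make.names
# turns each one into "." and prepends "X"
literal_names <- names(read.csv(tmp, quote = ""))

# check.names = FALSE skips make.names, so the raw names (quotes included)
# survive and can be cleaned by hand
raw_names <- names(read.csv(tmp, quote = "", check.names = FALSE))
clean_names <- gsub('"', "", raw_names, fixed = TRUE)

print(default_names)
print(literal_names)
print(clean_names)
```

Printing the three vectors side by side shows that the prefix comes from the quotation marks in the header, not from the data itself.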

One suggestion is to call read.csv with the first line skipped and no header, so that it reads only the bulk of the data.

dd <- read.csv("~/Desktop/dbdump/users.txt", na.strings = "\\N", 
     quote="", nrows = 1000, header = FALSE, skip = 1) 

You can then use scan (which is what read.csv calls under the hood) to read the column names:

names(dd) <- scan("~/Desktop/dbdump/users.txt", what = character(), nlines=1,sep =',') 
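For anyone who wants to try the two steps without the original dump, here is a minimal self-contained sketch. The temporary file, its three columns, and the single data row are assumptions made up for illustration; only the read.csv and scan arguments come from the answer above (quiet = TRUE is added just to suppress scan's message).

```r
# Hypothetical miniature of users.txt: quoted header, one row, \N for missing
tmp <- tempfile(fileext = ".txt")
writeLines(c('"userId","fullName","plan_id"',
             '"2","Adam Smith",\\N'), tmp)

# Step 1: read the data only, skipping the header line
dd <- read.csv(tmp, na.strings = "\\N", quote = "",
               header = FALSE, skip = 1)

# Step 2: read the header line with scan, whose default quote setting
# strips the surrounding quotation marks
names(dd) <- scan(tmp, what = character(), nlines = 1, sep = ",",
                  quiet = TRUE)

print(names(dd))
print(dd)
```

Note that because quote = "" is still used for the data, the values themselves keep their literal quotation marks (for example "\"Adam Smith\""), so they may need to be stripped separately.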