left_join in R does not accept my data type
I am running into what appears to be a bug in R:
> left_join(ann2012full,agglevel)
Joining by: "agglvl_code"
Error in data.table::setkeyv(y, by$x) : x is not a data.table
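The two variables are ann2012full, with 3 million+ obs. of 15 variables, and agglevel, with 56 obs. of 2 variables, both read from .csv files.
According to other posts, others have had similar problems with dplyr, but how the by argument works is not clear to me. Is anyone able to reproduce the behavior that left_join had before the update?
The two tables share one column name, and the function reports Joining by: "agglvl_code" before the error, so it does recognize the variable in question: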
> intersect(names(ann2012full),names(agglevel))
[1] "agglvl_code"
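For readers unsure how the by argument works, here is a minimal sketch that passes the shared column name to left_join explicitly instead of relying on the "Joining by" guess. The tables x and y are made-up stand-ins for ann2012full and agglevel, and both are plain data frames, so this illustrates by only and does not reproduce the error above.

```r
library(dplyr)

# Toy stand-ins for the real tables (values are illustrative only)
x <- data.frame(agglvl_code = c(50L, 51L, 53L), own_code = c(0L, 1L, 1L))
y <- data.frame(agglvl_code = c(50L, 51L),
                agglvl_title = c("title A", "title B"))

# The same check used in the question: which column names do both tables share?
common <- intersect(names(x), names(y))

# Pass the key explicitly; every row of x is kept, unmatched rows get NA
joined <- left_join(x, y, by = common)
print(joined)
```

Naming the key explicitly also protects against an accidental join on some other column that happens to share a name.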
The first few rows...
head(ann2012full)
area_fips own_code industry_code agglvl_code size_code year qtr disclosure_code annual_avg_estabs_count annual_avg_emplvl
1: 01000 0 10 50 0 2012 A 116233 1828248
2: 01000 1 10 51 0 2012 A 1252 56031
3: 01000 1 102 52 0 2012 A 1252 56031
4: 01000 1 1021 53 0 2012 A 599 11734
5: 01000 1 1022 53 0 2012 A 2 13
6: 01000 1 1023 53 0 2012 A 17 161
total_annual_wages taxable_annual_wages annual_contributions annual_avg_wkly_wage avg_annual_pay
1: 76768801894 13424728725 419383612 808 41990
2: 4194319351 0 0 1440 74857
3: 4194319351 0 0 1440 74857
4: 719641114 0 0 1179 61330
5: 436204 0 0 662 34437
6: 12253089 0 0 1468 76343
head(agglevel)
agglvl_code agglvl_title
1 10 National, Total Covered
2 11 National, Total -- by ownership sector
3 12 National, by Domain -- by ownership sector
4 13 National, by Supersector -- by ownership sector
5 14 National, NAICS Sector -- by ownership sector
6 15 National, NAICS 3-digit -- by ownership sector
What str() shows for each...
> str(ann2012)
Classes ‘data.table’ and 'data.frame': 3556289 obs. of 15 variables:
$ area_fips : chr "01000" "01000" "01000" "01000" ...
$ own_code : int 0 1 1 1 1 1 1 1 1 1 ...
$ industry_code : chr "10" "10" "102" "1021" ...
$ agglvl_code : int 50 51 52 53 53 53 53 53 53 53 ...
$ size_code : int 0 0 0 0 0 0 0 0 0 0 ...
$ year : int 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
$ qtr : chr "A" "A" "A" "A" ...
$ disclosure_code : chr "" "" "" "" ...
$ annual_avg_estabs_count: int 116233 1252 1252 599 2 17 46 32 27 4 ...
$ annual_avg_emplvl : int 1828248 56031 56031 11734 13 161 1799 6131 903 632 ...
$ total_annual_wages :Class 'integer64' num [1:3556289] 3.79e-313 2.07e-314 2.07e-314 3.56e-315 2.16e-318 ...
$ taxable_annual_wages :Class 'integer64' num [1:3556289] 6.63e-314 0.00 0.00 0.00 0.00 ...
$ annual_contributions :Class 'integer64' num [1:3556289] 2.07e-315 0.00 0.00 0.00 0.00 ...
$ annual_avg_wkly_wage : int 808 1440 1440 1179 662 1468 1581 1231 370 1716 ...
$ avg_annual_pay : int 41990 74857 74857 61330 34437 76343 82237 64031 19257 89240 ...
- attr(*, ".internal.selfref")=<externalptr>
> str(agglevel)
'data.frame': 56 obs. of 2 variables:
$ agglvl_code : int 10 11 12 13 14 15 16 17 18 21 ...
$ agglvl_title: chr "National, Total Covered" "National, Total -- by ownership sector" "National, by Domain -- by ownership sector" "National, by Supersector -- by ownership sector" ...
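The str() output shows that the two tables have different classes: ann2012 is a data.table, while agglevel is a plain data.frame, which matches the "x is not a data.table" message. One possible workaround, offered here as an assumption and not something confirmed in this thread, is to convert the lookup table so both sides share a class before joining. The sketch below uses toy tables (ann and lvl are hypothetical names) and data.table's own join syntax, which needs a data.table version that supports the on argument.

```r
library(data.table)

# Toy stand-ins for the real tables (values are illustrative only)
ann <- data.table(area_fips = c("01000", "01000"), agglvl_code = c(50L, 51L))
lvl <- data.frame(agglvl_code = c(50L, 51L),
                  agglvl_title = c("title A", "title B"),
                  stringsAsFactors = FALSE)

# The classes differ, as in the question
print(class(ann))
print(class(lvl))

# Convert the lookup table so both sides are data.tables
lvl_dt <- as.data.table(lvl)

# For each row of ann, look up the matching row of lvl_dt (a left join on ann)
joined <- lvl_dt[ann, on = "agglvl_code"]
print(joined)
```

Going the other way, converting the large table with as.data.frame(), would also make the classes match, at the cost of copying 3 million+ rows.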
I have 10 libraries loaded for this recipe; 28 are loaded in total.
> search()
[1] ".GlobalEnv" "package:tcltk" "package:microbenchmark" "package:rbenchmark" "package:choroplethr"
[6] "package:RColorBrewer" "package:maps" "package:ggplot2" "package:stringr" "package:dplyr"
[11] "package:plyr" "package:sqldf" "package:RSQLite" "package:DBI" "package:gsubfn"
[16] "package:proto" "package:data.table" "package:bit64" "package:bit" "tools:rstudio"
[21] "package:stats" "package:graphics" "package:grDevices" "package:utils" "package:datasets"
[26] "package:methods" "Autoloads" "package:base"
*********************************** SOLUTION FOUND *******************************
I got to the bottom of it: I used merge instead of left_join, and specified by explicitly instead of leaving it NULL. So, what was...
codes <- c('agglevel','industry','ownership','size')
ann2012full <- ann2012
for(i in 1:length(codes)){
eval(parse(text=paste('ann2012full <- left_join(ann2012full, ',codes[i],')', sep='')))
}
is now...
codes <- c('agglevel','industry','ownership','size')
ann2012full <- ann2012
for(i in 1:length(codes)){
barTitle <- intersect(names(ann2012full),names(eval(parse(text=codes[i]))))
eval(parse(text= paste('ann2012full <- merge(ann2012full, ',codes[i],',by="',barTitle,'")', sep='')))
}
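The same loop can be written without eval(parse()) by keeping the lookup tables in a named list. This is a sketch under stated assumptions: the tables below are tiny made-up stand-ins, and the ownership columns (own_code, own_title) are hypothetical, since that table is not shown in the post. Note that merge() defaults to an inner join, so all.x = TRUE is added here to keep unmatched rows the way left_join would.

```r
# Toy stand-ins; the real tables are read from the .csv files
ann2012 <- data.frame(agglvl_code = c(50L, 51L, 99L),
                      own_code = c(0L, 1L, 1L))
agglevel <- data.frame(agglvl_code = c(50L, 51L),
                       agglvl_title = c("title A", "title B"))
ownership <- data.frame(own_code = c(0L, 1L),
                        own_title = c("Total", "Federal"))

# A named list replaces the character vector of object names
lookups <- list(agglevel = agglevel, ownership = ownership)

ann2012full <- ann2012
for (lookup in lookups) {
  # Join on whatever column names the two tables share
  key <- intersect(names(ann2012full), names(lookup))
  # all.x = TRUE keeps left rows that have no match (left-join behavior)
  ann2012full <- merge(ann2012full, lookup, by = key, all.x = TRUE)
}
print(ann2012full)
```

With the default all.x = FALSE, the row with agglvl_code 99 would be dropped silently, which is worth checking against the row count of the original table.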
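However, it seems the *_join methods in dplyr have a flaw that the latest update has not yet fixed. If anyone has other suggestions I would love to hear them, since so far only the modified code using merge works.
Thanks,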
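Can you provide a few rows of ann2012full and agglevel? Also, can you show what str() returns for both? Finally, it would help us if you showed the "number of libraries" you have loaded. – lawyeR 2014-12-05 03:36:54
It looks like the behaviour of 'dplyr' may have changed in the new version. – Arun 2014-12-05 19:26:07
It seems so: [rstudio](http://blog.rstudio.org/2014/10/13/dplyr-0-3-2/) has some details on the 10/14/14 improvements, but I don't see anything about joins, and the changes appear to be optional. – double0darbo 2014-12-05 20:39:45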