2014-10-29 68 views

Join with non-unique keys, unique in i

I have data tables with non-unique keys:

> dput(sv) 
structure(list(kwd = c("a", "a", "b", "b", "c"), pixel = c(1, 
2, 1, 2, 2), kpN = c(2L, 2L, 2L, 1L, 1L)), row.names = c(NA, 
-5L), class = c("data.table", "data.frame"), .Names = c("kwd", 
"pixel", "kpN"), .internal.selfref = <pointer: 0x7fc4aa800778>, sorted = "kwd") 
> dput(kwd) 
structure(list(kwd = c("a", "b", "c", "z"), kwdN = c(3L, 2L, 
1L, 1L)), row.names = c(NA, -4L), class = c("data.table", "data.frame" 
), .Names = c("kwd", "kwdN"), .internal.selfref = <pointer: 0x7fc4aa800778>, sorted = "kwd") 

Why am I getting this error:

> sv[kwd,kwdN:=kwdN] 
Starting bmerge ...done in 0 secs 
Error in vecseq(f__, len__, if (allow.cartesian || notjoin) NULL else as.integer(max(nrow(x), : 
    Join results in 6 rows; more than 5 = max(nrow(x),nrow(i)). Check for duplicate key values in i, each of which join to the same group in x over and over again. If that's ok, try including `j` and dropping `by` (by-without-by) so that j runs for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice. 
Calls: [ -> [.data.table -> vecseq 

I expect something like this (note the key):

   kwd pixel kpN kwdN
1:   a     1   2    3
2:   a     2   2    3
3:   b     1   2    2
4:   b     2   1    2
5:   c     2   1    1

Also, I am pretty sure this used to work that way.

What changed in data.table 1.9.4?

How do I get what I want? (`kwd[sv]` seems to work; is that the new way?)

Try `sv[kwd, kwdN := i.kwdN]` – akrun 2014-10-29 16:01:23

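The `allow.cartesian` error should not pop up here. This has been fixed in 1.9.5. Check point 8 under the bug fixes for 1.9.5 [here](https://github.com/Rdatatable/data.table/blob/master/README.md). When `i` has duplicates then, as the error message already says, you should use `allow.cartesian = TRUE`. – Arun 2014-10-29 16:03:47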

@Arun: I have 1.9.4 – sds 2014-10-29 16:10:16

Answer

Just so that this gets answered:

The `allow.cartesian` feature was implemented following this post from @Roland. Also see this for more explanation.

Cases where `allow.cartesian` is not necessary (and which therefore should not raise the error) are:

  • When `i` has no duplicates, #742 - this was not checked properly before. Fixed in 1.9.5 (the current development version).

  • When `j` uses `:=`, #800 - the number of rows will never exceed that of `x`. Fixed in 1.9.5 (the current development version).

  • When the operation is a not-join (or anti-join), #698 - again, the number of rows will never exceed that of `x`. Fixed in 1.9.4.
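The first two cases can be sketched on the question's own data. This is a minimal sketch that assumes a data.table release containing the fixes listed above (1.9.6 or later). The tables are rebuilt with `data.table()` rather than from the `dput()` output, and `res` is a name introduced here for illustration.

```r
library(data.table)

# Rebuild the two tables from the question (keyed on kwd, as in the dput output).
sv <- data.table(kwd = c("a", "a", "b", "b", "c"),
                 pixel = c(1, 2, 1, 2, 2),
                 kpN = c(2L, 2L, 2L, 1L, 1L),
                 key = "kwd")
kwd <- data.table(kwd = c("a", "b", "c", "z"),
                  kwdN = c(3L, 2L, 1L, 1L),
                  key = "kwd")

# Lookup join: one result row per row of sv, with kwdN pulled from kwd.
res <- kwd[sv]
print(res)

# Update join: add kwdN to sv by reference. The i. prefix makes explicit
# that the value comes from the i table (kwd). The unmatched "z" row in
# kwd is simply skipped.
sv[kwd, kwdN := i.kwdN]
print(sv)
```

Because `:=` only writes into existing rows of `sv`, the result can never have more rows than `sv`, which is why the Cartesian check is unnecessary in this case.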

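In summary, the `allow.cartesian` error now occurs only where it is necessary. The fixes made in 1.9.5 will be available once 1.9.6 is released on CRAN (which should happen soon).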
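For contrast, here is a sketch of a join where the flag is genuinely relevant because the key is duplicated on both sides. The tables `x` and `y` and their columns are hypothetical and not taken from the question. Without the flag, a join like this may be stopped by the check, depending on the data.table version and its row threshold.

```r
library(data.table)

# Hypothetical tables in which the key value "a" is duplicated on both sides.
x <- data.table(k = rep("a", 3), xv = 1:3, key = "k")
y <- data.table(k = rep("a", 3), yv = 4:6, key = "k")

# Each of the 3 rows in y matches all 3 rows in x, so the join has
# 3 * 3 = 9 rows. allow.cartesian = TRUE states that this is intended.
res <- x[y, allow.cartesian = TRUE]
print(nrow(res))
```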