Numbering duplicates in R using sqldf

I have a dataset that contains duplicate rows, and I would like to number them as shown below.

Original dataset:

DF <- structure(list(pol_no = c(1L, 1L, 2L, 2L, 2L), os = c(23L, 33L, 
45L, 56L, 45L), paid = c(45L, 67L, 78L, 89L, 78L)), .Names = c("pol_no", 
"os", "paid"), class = "data.frame", row.names = c(NA, -5L))

which looks like this:

> DF
  pol_no os paid
1      1 23   45
2      1 33   67
3      2 45   78
4      2 56   89
5      2 45   78

and I would like to number the duplicated rows within each pol_no, as follows:

pol_no os paid count
     1 23   45     1
     1 33   67     2
     2 45   78     1
     2 56   89     2
     2 45   78     3

Thanks a lot.

Regards,

Mansi

Edit: added dput() output to make the example reproducible, and fixed the formatting.

Source

2012-06-25 user1480178

sqldf with RPostgreSQL

PostgreSQL's SQL window functions facilitate solutions to problems of this sort. For more information on using PostgreSQL with sqldf, see FAQ #12 on the sqldf home page:

library(RPostgreSQL) 
library(sqldf) 
sqldf('select *, rank() over (partition by "pol_no" order by CTID) count 
     from "DF" 
     order by CTID ')

sqldf with RSQLite

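By default, sqldf uses SQLite via RSQLite. Although SQLite lacks PostgreSQL's window functions, it is much simpler to set up because it needs no extra work (whereas with PostgreSQL, PostgreSQL itself must be installed and configured separately). Without those facilities the SQL statement is more complex with SQLite, although its length is actually about the same: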

# if RPostgreSQL was previously attached and loaded, then detach and unload it
detach("package:RPostgreSQL", unload = TRUE) 

sqldf("select a.*, count(*) count 
     from DF a, DF b 
     where a.pol_no = b.pol_no and b.rowid <= a.rowid group by a.rowid" 
)
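
The self-join above counts, for each row, the rows with the same pol_no whose rowid is less than or equal to its own. As a rough illustration of that logic only, here is a minimal sketch in base R that does not call sqldf; the names `idx` and `running_count` are made up for this example, and `idx` merely stands in for SQLite's rowid.

```r
# Rebuild the question's data frame so the sketch is self-contained.
DF <- data.frame(
  pol_no = c(1L, 1L, 2L, 2L, 2L),
  os     = c(23L, 33L, 45L, 56L, 45L),
  paid   = c(45L, 67L, 78L, 89L, 78L)
)

# idx is a hypothetical stand-in for SQLite's implicit rowid.
idx <- seq_len(nrow(DF))

# For each row i, count the rows with the same pol_no at position <= i.
# This mirrors: a.pol_no = b.pol_no and b.rowid <= a.rowid.
running_count <- sapply(idx, function(i) {
  sum(DF$pol_no == DF$pol_no[i] & idx <= i)
})

print(cbind(DF, count = running_count))
```

Like the SQL self-join, this compares every row with every other row, so it is meant to explain the query rather than to replace it on large data.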

R ave

Finally, we show a solution that does not use sqldf at all, only core R functionality:

transform(DF, count = ave(pol_no, pol_no, FUN = seq_along))
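
To check the ave() line against the output requested in the question, a small self-contained sketch follows. It rebuilds DF with data.frame() rather than structure(), and the names `result` and `expected` are introduced here only for illustration.

```r
# Rebuild the question's data frame so the example runs on its own.
DF <- data.frame(
  pol_no = c(1L, 1L, 2L, 2L, 2L),
  os     = c(23L, 33L, 45L, 56L, 45L),
  paid   = c(45L, 67L, 78L, 89L, 78L)
)

# ave() splits pol_no by its own values, applies seq_along() within each
# group, and returns the results in the original row order.
result <- transform(DF, count = ave(pol_no, pol_no, FUN = seq_along))

# Running count within each pol_no, copied from the question's desired output.
expected <- c(1, 2, 1, 2, 3)
stopifnot(all(result$count == expected))

print(result)
```

Because ave() keeps the original row order, the rows do not need to be sorted by pol_no first.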

Source

2012-06-25 15:22:06
