Selecting IDs with specified column attributes

I am trying to select the IDs whose rows have specified column attributes. My data look like this:

 
   ID Field  Rank
8   6 Other  Prof
9   6 Other  Prof
13  7 Other Assoc
16  7 Other Assoc
17  7 Other  Prof
18  8 Other Assoc
19  8 Other Assoc
22  9 Other Assoc
23  9 Other Assoc
24  9 Other  Prof

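I am trying to create a new variable that contains all rows for the people (IDs) who, according to the data, were promoted from 'Assoc' to 'Prof'. For example, I would like my new variable to look like this: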

 
   ID Field  Rank
13  7 Other Assoc
16  7 Other Assoc
17  7 Other  Prof
22  9 Other Assoc
23  9 Other Assoc
24  9 Other  Prof

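I have tried the subset function, but had no luck.

Is there a function in R that can do this? If not, how can it be achieved?

Edit: here is the result from dput(). Note that I omitted the "Field" variable because it carries no information in this example.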

df.promotion <- structure(list(id = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9), rank = structure(c(2L, 
2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L), .Label = c("Assoc", "Prof" 
), class = "factor")), .Names = c("id", "rank"), row.names = c(NA, 
-10L), class = "data.frame")

Source

2011-10-25 user7045

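Welcome to SO! I made the example reproducible for you and others; next time, try to do this yourself, e.g. by using dput(). –

Either mine or yours is no longer needed :) –

Thanks, I have left yours in. Still getting used to posting and to R. – user7045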

Let's do this with base R (although plyr beckons). Edit: adapted to, and tested on, the newly posted dput output.

dfr <- df.promotion               # work on a copy so the original stays untouched
colnames(dfr) <- c("ID", "Rank")  # rename the columns to match the code below
ids <- unique(dfr$ID)
# An ID qualifies when neither "Assoc" nor "Prof" is missing from its ranks
hasBoth <- sapply(ids, function(curID) {
    sum(is.na(match(c("Assoc", "Prof"), dfr$Rank[dfr$ID == curID]))) == 0
})
promotedIDs <- ids[hasBoth]
result <- dfr[dfr$ID %in% promotedIDs, ]

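I use match to check whether both "Prof" and "Assoc" occur in each ID's list of ranks. Note that match returns NA when a value is not found, so counting the NAs is one way to find out whether every value was matched.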
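To see what the match-based test does for a single ID, here is a minimal sketch on two made-up rank vectors. The names ranks_promoted and ranks_not are illustrative only and do not come from the answer above.

```r
# Hypothetical rank vectors for two people (not taken from the question's data)
ranks_promoted <- c("Assoc", "Assoc", "Prof")
ranks_not      <- c("Assoc", "Assoc")

# match() returns the position of the first hit, or NA when there is none
m1 <- match(c("Assoc", "Prof"), ranks_promoted)  # 1 3
m2 <- match(c("Assoc", "Prof"), ranks_not)       # 1 NA

# Zero NAs means both ranks were found for that person
has_both_1 <- sum(is.na(m1)) == 0  # TRUE
has_both_2 <- sum(is.na(m2)) == 0  # FALSE
```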

Source

2011-10-25 11:31:52

The other solutions are prettier, but diversity is good... –

You can use xtabs to tabulate your data by ID and Rank:

tab <- xtabs(~ID+Rank, dfr)
tab
   Rank
ID  Assoc Prof
  6     0    2
  7     2    1
  8     2    0
  9     2    1

The IDs you want are those whose row contains no zero:

subset(dfr, ID %in% rownames(tab[as.logical(apply(tab, 1, prod)), ]))
   ID Field  Rank
13  7 Other Assoc
16  7 Other Assoc
17  7 Other  Prof
22  9 Other Assoc
23  9 Other Assoc
24  9 Other  Prof
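The product trick works because a row's product of counts is zero exactly when at least one of its counts is zero. Here is a small self-contained sketch of that step. It rebuilds the question's data with the lower-case column names from its dput() output, and the names row_products, keep_ids and promoted are introduced here for illustration.

```r
# Data rebuilt from the question's dput() output
df.promotion <- data.frame(
  id   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  rank = factor(c("Prof", "Prof", "Assoc", "Assoc", "Prof",
                  "Assoc", "Assoc", "Assoc", "Assoc", "Prof"))
)

# Count how often each rank occurs for each id
tab <- xtabs(~ id + rank, df.promotion)

# Multiply the counts within each row; a zero count makes the product zero
row_products <- apply(tab, 1, prod)

# Keep the ids whose product is non-zero, i.e. both ranks occur
keep_ids <- names(row_products)[row_products != 0]
promoted <- subset(df.promotion, id %in% keep_ids)
```

Here the row names of the table are character strings, and %in% compares them with the numeric id column after coercing both sides to character.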

Source

2011-10-25 11:37:24 James

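Thanks, I'll play around with this and let you know how it goes. – user7045

Would something like rownames(tab)[apply(tab != 0, 1, all)] be clearer...? –

@BenBolker Yes, that looks much better – James

Here is a fairly easy-to-understand approach that uses subset(), the function you were inclined to use in the first place: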

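I create p, the ids of everyone who is a Prof, and then a, the ids of everyone who is an Assoc. Then, using %in%, we select everyone who is both an Assoc and a Prof. This gives me a set of keys that I can then use to subset the initial data.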

p <- unique(subset(df.promotion, rank=="Prof")$id) 
a <- unique(subset(df.promotion, rank=="Assoc")$id) 

mySet <- a[a %in% p] 
subset(df.promotion, id %in% mySet)
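The expression a[a %in% p] is a set intersection, so base R's intersect() could be used in its place. A minimal sketch, assuming the question's data rebuilt from its dput() output; the names mySet2 and promoted are illustrative:

```r
# Data rebuilt from the question's dput() output
df.promotion <- data.frame(
  id   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  rank = factor(c("Prof", "Prof", "Assoc", "Assoc", "Prof",
                  "Assoc", "Assoc", "Assoc", "Assoc", "Prof"))
)

p <- unique(subset(df.promotion, rank == "Prof")$id)   # ids with a Prof row
a <- unique(subset(df.promotion, rank == "Assoc")$id)  # ids with an Assoc row

# ids that appear in both sets
mySet2 <- intersect(a, p)
promoted <- subset(df.promotion, id %in% mySet2)
```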

Source

2011-10-25 15:19:57

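That works as well. Thanks – user7045

Here is an idiomatic one-liner using plyr. The code works by (a) splitting the data frame by id and (b) keeping the pieces that have more than one unique rank (a proxy for promotion).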

require(plyr) 
ddply(df.promotion, .(id), subset, length(unique(rank)) > 1)
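For readers without plyr, the same "more than one unique rank per id" test can be sketched in base R with ave(). This is an alternative illustration, not part of the answer above, and n_ranks and promoted are names introduced here. Like the plyr version, it only checks that both ranks occur for an id, not the order in which they occur.

```r
# Data rebuilt from the question's dput() output
df.promotion <- data.frame(
  id   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  rank = factor(c("Prof", "Prof", "Assoc", "Assoc", "Prof",
                  "Assoc", "Assoc", "Assoc", "Assoc", "Prof"))
)

# For every row, the number of distinct ranks held by that row's id
n_ranks <- ave(as.integer(df.promotion$rank), df.promotion$id,
               FUN = function(x) length(unique(x)))

# Keep the rows of ids that have more than one distinct rank
promoted <- df.promotion[n_ranks > 1, ]
```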

Source

2011-10-26 05:31:25 Ramnath
