R: Finding data frame indices of non-unique/duplicated values

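I want to extract some values from a vector, modify them, and put them back in their original positions.
I have searched a lot and tried different approaches to this problem. I'm afraid the answer may be quite simple, but I haven't spotted it yet.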

Create a vector and convert it to a data frame. Also create an empty data frame for the results.

hight <- c(5,6,1,3) 
hight_df <- data.frame("ID"=1:length(hight), "hight"=hight) 
hight_min_df <- data.frame()

Extract the smaller value of each pair of adjacent values, together with its ID.

for(i in 1:(length(hight_df[,2])-1)) 
{ 
    hight_min_df[i,1] <- which(grepl(min(hight_df[,2][i:(i+1)]), hight_df[,2])) 
    hight_min_df[i,2] <- min(hight_df[,2][i:(i+1)]) 
}

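Modify the extracted values and aggregate rows that share an ID by keeping the higher value. Finally, write the modified values back.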

hight_min_df[,2] <- hight_min_df[,2]+20 
adj_hight <- aggregate(x=hight_min_df[,2],by=list(hight_min_df[,1]), FUN=max) 
hight[adj_hight[,1]] <- adj_hight[,2]

This works perfectly as long as hight contains only unique values. How can I run this script with a vector like this: hight <- c(5,6,1,3,5)?
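A likely reason the loop breaks on duplicates is that it looks up the ID by value rather than by position. The sketch below, which uses only base R, shows the value-based lookup returning two positions for the first pair, and a positional alternative built on which.min(). The names matches and pos are illustrative and not part of the original script.

```r
hight <- c(5, 6, 1, 3, 5)

# Value-based lookup, as in the loop: the minimum of the first pair is 5,
# and 5 occurs at positions 1 and 5, so two indices come back.
matches <- which(grepl(min(hight[1:2]), hight))
print(matches)

# Positional lookup: which.min() gives the offset inside the pair,
# so adding (i - 1) yields exactly one index in the full vector.
i <- 1
pos <- (i - 1) + which.min(hight[i:(i + 1)])
print(pos)
```

Note also that grepl() matches substrings, so a minimum of 1 would also match values such as 10 or 15; comparing positions avoids both issues.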

Source

2017-08-25 Jack M

If hight <- c(5,6,1,3,5), what is the expected output? – BLT

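OK, there's a lot to unpack here. I'd suggest using dplyr with the pipe operator instead of a loop. Read the dplyr vignette; it's an excellent resource and a nice way to approach data manipulation in R.

So, using dplyr, we can rewrite your code like this: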

library(dplyr) 
hight <- c(5,6,1,3,5) #skip straight to the test case 
hight_df <- data.frame("ID"=1:length(hight), "hight"=hight) 

adj_hight <- hight_df %>% 
    # Logic in pseudocode: going from the first row to the last,
    # compare the previous row's hight (via lag()) with the current row's.
    # If the previous hight is greater, take the current row's ID and value;
    # otherwise take the previous row's ID and value.
    mutate(subst.id = ifelse(lag(hight) > hight, ID, lag(ID)), 
     subst.val = ifelse(lag(hight) > hight, hight, lag(hight)) + 20) %>% 
    filter(!is.na(subst.val)) %>% #remove extra rows 
    select(subst.id, subst.val) %>% #take just the columns we want 
    #grouping - rewrite of your use of aggregate 
    group_by(subst.id) %>% 
    summarise(subst.val = max(subst.val)) %>% 
    data.frame(.) 

#tying back in 
hight[adj_hight[,1]] <- adj_hight[,2] 
print(hight)

which gives:

[1] 25 6 21 23 5
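For comparison, here is a minimal base R sketch of the same idea without dplyr. It assumes, as the code above does, that a tie within a pair is assigned to the earlier element; the names left, right, min_id, min_val and adj are illustrative only.

```r
hight <- c(5, 6, 1, 3, 5)
n <- length(hight)

left  <- hight[-n]   # first element of each adjacent pair
right <- hight[-1]   # second element of each adjacent pair

# Position of the smaller element in each pair (ties go to the left one)
min_id  <- ifelse(right < left, 2:n, 1:(n - 1))
# Smaller value of each pair, shifted by 20 as in the question
min_val <- pmin(left, right) + 20

# Keep the highest adjusted value per position, then write it back
adj <- tapply(min_val, min_id, max)
hight[as.integer(names(adj))] <- as.vector(adj)
print(hight)
```

Because positions are computed directly from the pair index, duplicated values in hight do not affect which element gets updated.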

Source

2017-08-25 16:33:18 Zach
