How to find the top 2 highest-paid employees using R

Employee details:

EmpID | WorkingPlaces | Salary 
1001 | Bangalore  | 5000 
1001 | Chennai  | 6000 
1002 | Bombay  | 1000 
1002 | Chennai  | 500 
1003 | Pune   | 2000 
1003 | Mangalore  | 1000
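
For anyone who wants to reproduce the problem, the table above could be entered as a data frame roughly like this. The name `df` is only an assumption for illustration (the answers below refer to it as `df1` or `df`), and the column types are a guess from the values shown.

```r
# Hypothetical reconstruction of the employee table shown above.
# The object name `df` is an assumption; rename it to match your own data.
df <- data.frame(
  EmpID         = c(1001, 1001, 1002, 1002, 1003, 1003),
  WorkingPlaces = c("Bangalore", "Chennai", "Bombay",
                    "Chennai", "Pune", "Mangalore"),
  Salary        = c(5000, 6000, 1000, 500, 2000, 1000),
  stringsAsFactors = FALSE  # keep place names as plain character strings
)

# Inspect the structure: 6 rows, 3 columns
str(df)
```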

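The same employee works in different places during a month. How can I find the top 2 highest-paid employees?

The result table should look like this: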

EmpID | WorkingPlaces | Salary 
1001 | Chennai  | 6000 
1001 | Bangalore  | 5000 
1003 | Pune   | 2000 
1003 | Mangalore  | 1000

My code (in R):

knime.out <- aggregate(x = knime.in$"EmpID", by = list(Thema = knime.in$"WorkingPlaces", Project = knime.in$"Salary"), FUN = "length")[2]

But this does not give me the expected result. Please help me fix the code.

Source

2017-02-28 Mathumathi Anblagan

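What is the criterion for getting the result? – akrun

The criterion is that the result table should contain the top 2 highest-paid employees. –

We can use dplyr: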

library(dplyr) 
df1 %>% 
    group_by(EmpID) %>% 
    mutate(SumSalary = sum(Salary)) %>% 
    arrange(-SumSalary, EmpID) %>% 
    head(4) %>% 
    select(-SumSalary)
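
Note that `head(4)` in the snippet above relies on every employee having exactly two rows. A possible variant that avoids the hard-coded row count is sketched below; it is not the answerer's code. It assumes the sample data from the question is stored in `df1`, and it swaps in `dense_rank()` on the per-employee total to keep the top 2 employees, then sorts each employee's rows by salary to match the requested output.

```r
library(dplyr)

# Sample data from the question (the name `df1` follows the answer above)
df1 <- data.frame(
  EmpID         = c(1001, 1001, 1002, 1002, 1003, 1003),
  WorkingPlaces = c("Bangalore", "Chennai", "Bombay",
                    "Chennai", "Pune", "Mangalore"),
  Salary        = c(5000, 6000, 1000, 500, 2000, 1000),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  group_by(EmpID) %>%
  mutate(SumSalary = sum(Salary)) %>%            # total salary per employee
  ungroup() %>%
  filter(dense_rank(desc(SumSalary)) <= 2) %>%   # keep the 2 highest totals (ties are kept)
  arrange(desc(SumSalary), EmpID, desc(Salary)) %>%
  select(-SumSalary)                             # drop the helper column

print(result)
```

With the sample data this keeps the rows for employees 1001 and 1003, highest salary first within each employee.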

Source

2017-02-28 09:38:28 akrun

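Actually I get an error -> Error: could not find function "%>%" –

@MathuMathi You need to install 'dplyr' or 'tidyverse'. – akrun

An attempt at a base R solution. Consider your data frame to be df. We first aggregate the data by EmpID and compute the sum of the salaries for each employee. Then we select the 2 EmpIDs with the highest total salary and use %in% to subset the original data frame to those IDs.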
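
That error usually means the package providing `%>%` is not installed or has not been loaded in the current session. A minimal sketch of the fix, assuming a CRAN mirror is reachable from your R installation:

```r
# Install dplyr only if it is not already available
# (assumes internet access to a CRAN mirror).
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Load the package; this makes the pipe operator %>% available
library(dplyr)

# Quick check that the pipe now works
c(1, 2, 3) %>% sum()
```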

temp <- aggregate(Salary~EmpID, df, sum) 
df[df$EmpID %in% temp$EmpID[tail(order(temp$Salary), 2)], ] 

# EmpID WorkingPlaces Salary 
#1 1001  Bangalore 5000 
#2 1001  Chennai 6000 
#5 1003   Pune 2000 
#6 1003  Mangalore 1000
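
The same idea can be wrapped in a small helper so that the number of employees is a parameter and the rows come back in the order the question asks for (highest total first, then highest salary within each employee). This is only a sketch: the function name `top_n_employees` is hypothetical, and the data frame is rebuilt from the question's table.

```r
# Sample data from the question
df <- data.frame(
  EmpID         = c(1001, 1001, 1002, 1002, 1003, 1003),
  WorkingPlaces = c("Bangalore", "Chennai", "Bombay",
                    "Chennai", "Pune", "Mangalore"),
  Salary        = c(5000, 6000, 1000, 500, 2000, 1000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rows of the n employees with the highest total salary
top_n_employees <- function(data, n = 2) {
  # Total salary per employee
  totals <- aggregate(Salary ~ EmpID, data, sum)
  # IDs of the n employees with the largest totals, largest first
  top_ids <- head(totals$EmpID[order(totals$Salary, decreasing = TRUE)], n)
  # Keep only rows belonging to those employees
  out <- data[data$EmpID %in% top_ids, ]
  # Order by employee rank, then by salary (descending) within each employee
  out[order(match(out$EmpID, top_ids), -out$Salary), ]
}

result <- top_n_employees(df, 2)
print(result)
```

If this runs inside a KNIME R Snippet node, as the `knime.out` assignment in the question suggests, the call would presumably become `knime.out <- top_n_employees(knime.in, 2)`, assuming `knime.in` holds the input table as a data frame.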

Source

2017-02-28 10:21:50
