2017-10-18 16 views

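How do I apply a condition to specific group elements and find the combination from another group within the same table?

I have the following data.frame: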

Category Product Status 
1  A  qwe  In 
2  A  rty  In 
3  A  ewq Out 
4  B  dfs  In 
5  B  sgf  In 
6  C  mnb Out 
7  C  ves Out 
8  C  klm Out 
9  C  nbc Out 

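My goal is to create a Flag column for each Category group, with the levels OnlyIn, OnlyOut, and BothInOut, corresponding to the values in the Status column.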

As part of this, I computed the In and Out counts per group using the following code:

library(dplyr)

# Count the In and Out rows within each Category/Status combination
Data <- Data %>% 
    group_by(Category, Status) %>% 
    dplyr::mutate(CountIn = length(Status[Status == "In"]), 
       CountOut = length(Status[Status == "Out"])) 

And I got the following result:

Category Product Status CountIn CountOut 
1  A  qwe  In  2  0 
2  A  rty  In  2  0 
3  A  ewq Out  0  1 
4  B  dfs  In  2  0 
5  B  sgf  In  2  0 
6  C  mnb Out  0  4 
7  C  ves Out  0  4 
8  C  klm Out  0  4 
9  C  nbc Out  0  4 

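Now I'm not sure how to use this information to create the new Flag column, that is, how to work out the total In and Out counts for each Category and assign the appropriate value.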

For example, if a Category has both In and Out as Status values, the flag should be "BothInOut".

Sample output:

Category Product Status CountIn CountOut  Flag 
1  A  qwe  In  2  0 BothInOut 
2  A  rty  In  2  0 BothInOut 
3  A  ewq Out  0  1 BothInOut 
4  B  dfs  In  2  0 OnlyIn 
5  B  sgf  In  2  0 OnlyIn 
6  C  mnb Out  0  4 OnlyOut 
7  C  ves Out  0  4 OnlyOut 
8  C  klm Out  0  4 OnlyOut 
9  C  nbc Out  0  4 OnlyOut 

Input to reproduce the data:

structure(list(Category = c("A", "A", "A", "B", "B", "C", "C", 
"C", "C"), Product = c("qwe", "rty", "ewq", "dfs", "sgf", "mnb", 
"ves", "klm", "nbc"), Status = c("In", "In", "Out", "In", "In", 
"Out", "Out", "Out", "Out"), CountIn = c(2, 2, 0, 2, 2, 0, 0, 
0, 0), CountOut = c(0, 0, 1, 0, 0, 4, 4, 4, 4), Flag = c("BothInOut", 
"BothInOut", "BothInOut", "OnlyIn", "OnlyIn", "OnlyOut", "OnlyOut", 
"OnlyOut", "OnlyOut")), .Names = c("Category", "Product", "Status", 
"CountIn", "CountOut", "Flag"), row.names = c(NA, 9L), class = "data.frame") 

`df %>% group_by(Category) %>% mutate(Flag1 = toString(unique(Status)))` – Sotos

That does the job. – sunitprasad1

Answers

Following @Sotos's comment:

Data <- Data %>% group_by(Category) %>% mutate(Flag1 = toString(unique(Status))) 

Data$Flag <- ifelse(Data$Flag1 == "In","OnlyIn", 
        ifelse(Data$Flag1 == "Out","OnlyOut","BothInOut")) 

This gets the job done:

Category Product Status Flag1  Flag 
1  A  qwe  In In, Out BothInOut 
2  A  rty  In In, Out BothInOut 
3  A  ewq Out In, Out BothInOut 
4  B  dfs  In  In OnlyIn 
5  B  sgf  In  In OnlyIn 
6  C  mnb Out  Out OnlyOut 
7  C  ves Out  Out OnlyOut 
8  C  klm Out  Out OnlyOut 
9  C  nbc Out  Out OnlyOut 
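
The two-step approach above (build an intermediate string, then recode it) can also be collapsed into a single grouped `mutate`. The following is only a sketch, assuming `dplyr` is installed; the helper columns `has_in` and `has_out` are names introduced here for illustration and are not part of the original answer.

```r
library(dplyr)

# Sample data from the question (only the columns needed here)
Data <- data.frame(
  Category = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  Product  = c("qwe", "rty", "ewq", "dfs", "sgf", "mnb", "ves", "klm", "nbc"),
  Status   = c("In", "In", "Out", "In", "In", "Out", "Out", "Out", "Out"),
  stringsAsFactors = FALSE
)

flagged <- Data %>%
  group_by(Category) %>%
  mutate(
    # Hypothetical helper columns: does this Category contain each status?
    has_in  = any(Status == "In"),
    has_out = any(Status == "Out"),
    # Conditions are checked in order; the first match wins
    Flag = case_when(
      has_in & has_out ~ "BothInOut",
      has_in           ~ "OnlyIn",
      TRUE             ~ "OnlyOut"
    )
  ) %>%
  ungroup() %>%
  select(-has_in, -has_out)  # drop the helper columns again

print(flagged)
```

Because the test is on membership rather than on a pasted string, the result does not depend on the order in which In and Out appear within a group.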

I'd say @Sotos's comment does the job well. Another way to get the exact labels you want would be:

df <- df %>% 
    group_by(Category) %>% 
    mutate(Flag2 = ifelse("In" %in% unique(Status) & "Out" %in% unique(Status), "BothInOut", ifelse("In" %in% unique(Status), "OnlyIn", "OnlyOut"))) 

> df 
Source: local data frame [9 x 7] 
Groups: Category [3] 

# A tibble: 9 x 7 
    Category Product Status CountIn CountOut  Flag  Flag2 
    <chr> <chr> <chr> <dbl> <dbl>  <chr>  <chr> 
1  A  qwe  In  2  0 BothInOut BothInOut 
2  A  rty  In  2  0 BothInOut BothInOut 
3  A  ewq Out  0  1 BothInOut BothInOut 
4  B  dfs  In  2  0 OnlyIn OnlyIn 
5  B  sgf  In  2  0 OnlyIn OnlyIn 
6  C  mnb Out  0  4 OnlyOut OnlyOut 
7  C  ves Out  0  4 OnlyOut OnlyOut 
8  C  klm Out  0  4 OnlyOut OnlyOut 
9  C  nbc Out  0  4 OnlyOut OnlyOut 
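
If `dplyr` is not available, the same membership test can be sketched in base R with `ave()`, which applies a function within each group and recycles the result to every row of that group. The helper `label_group` is a hypothetical name introduced here, and the sketch assumes a group only ever contains the statuses "In" and "Out", as in the question's data.

```r
# Sample data from the question (only the columns needed here)
df <- data.frame(
  Category = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  Status   = c("In", "In", "Out", "In", "In", "Out", "Out", "Out", "Out"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reduce one group's statuses to a single label
label_group <- function(status) {
  has_in  <- "In"  %in% status
  has_out <- "Out" %in% status
  if (has_in && has_out) {
    "BothInOut"
  } else if (has_in) {
    "OnlyIn"
  } else {
    "OnlyOut"  # assumption: anything else is an Out-only group
  }
}

# ave() splits Status by Category, calls label_group on each piece,
# and writes the single returned label back to every row of that group
df$Flag2 <- ave(df$Status, df$Category, FUN = label_group)

print(df)
```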

I'd suggest making @Sotos's comment more robust by adding sort, so that the label does not depend on the order of the rows in the data:

df %>% group_by(Category) %>% 
    mutate(Flag1 = toString(sort(unique(Status))))
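
To see why the sort matters, here is a small sketch on made-up data (the `toy` data frame and the column names `Unsorted` and `Sorted` are illustrative, not from the question). Both groups contain the same two statuses, but in group B "Out" happens to come first.

```r
library(dplyr)

# Hypothetical data: same statuses in both groups, different row order
toy <- data.frame(
  Category = c("A", "A", "B", "B"),
  Status   = c("In", "Out", "Out", "In"),
  stringsAsFactors = FALSE
)

result <- toy %>%
  group_by(Category) %>%
  mutate(
    Unsorted = toString(unique(Status)),       # follows the row order
    Sorted   = toString(sort(unique(Status)))  # same label for the same set
  ) %>%
  ungroup()

# One row per Category to compare the two labels
print(distinct(result, Category, Unsorted, Sorted))
```

Without the sort, group A is labelled "In, Out" and group B "Out, In", so a later comparison against a fixed string would treat them as different; with the sort, both become "In, Out".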

If you want the labels exactly as you proposed, you can extend it to:

df %>% group_by(Category) %>% 
    mutate(Flag1 = paste0(sort(unique(Status)), collapse = "") %>% 
       paste0(ifelse(. == "InOut", "Both", "Only"), .)) 

Which yields:

Category Product Status CountIn CountOut  Flag  Flag1 
    <chr> <chr> <chr> <dbl> <dbl>  <chr>  <chr> 
1  A  qwe  In  2  0 BothInOut BothInOut 
2  A  rty  In  2  0 BothInOut BothInOut 
3  A  ewq Out  0  1 BothInOut BothInOut 
4  B  dfs  In  2  0 OnlyIn OnlyIn 
5  B  sgf  In  2  0 OnlyIn OnlyIn 
6  C  mnb Out  0  4 OnlyOut OnlyOut 
7  C  ves Out  0  4 OnlyOut OnlyOut 
8  C  klm Out  0  4 OnlyOut OnlyOut 
9  C  nbc Out  0  4 OnlyOut OnlyOut 