2017-04-25 54 views
1

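Fill rows in a column by matching rows in another column

This has probably been asked many times, but I could only find cases more complicated than mine, and I really don't know where to start. I need to add a new column (Condition) to my data frame and fill its rows based on the values in the column cellNr.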

My data frame molten.pC:

cellNr  value 
1 G63 0.000000 
2 G64 8.848623 
3 G65 0.000000 
4 G66 10.788718 
5 B15 5.285402 
6 B16 0.000000 
7 B17 0.000000 
8 C10 0.000000 
9 C11 0.000000 

I want to add a column Condition and fill it in like this:

cellNr  value  Condition 
1 G63 0.000000 Growth 
2 G64 8.848623 Growth 
3 G65 0.000000 Growth 
4 G66 10.788718 Growth 
5 B15 5.285402 Burst 
6 B16 0.000000 Burst 
7 B17 0.000000 Burst 
8 C10 0.000000 Cellularized 
9 C11 0.000000 Cellularized 

Answers

2

We can do this in base R by extracting the first character of cellNr (substr) and converting it to a factor with the levels and labels specified:

molten.pC$Condition <- as.character(factor(substr(molten.pC$cellNr, 1, 1), 
     levels = c("G", "B", "C"), labels = c("Growth", "Burst", "Cellularized"))) 
molten.pC$Condition 
#[1] "Growth"  "Growth"  "Growth"  "Growth"  "Burst" 
#[6] "Burst"  "Burst"  "Cellularized" "Cellularized" 
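
A related base R idea, sketched here as an illustration rather than as part of the original answer, is to index a named character vector by the first letter of cellNr. The name condition_lookup is hypothetical, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
molten.pC <- data.frame(
  cellNr = c("G63", "G64", "G65", "G66", "B15", "B16", "B17", "C10", "C11"),
  value  = c(0, 8.848623, 0, 10.788718, 5.285402, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: first letter of cellNr -> condition label
condition_lookup <- c(G = "Growth", B = "Burst", C = "Cellularized")

# Index the lookup by the first character of each cellNr;
# unname() drops the letter names from the result
molten.pC$Condition <- unname(condition_lookup[substr(molten.pC$cellNr, 1, 1)])
```

With this approach, a cellNr whose first letter is missing from the lookup ends up as NA instead of receiving a default label.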

Or we can use case_when from dplyr:

library(dplyr) # devel version (soon to be released as `0.6.0`)
molten.pC %>%
  mutate(Sub = substr(cellNr, 1, 1),
         Condition = case_when(Sub == "G" ~ "Growth",
                               Sub == "B" ~ "Burst",
                               TRUE ~ "Cellularized")) %>%
  select(-Sub)
# cellNr  value Condition 
#1 G63 0.000000  Growth 
#2 G64 8.848623  Growth 
#3 G65 0.000000  Growth 
#4 G66 10.788718  Growth 
#5 B15 5.285402  Burst 
#6 B16 0.000000  Burst 
#7 B17 0.000000  Burst 
#8 C10 0.000000 Cellularized 
#9 C11 0.000000 Cellularized 
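
Note that the final TRUE branch is a catch-all, so any cellNr that does not start with G or B is labelled Cellularized. If that is not wanted, a stricter variant might look like the sketch below. It assumes dplyr is installed; the small data frame cells and the row X99 are made up purely to show the fallback.

```r
library(dplyr)

# Small illustrative data frame; "X99" is a made-up cell with an unknown prefix
cells <- data.frame(cellNr = c("G63", "B15", "C10", "X99"),
                    stringsAsFactors = FALSE)

result <- cells %>%
  mutate(Sub = substr(cellNr, 1, 1),
         Condition = case_when(Sub == "G" ~ "Growth",
                               Sub == "B" ~ "Burst",
                               Sub == "C" ~ "Cellularized",
                               TRUE ~ NA_character_)) %>%  # unknown prefix -> NA
  select(-Sub)
```

Here every known prefix is matched explicitly, and anything else is left as NA so that unexpected cell names are easy to spot.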
+1

Adding my idea to the solution above: molten.pC$Condition <- factor(molten.pC$Condition, levels = sort(unique(str_sub(molten.pC$cellNr, 1, 1))), labels = sort(c("Growth", "Burst", "Cellularized"))) –

+1

Awesome! Thank you!! – Jon
