2015-05-31

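Medians from a data frame using dplyr

Computing the median seems to be a bit of an Achilles heel for R (i.e., there is no data.frame method). What is the least amount of typing needed to get group medians from a data frame using dplyr?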

my_data <- structure(list(group = c("Group 1", "Group 1", "Group 1", "Group 1", 
"Group 1", "Group 1", "Group 1", "Group 1", "Group 1", "Group 1", 
"Group 1", "Group 1", "Group 1", "Group 1", "Group 1", "Group 2", 
"Group 2", "Group 2", "Group 2", "Group 2", "Group 2", "Group 2", 
"Group 2", "Group 2", "Group 2", "Group 2", "Group 2", "Group 2", 
"Group 2", "Group 2"), value = c("5", "3", "6", "8", "10", "13", 
"1", "4", "18", "4", "7", "9", "14", "15", "17", "7", "3", "9", 
"10", "33", "15", "18", "6", "20", "30", NA, NA, NA, NA, NA)), .Names = c("group", 
"value"), class = c("tbl_df", "data.frame"), row.names = c(NA, 
-30L)) 

library(dplyr) 

# groups 1 & 2 
my_data_groups_1_and_2 <- my_data[my_data$group %in% c("Group 1", "Group 2"), ] 

# compute medians per group 
medians <- my_data_groups_1_and_2 %>% 
    group_by(group) %>% 
    summarize(the_medians = median(value, na.rm = TRUE)) 

which gives:

Error in summarise_impl(.data, dots) : 
    STRING_ELT() can only be applied to a 'character vector', not a 'double' 
In addition: Warning message: 
In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) : 
    argument is not numeric or logical: returning NA 

What is the least-effort workaround?


Maybe I'm missing a trick here, but isn't this failing because 'is.character(my_data_groups_1_and_2$value)' is 'TRUE'? Adding a mutate and converting value to double lets me compute the medians. – ivyleavedtoadflax

Answer


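As ivyleavedtoadflax said in the comments, the error is caused by passing median an argument that is neither numeric nor logical: your value column is of type character (you can tell the values are not numeric because the numbers are quoted). Here are two simple ways to fix this: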

my_data %>% 
    filter(group %in% c("Group 1", "Group 2")) %>% 
    group_by(group) %>% 
    summarize(the_medians = median(as.numeric(value), na.rm = TRUE)) 

or

my_data %>% 
    filter(group %in% c("Group 1", "Group 2")) %>% 
    mutate(value = as.numeric(value)) %>% 
    group_by(group) %>% 
    summarize(the_medians = median(value, na.rm = TRUE)) 
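As a cross-check that does not depend on dplyr, the same conversion can be sketched in base R with tapply. The vectors below repeat the values from the question's data; the names `group`, `value` and `base_medians` are only illustrative.

```r
# Same values as the question's data, rebuilt compactly (still stored as character)
group <- rep(c("Group 1", "Group 2"), each = 15)
value <- c("5", "3", "6", "8", "10", "13", "1", "4", "18", "4", "7", "9",
           "14", "15", "17", "7", "3", "9", "10", "33", "15", "18", "6",
           "20", "30", rep(NA, 5))

# Convert to numeric first, then take the median within each group,
# dropping the missing values
base_medians <- tapply(as.numeric(value), group, median, na.rm = TRUE)
base_medians
# Group 1 -> 8, Group 2 -> 12.5
```

Either dplyr pipeline above should agree with these two numbers once value has been converted.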

To check the types of the columns in your own data, you can conveniently use

str(my_data) 
#Classes ‘tbl_df’ and 'data.frame': 30 obs. of 2 variables: 
# $ group: chr "Group 1" "Group 1" "Group 1" "Group 1" ... 
# $ value: chr "5" "3" "6" "8" ... 
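If only the column types are of interest, they can also be queried directly. This is a minimal sketch on a made-up two-row data frame (`df` is hypothetical, not the question's data):

```r
# Hypothetical data frame with numbers stored as character strings
df <- data.frame(group = c("Group 1", "Group 2"),
                 value = c("5", "7"),
                 stringsAsFactors = FALSE)

sapply(df, class)        # class of every column at once
is.character(df$value)   # TRUE before conversion

# Convert in place, then confirm the new type
df$value <- as.numeric(df$value)
is.numeric(df$value)     # TRUE after conversion
```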

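Thanks, this is perfect and much simpler than I expected. I completely missed that the error message was pointing at the numbers being stored as character type. – Ben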
