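The "n" column is missing because those values are stored in numSummaryObj$n, while the other summary values are stored in numSummaryObj$table. Putting them back together takes a simple cbind or data.frame call: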
file <- "https://vincentarelbundock.github.io/Rdatasets/csv/datasets/ToothGrowth.csv"
toothGrowth <- read.table(file, header=TRUE, sep=",", row.names=1, na.strings="NA", dec=".", strip.white=TRUE)
numSumTooth <- RcmdrMisc::numSummary(toothGrowth[, c("len", "dose")])
nST <- data.frame(numSumTooth$table, numSumTooth$n)
names(nST) <- c(colnames(numSumTooth$table), "n")
write.csv(nST, "numSumTooth.csv")
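The cbind route mentioned above can be sketched as follows. To keep the snippet runnable without RcmdrMisc, it uses a hand-built stand-in list (`mockSummary`, a hypothetical name) with a `table` matrix and an `n` vector; the numbers are copied from the output shown in this answer, and the assumption is that a real numSummary result has the same two components.

```r
# Stand-in for a numSummary result (assumption: same $table / $n layout)
tbl <- matrix(c(18.813333, 1.166667, 7.6493152, 0.6288722),
              nrow = 2,
              dimnames = list(c("len", "dose"), c("mean", "sd")))
mockSummary <- list(table = tbl, n = c(len = 60, dose = 60))

# cbind() keeps the matrix column names unchanged and appends n as a new column
combined <- cbind(mockSummary$table, n = mockSummary$n)
print(combined)
```

Because cbind returns a matrix here, column names such as "0%" would not be altered, so no separate names() fix is needed before write.csv.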
==
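Edit:
I would personally invest some time in learning data processing with packages like dplyr and tidyr, because they give you a lot of mileage and flexibility down the road. For example, to produce a data.frame equivalent to the numSummary output, you can run the following: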
library(dplyr)
library(tidyr)

toothGrowth %>%
  select(-supp) %>%
  gather(var, val) %>%  # convert the wide data frame into long form, with var = dose and len
  group_by(var) %>%
  summarise(mean = mean(val), sd = sd(val),
            IQR = IQR(val),
            `0%` = min(val),
            `25%` = quantile(val, 0.25),
            `50%` = median(val),
            `75%` = quantile(val, .75),
            `100%` = max(val),
            n = n())
# A tibble: 2 × 10
var mean sd IQR `0%` `25%` `50%` `75%` `100%` n
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 dose 1.166667 0.6288722 1.5 0.5 0.500 1.00 2.000 2.0 60
2 len 18.813333 7.6493152 12.2 4.2 13.075 19.25 25.275 33.9 60
The flexibility of this approach is that you can also compute the summaries for each group (such as supp in this case):
toothGrowth %>%
  # select(-supp) %>%
  gather(var, val, -supp) %>%
  group_by(supp, var) %>%
  summarise(mean = mean(val), sd = sd(val),
            IQR = IQR(val),
            `0%` = min(val),
            `25%` = quantile(val, 0.25),
            `50%` = median(val),
            `75%` = quantile(val, .75),
            `100%` = max(val),
            n = n())
Source: local data frame [4 x 11]
Groups: supp [?]
supp var mean sd IQR `0%` `25%` `50%` `75%` `100%` n
<fctr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 OJ dose 1.166667 0.6342703 1.5 0.5 0.500 1.0 2.000 2.0 30
2 OJ len 20.663333 6.6055610 10.2 8.2 15.525 22.7 25.725 30.9 30
3 VC dose 1.166667 0.6342703 1.5 0.5 0.500 1.0 2.000 2.0 30
4 VC len 16.963333 8.2660287 11.9 4.2 11.200 16.5 23.100 33.9 30
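As a side note, gather has since been superseded in tidyr by pivot_longer. A minimal sketch of the same grouped summary with the newer function might look like this; it assumes reasonably recent dplyr and tidyr versions, uses the built-in ToothGrowth data rather than the CSV, and trims the statistics to three for brevity. The name `bySupp` is only illustrative.

```r
library(dplyr)
library(tidyr)

bySupp <- ToothGrowth %>%
  # long form: one row per (supp, variable, value)
  pivot_longer(-supp, names_to = "var", values_to = "val") %>%
  group_by(supp, var) %>%
  summarise(mean = mean(val), sd = sd(val), n = n(), .groups = "drop")

print(bySupp)
```

The remaining quantile columns can be added to summarise in the same way as in the gather version.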
==
Another option (if you find repeatedly writing out the long summarise syntax a chore) is to create a function, such as:
checkVar <- function(varname, data){
  # pull the column out as a plain vector
  val <- data[[varname]]
  tmp <- data.frame(mean = mean(val),
                    sd = sd(val),
                    IQR = IQR(val),
                    `0%` = min(val),
                    `25%` = quantile(val, 0.25),
                    `50%` = median(val),
                    `75%` = quantile(val, .75),
                    `100%` = max(val),
                    n = length(val))
  # data.frame() mangles names such as `0%`, so set the column names explicitly
  names(tmp) <- c("mean", "sd", "IQR", "`0%`", "`25%`", "`50%`", "`75%`", "`100%`", "n")
  rownames(tmp) <- varname
  return(tmp)
}
Running the custom function gives you the summary statistics:
checkVar("dose", ToothGrowth)
mean sd IQR `0%` `25%` `50%` `75%` `100%` n
dose 1.166667 0.6288722 1.5 0.5 0.5 1 2 2 60
Combining them into a single data.frame involves one of the apply functions, for example lapply:
do.call(rbind, lapply(c("dose", "len"), checkVar, data=ToothGrowth))
mean sd IQR `0%` `25%` `50%` `75%` `100%` n
dose 1.166667 0.6288722 1.5 0.5 0.500 1.00 2.000 2.0 60
len 18.813333 7.6493152 12.2 4.2 13.075 19.25 25.275 33.9 60
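Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for your question. I had no problem exporting the 'numSummary' output in my tests. –
Thanks for your reply Adam, I have added a script to reproduce the problem I am having. – Scottmeup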