2017-07-05

I am trying to find the mean batting average for each team. I have a matrix similar to the following:

      bat_avg team_name
 [1,] "0.5"   "Rockies"
 [2,] "0"     "Astros"
 [3,] "0.5"   "Rockies"
 [4,] "0"     "Padres"
 [5,] "0"     "Padres"
 [6,] "0"     "Rockies"
 [7,] "0"     "Mets"
 [8,] "0.4"   "Red Sox"
 [9,] "0"     "Yankees"
[10,] "0"     "Rockies"

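To find the mean batting average for each team, I tried converting the matrix into a data frame and aggregating the data by team name. I keep getting an error saying that my data type is atomic, and I am not sure how to fix it. I am brand new to R and to coding, so thanks for your help!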

bat_avg <- Batting_average[,26] 
team_name <- Batting_average[,100] 
Batting_average <- cbind(bat_avg, team_name) 
df.Batting_average <- as.data.frame(Batting_average) 

aggdata <- aggregate(Batting_average$team_name, by list(Batting_average$bat_avg], 
FUN = mean) 

Here is the top of my data:

structure(c("0.5", "0", "0.5", "0", "0", "0", "Rockies", "Rockies", 
"Rockies", "Rockies", "Rockies", "Rockies"), .Dim = c(6L, 2L), .Dimnames = list(
NULL, c("bat_avg", "team_name"))) 

I tried to dput() my data, but the length of the data prevents me from posting it.

Create a minimal example, then post, for example, the output of dput(head(data)).

I think you need 'aggregate(Batting_average$team_name, by = list(Batting_average$bat_avg), FUN = mean)'

Answers

Here is some code that gives me reasonable results:

myDf <- data.frame(
    avg = c(0.5, 0, 0.5, 0, 0, 0, 0.4), 
    team = c("Rockies", "Rockies", 
    "Rockies", "Rockies", "Rockies", "Rockies", "Astros") 
) 

myDf 
# avg team 
# 1 0.5 Rockies 
# 2 0.0 Rockies 
# 3 0.5 Rockies 
# 4 0.0 Rockies 
# 5 0.0 Rockies 
# 6 0.0 Rockies 
# 7 0.4 Astros 

aggregate(myDf$avg, by = list(myDf$team), FUN = mean) 
# Group.1   x 
# 1 Astros 0.4000000 
# 2 Rockies 0.1666667 

If you want to run this with your own data, replace myDf with:

myDf <- data.frame(
    avg = as.numeric(Batting_average[,26]), 
    team = as.factor(Batting_average[,100]) 
) 

If your data is already in the correct format, the calls to as.numeric and as.factor may be unnecessary.
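The conversion described above can be sketched end to end on a small character matrix shaped like the one in the question. This is only an illustration: the values are copied from the first rows shown in the question, and the names m, df, dollar_error and team_means are made up for the example. It also shows where the "atomic" error most likely comes from, since `$` cannot be used on a matrix.

```r
# Small character matrix shaped like the one in the question (illustrative values)
m <- matrix(
  c("0.5", "0", "0.5", "0", "0", "0.4",
    "Rockies", "Astros", "Rockies", "Padres", "Padres", "Red Sox"),
  ncol = 2,
  dimnames = list(NULL, c("bat_avg", "team_name"))
)

# `$` on a matrix fails, because a matrix is an atomic vector with dimensions
dollar_error <- tryCatch(m$bat_avg, error = function(e) conditionMessage(e))
dollar_error

# Convert to a data frame, then turn the character averages into numbers
df <- as.data.frame(m, stringsAsFactors = FALSE)
df$bat_avg <- as.numeric(df$bat_avg)
str(df)

# Formula interface: mean of bat_avg within each team_name
team_means <- aggregate(bat_avg ~ team_name, data = df, FUN = mean)
team_means
```

The formula interface is used here only because it keeps the original column names in the result; the aggregate(x, by = list(...), FUN = mean) form from the answer computes the same means.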

How do I apply this to a data frame with 27,000 rows? How does that differ from what you have when you define "avg" and "team"?

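The size of the data set makes no difference to 'aggregate'; just make sure the data is in the correct format and run it through this code. I will edit my answer to make it clearer.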
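As a rough check of the point about size, the same call can be tried on a synthetic data frame with 27,000 rows. Everything in this sketch is made up for illustration (the random averages, the four team names, and the names big, res and check); it only shows that the call itself does not change with the number of rows, as long as the average column is numeric.

```r
set.seed(1)   # reproducible synthetic data
n <- 27000

# Synthetic stand-in for a large data set: one numeric column, one grouping column
big <- data.frame(
  avg  = round(runif(n), 3),
  team = sample(c("Rockies", "Astros", "Padres", "Mets"), n, replace = TRUE)
)

# Format checks before aggregating: numeric averages and at least one row
stopifnot(is.numeric(big$avg), nrow(big) > 0)

# Same kind of call as on the 7-row example, written with the formula interface
res <- aggregate(avg ~ team, data = big, FUN = mean)
res

# Cross-check against tapply, which computes the same per-team means
check <- tapply(big$avg, big$team, mean)
check
```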

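When I put this into my code, I get the following error: "Error in aggregate.data.frame(as.data.frame(x), ...) : no rows to aggregate". Sorry if this is a stupid question. I have only just started working with R.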