2014-01-09

Row sums based on the values of a factor column

I have the following data frame:

Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange") 

Name <- c("julius", "julius", "john", "mary", "kathy", "john", "julius") 

df <- data.frame(Fruit, Name);df 

My goal is to count how many of each fruit every person ate, so that I end up with a table like the one below:

  orange apple pineapple lemon 
julius 2  1  
john    2  
mary      1 
kathy  1       1 

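I have been trying the aggregate function, but I can only get it to output the total number of fruits eaten by each person, as follows: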

df2 <- aggregate(Fruit~Name,df,length); df2 

The output is:

    Name Fruit
1   john     2
2 julius     3
3  kathy     1
4   mary     1

Any help would be greatly appreciated. Thanks

Answers

4

Option 1

library(reshape2)
# with no fun.aggregate given, dcast defaults to length, i.e. it counts rows
dcast(df, Name~Fruit)
    Name apple lemon orange pineapple
1   john     2     0      0         0
2 julius     0     0      3         0
3  kathy     0     1      0         0
4   mary     0     0      0         1

Option 2

table(df)
# as pointed out by lebatsnok, the general command would be with(df, table(Fruit, Name))
           Name
Fruit       john julius kathy mary
  apple        2      0     0    0
  lemon        0      0     1    0
  orange       0      3     0    0
  pineapple    0      0     0    1
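The layout asked for in the question has people as rows and fruits as columns, which is the transpose of the table above. A minimal base-R sketch, assuming the same data as in the question (rebuilt here so the snippet runs on its own); the names `tab` and `wide` are only illustrative:

```r
# Rebuild the question's data so the snippet is self-contained
Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange")
Name  <- c("julius", "julius", "john", "mary", "kathy", "john", "julius")
df <- data.frame(Fruit, Name)

# Naming Name first puts people in rows and fruits in columns
tab <- with(df, table(Name, Fruit))
tab

# Optional: turn the table into an ordinary data frame of counts,
# with one row per person and one column per fruit
wide <- as.data.frame.matrix(tab)
wide
```

Whether the plain table or the data frame is more convenient likely depends on what happens to the counts afterwards.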

Thanks @Codoremifa. You made it so simple. I used the second option, since the first one threw the following error: "package 'dcast' is not available (for R version 3.0.2)" – kigode


'table(df)' works in this case because you don't have any other variables in the data frame. As a general solution, 'with(df, table(Fruit, Name))' is better. – lebatsnok


Thanks @lebatsnok. – TheComeOnMan
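To illustrate lebatsnok's point, here is a small sketch with a hypothetical third column `Day`, which is not part of the original question. With three columns, `table(df3)` cross-tabulates all of them, while naming the two variables explicitly still gives the two-way table:

```r
Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange")
Name  <- c("julius", "julius", "john", "mary", "kathy", "john", "julius")
# Day is a made-up variable, added only for illustration
Day   <- c("mon", "tue", "mon", "mon", "tue", "tue", "mon")
df3 <- data.frame(Fruit, Name, Day)

# table() on the whole data frame uses every column: Fruit x Name x Day
dim(table(df3))                      # 4 4 2

# Selecting the variables by name keeps the result two-dimensional
dim(with(df3, table(Fruit, Name)))   # 4 4
```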

2

It looks like you want a simple two-way frequency table:

table(Fruit, Name)
#            Name
# Fruit       john julius kathy mary
#   apple        2      0     0    0
#   lemon        0      0     1    0
#   orange       0      3     0    0
#   pineapple    0      0     0    1
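
If the per-person totals from the `aggregate` attempt in the question are wanted as well, they can be read off the same table. A short sketch, assuming the `Fruit` and `Name` vectors from the question; `tab` is just an illustrative name:

```r
Fruit <- c("orange", "orange", "apple", "pineapple", "lemon", "apple", "orange")
Name  <- c("julius", "julius", "john", "mary", "kathy", "john", "julius")

tab <- table(Fruit, Name)

# People are the columns here, so column sums give fruits eaten per person
colSums(tab)

# addmargins() appends a "Sum" row and a "Sum" column to the table
addmargins(tab)
```
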
1
library(gmodels)

CrossTable(Fruit, Name)


   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|


Total Observations in Table: 7


             | Name
       Fruit |      john |    julius |     kathy |      mary | Row Total |
-------------|-----------|-----------|-----------|-----------|-----------|
       apple |         2 |         0 |         0 |         0 |         2 |
             |     3.571 |     0.857 |     0.286 |     0.286 |           |
             |     1.000 |     0.000 |     0.000 |     0.000 |     0.286 |
             |     1.000 |     0.000 |     0.000 |     0.000 |           |
             |     0.286 |     0.000 |     0.000 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
       lemon |         0 |         0 |         1 |         0 |         1 |
             |     0.286 |     0.429 |     5.143 |     0.143 |           |
             |     0.000 |     0.000 |     1.000 |     0.000 |     0.143 |
             |     0.000 |     0.000 |     1.000 |     0.000 |           |
             |     0.000 |     0.000 |     0.143 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
      orange |         0 |         3 |         0 |         0 |         3 |
             |     0.857 |     2.286 |     0.429 |     0.429 |           |
             |     0.000 |     1.000 |     0.000 |     0.000 |     0.429 |
             |     0.000 |     1.000 |     0.000 |     0.000 |           |
             |     0.000 |     0.429 |     0.000 |     0.000 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
   pineapple |         0 |         0 |         0 |         1 |         1 |
             |     0.286 |     0.429 |     0.143 |     5.143 |           |
             |     0.000 |     0.000 |     0.000 |     1.000 |     0.143 |
             |     0.000 |     0.000 |     0.000 |     1.000 |           |
             |     0.000 |     0.000 |     0.000 |     0.143 |           |
-------------|-----------|-----------|-----------|-----------|-----------|
Column Total |         2 |         3 |         1 |         1 |         7 |
             |     0.286 |     0.429 |     0.143 |     0.143 |           |
-------------|-----------|-----------|-----------|-----------|-----------|