2014-12-06

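Adding a new column containing row elements in R

I have a Ratings table and a Users table. I want to add a new column named "AvRating" to the Users table. For each row of that column, I want the average of the ratings given by that user. I loop over all the user IDs in the Ratings table and take the mean of all the corresponding ratings. However, the "AvRating" column ends up containing nothing but NA values.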

Ratings = read.table("Ratings.txt", 
       sep="\t", 
       col.names=c("ID", "MId", "Rating"), 
       fill=FALSE, 
       strip.white=TRUE)  


Users = read.table("Users.txt", 
       sep="\t", 
       col.names=c("ID", "Age", "Gender", "Occupation", "ZIP"), 
       fill=FALSE, 
       strip.white=TRUE) 


Users["AvRating"] <- NA 


for(i in 1:943){ # 943 rows in "Ratings" table 

    N = 0 
    x = i 

    # Counting number of ratings by specific User 

    while(Ratings[1, i]==x){ 

     N=N+1 

    } 

    x = i 

    temp = rep(0, N) 

    for(j in 0:N){ 

     temp[j] = Ratings[3, i] 

    } 

    t = mean(temp) 


    Users[6][i] = t 

} 

Users[6]    
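Have you tried 'rowMeans'? – 2014-12-06 17:33:57

rowMeans or colMeans will not work, because I am not averaging a whole column; I am computing the mean of the ratings given by ID 1, ID 2, ID 3, and so on. – 2014-12-06 17:39:22

The base R approach is to use 'ave'. – 2014-12-06 17:41:01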
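
A minimal sketch of the 'ave' suggestion, using made-up toy data in place of Ratings.txt and Users.txt. The values and the helper column name UserMean are assumptions for illustration, not part of the original question:

```r
# Toy stand-ins for the real tables (hypothetical values)
Ratings <- data.frame(ID     = c(1, 1, 2, 3, 3, 3),
                      MId    = c(10, 11, 10, 12, 13, 14),
                      Rating = c(4, 2, 5, 3, 3, 3))
Users <- data.frame(ID = 1:4, Age = c(24, 53, 23, 33))

# ave() returns one value per row of Ratings:
# the mean Rating within that row's ID group
Ratings$UserMean <- ave(Ratings$Rating, Ratings$ID, FUN = mean)

# match() finds the first Ratings row for each user ID;
# a user with no ratings (ID 4 here) gets NA
Users$AvRating <- Ratings$UserMean[match(Users$ID, Ratings$ID)]
Users
```

Because ave() keeps the length of its input, the result lines up with Ratings rather than Users, which is why the extra match() lookup is needed here.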

Answers

With R, you almost never need loops. Using dplyr:

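# Load the data first (Ratings and Users as read in the question), then dplyr
library(dplyr)
# One row per user ID, holding the mean of that user's ratings
user.ave.rating <- Ratings %>%
    group_by(ID) %>%
    summarize(AvRating = mean(Rating, na.rm = TRUE))
# Join this to your user table. Start from Users without the NA placeholder
# column, otherwise the join produces AvRating.x and AvRating.y
Users <- left_join(Users, user.ave.rating, by = "ID")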

It is also easy in base R, though I find the syntax of aggregate harder to understand and remember:

user.ave.rating <- aggregate(Rating ~ ID, FUN = mean, data = Ratings, na.rm = TRUE) 
names(user.ave.rating)[2] <- "AvRating" 
Users <- merge(Users, user.ave.rating, by = "ID") 
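
As a further base R sketch (not from the original answer), tapply can compute the per-user means and match can place them in the Users table. The toy data and the name user_means are assumptions for illustration:

```r
# Toy stand-ins for the real tables (hypothetical values)
Ratings <- data.frame(ID     = c(1, 1, 2, 3, 3, 3),
                      Rating = c(4, 2, 5, 3, 3, 3))
Users <- data.frame(ID = 1:4)

# Per-user means; the result is named by ID ("1", "2", "3")
user_means <- tapply(Ratings$Rating, Ratings$ID, mean, na.rm = TRUE)

# Look up each user's mean by ID; as.vector() drops the names,
# and IDs that never appear in Ratings yield NA
Users$AvRating <- as.vector(user_means[match(Users$ID, names(user_means))])
Users
```

This keeps every row of Users, with NA for users who have no ratings. By default, merge keeps only IDs present in both tables; passing all.x = TRUE to merge gives the same keep-all-users behaviour.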