2016-02-10

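Ordering a list according to a vector

To create some plots, I have already summarised my data using the following approach, which includes all the information I need.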

# Load Data 
RawDataSet <- read.csv("http://pastebin.com/raw/VP6cF31A", sep=";") 
# Load packages 
library(plyr) 
library(dplyr) 
library(tidyr) 
library(ggplot2) 
library(reshape2) 

# summarising the data 
new.df <- RawDataSet %>% 
    group_by(UserEmail,location,context) %>% 
    tally() %>% 
    mutate(n2 = n * c(1,-1)[(location=="NOT_WITHIN")+1L]) %>% 
    group_by(UserEmail,location) %>% 
    mutate(p = c(1,-1)[(location=="NOT_WITHIN")+1L] * n/sum(n)) 
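
The expression c(1,-1)[(location=="NOT_WITHIN")+1L] in the two mutate() calls above is a compact sign flip: the logical comparison becomes index 1 or 2, which selects 1 or -1. A minimal sketch on made-up values (the vectors below and the name sgn are illustrative only, not taken from the real data set):

```r
# Hypothetical toy vectors standing in for the summarised columns
location <- c("WITHIN", "NOT_WITHIN", "WITHIN")
n <- c(2L, 16L, 1L)

# FALSE + 1L gives 1, TRUE + 1L gives 2
idx <- (location == "NOT_WITHIN") + 1L

# Index 1 selects +1, index 2 selects -1
sgn <- c(1, -1)[idx]

# Counts for NOT_WITHIN rows become negative, the rest stay positive
n2 <- n * sgn
print(n2)
```

Under these assumptions, the NOT_WITHIN rows end up negative, which is what lets them be drawn on the opposite side of the axis in a diverging bar chart.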

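Through some other analysis, I have identified different groups of users. Since I want to plot my data, it would be great if the plots showed the data in the right order. The order is based on UserEmail and is defined as follows: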

order <- c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6") 

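When I ask for the type of new.df with typeof(new.df), it says it is a list. I have tried a few approaches, such as order_by or with_order, but so far I have not managed to order new.df according to my order vector. Of course, the ordering could also be done as part of the summarising step. Is there a way to do this?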

Just 'dplyr::arrange'. The 'typeof' of a data.frame is a list (which it technically is); 'class' tells you whether it is actually a 'data.frame'. – alistaire
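
A minimal sketch of what the comment describes, on a made-up data frame (toy and user_order are hypothetical names and values, not the original data). It assumes dplyr is installed and that arrange() is given a match() expression as its sort key:

```r
library(dplyr)

# Hypothetical stand-in for new.df
toy <- data.frame(UserEmail = c(1L, 2L, 3L, 3L), n = c(5L, 7L, 2L, 4L))

# typeof() reports the storage mode, class() reports what the object is
print(typeof(toy))   # "list"
print(class(toy))    # "data.frame"

# Hypothetical desired order of users, stored as strings like in the question
user_order <- c("3", "1", "2")

# Sort rows by each user's position in user_order
sorted <- arrange(toy, match(UserEmail, as.integer(user_order)))
print(sorted$UserEmail)
```

Note that new.df in the question is a grouped data frame. Current dplyr versions ignore the grouping in arrange() unless .by_group = TRUE is passed; older versions may have behaved differently, so it is worth checking the result.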

Answer

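I could not bring myself to create a vector named order, out of respect for the R function of that name. Use match to build an index, and use that index as the basis for order-ing (order as a function, that is):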

sorted.df <- new.df[ order(match(new.df$UserEmail, as.integer(c("28","27","25","23","22","21","20","16","12","10","9","8","5","4","2","1","29","19","17","15","14","13","7","3","30","26","24","18","11","6")))), ] 
head(sorted.df) 
#--------------- 
Source: local data frame [6 x 6] 
Groups: UserEmail, location [4] 

  UserEmail   location   context     n    n2          p
      (int)     (fctr)    (fctr) (int) (dbl)      (dbl)
1        28 NOT_WITHIN Clicked A    16   -16 -0.8421053
2        28 NOT_WITHIN Clicked B     3    -3 -0.1578947
3        28     WITHIN Clicked A     2     2  1.0000000
4        27 NOT_WITHIN Clicked A     4    -4 -0.8000000
5        27 NOT_WITHIN Clicked B     1    -1 -0.2000000
6        27     WITHIN Clicked A     1     1  1.0000000

(I did not load plyr or reshape2, because at least one of those packages has a bad habit of interacting poorly with dplyr functions.)
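
To see why the match/order combination works, here is a small base R sketch on toy data (the values and the names toy, user_order and pos are made up for illustration). match() turns each UserEmail into its position in the desired ordering, and order() then sorts the rows by that position:

```r
# Hypothetical desired order and a toy stand-in for new.df
user_order <- c("3", "1", "2")
toy <- data.frame(UserEmail = c(1L, 1L, 2L, 3L, 3L),
                  n = c(4L, 1L, 6L, 2L, 9L))

# Position of each row's UserEmail within the desired order
pos <- match(toy$UserEmail, as.integer(user_order))
print(pos)

# order() returns the row permutation that sorts pos ascending;
# rows with equal positions keep their original relative order
sorted <- toy[order(pos), ]
print(sorted)
```

Any UserEmail missing from the ordering vector gets NA from match(), and order() puts NA last by default, so such rows would end up at the bottom rather than being dropped.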

Thanks :) Works like a charm. Unfortunately, I have run into another problem that is related to this one, but it is a ggplot question: http://stackoverflow.com/questions/35324848 – schlomm