2017-03-04 40 views

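Recursive merge on hierarchical data in R

I have a data frame that contains a child field and a parent field for accounts from a gnucash MySQL database. I want to store the account hierarchy in the data frame. In the past I used recursive joins in MySQL, but that becomes very tedious as the hierarchy gets deeper, and you also have to know how many levels your tree has. I am hoping there is an easier way to build the hierarchy (with or without knowing the maximum depth).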

Sample data:

account_id <- c(1:11) 
account_name <- c('root_account','dining', 'food', 'discretionary_expense', 
        'expenses', 'base_salary_wife', 'base_salary_husband', 
        'base_salary', 'salary', 'taxable_income', 
        'income') 
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1) 
test.data <- data.frame(account_id, account_name, account_parentid) 

Desired output:

   account_id          account_name account_parentid lvl2_parentid lvl3_parentid lvl4_parentid lvls
1           1          root_account               NA            NA            NA            NA   NA
2           2                dining                3             4             5            NA    4
3           3                  food                4             5            NA            NA    3
4           4 discretionary_expense                5            NA            NA            NA    2
5           5              expenses                1            NA            NA            NA    1
6           6      base_salary_wife                8             9            10            11    5
7           7   base_salary_husband                8             9            10            11    5
8           8           base_salary                9            10            11            NA    4
9           9                salary               10            11            NA            NA    3
10         10        taxable_income               11            NA            NA            NA    2
11         11                income                1            NA            NA            NA    1

Answer

You can use the data.tree package to work with hierarchical data:

Get the test data:

account_id <- c(1:11) 
account_name <- c('root_account','dining', 'food', 'discretionary_expense', 
        'expenses', 'base_salary_wife', 'base_salary_husband', 
        'base_salary', 'salary', 'taxable_income', 
        'income') 
account_parentid <- c(NA,3,4,5,1,8,8,9,10,11,1) 
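# The id and parent-id columns come first: FromDataFrameNetwork reads the edges from the first two columns
test.data <- data.frame(account_id, account_parentid, account_name, stringsAsFactors = FALSE)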

Convert to a data.tree structure:

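library(data.tree)
# Drop the root's own row (its parent is NA); the root node is still created from the parent column
tree1 <- FromDataFrameNetwork(test.data[-1,])
# The root therefore has no attributes from the data frame, so set its name by hand
tree1$account_name <- 'root_account'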

Display it:

ToDataFrameTree(tree1, account = 'name', 'account_name', 'pathString') 

This prints the following:

               levelName account          account_name    pathString
1  1                           1          root_account             1
2   ¦--5                       5              expenses           1/5
3   ¦   °--4                   4 discretionary_expense         1/5/4
4   ¦       °--3               3                  food       1/5/4/3
5   ¦           °--2           2                dining     1/5/4/3/2
6   °--11                     11                income          1/11
7       °--10                 10        taxable_income       1/11/10
8           °--9               9                salary     1/11/10/9
9               °--8           8           base_salary   1/11/10/9/8
10                  ¦--6       6      base_salary_wife 1/11/10/9/8/6
11                  °--7       7   base_salary_husband 1/11/10/9/8/7
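
If you want the wide layout from the question rather than a path string, one option that needs no package is to walk up the parent links one level at a time with match() until nothing is left to look up. This is only a sketch, not part of data.tree: the column names follow the question's lvlN_parentid pattern, it assumes the data really is a tree (no cycles), and unlike the question's table it keeps the root account in each ancestor chain, so you get one more column than shown there.

```r
# Sample data from the question (ids and parent ids only)
account_id <- 1:11
account_parentid <- c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1)
test.data <- data.frame(account_id, account_parentid)

ancestors <- test.data
current <- test.data$account_parentid    # level-1 parent of every account
depth <- as.integer(!is.na(current))     # ancestors found so far
lvl <- 1
repeat {
  # Parent of the current ancestor; NA once the chain has passed the root
  nxt <- test.data$account_parentid[match(current, test.data$account_id)]
  # Stop when nobody has a further ancestor (the second test guards against cycles)
  if (all(is.na(nxt)) || lvl > nrow(test.data)) break
  lvl <- lvl + 1
  ancestors[[paste0("lvl", lvl, "_parentid")]] <- nxt
  depth <- depth + as.integer(!is.na(nxt))
  current <- nxt
}
ancestors$lvls <- depth    # the root gets 0 here, not NA
ancestors
```

The loop runs once per level of the tree, so you never have to know the maximum depth up front.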

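It is not part of the question, but things get really interesting when you want to summarize over the hierarchy, for example. See the data.tree vignettes here and here.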
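
To give an idea of what summarizing over the hierarchy means, here is a rough base R sketch that rolls balances up from the leaf accounts to their ancestors. It does not use data.tree's own aggregation helpers (the vignettes cover those); the balance values are made up for illustration, and it assumes the parent links form a tree.

```r
account_id <- 1:11
account_parentid <- c(NA, 3, 4, 5, 1, 8, 8, 9, 10, 11, 1)
# Hypothetical balances, booked on the leaf accounts only
balance <- c(0, 120, 0, 0, 0, 3000, 3500, 0, 0, 0, 0)

# Depth of each account: count the parent lookups that still find an id
depth <- integer(length(account_id))
current <- account_parentid
while (!all(is.na(current))) {
  depth <- depth + as.integer(!is.na(current))
  current <- account_parentid[match(current, account_id)]
}

# Add each account's running total to its parent, deepest accounts first,
# so every account is complete before it is passed further up
total <- balance
for (i in order(depth, decreasing = TRUE)) {
  p <- match(account_parentid[i], account_id)
  if (!is.na(p)) total[p] <- total[p] + total[i]
}
data.frame(account_id, depth, balance, total)
```

With these made-up numbers, income and everything below it down to base_salary carry 6500, the expenses branch carries 120, and the root ends up with 6620.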