2017-06-18

Title: Convert an R data.frame column to arules transactions

I need to convert the following data.frame column (t$Tags) to arules transactions:

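  1. scala
  2. ios, button, swift3, compiler-errors, null
  3. c#, pass-by-reference, unsafe-pointers
  4. spring, maven, spring-mvc, spring-security, spring-java-config
  5. android, android-fragments, android-fragmentmanager
  6. scala, scala-collections
  7. python-2.7, python-3.x, matplotlib, plot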

Since the data is already in basket format, I followed example 3 in the arules documentation (https://cran.r-project.org/web/packages/arules/arules.pdf, page 90) and converted the column as follows:

###################################################################################################### 
#Option 1 - converting data.frame as described in the documentation (page 90) 
###################################################################################################### 
library(arules)
## example 3: creating transactions from data.frame
a_df <- data.frame(
    Tags = as.factor(c("scala", 
         "ios, button, swift3, compiler-errors, null", 
         "c#, pass-by-reference, unsafe-pointers", 
         "spring, maven, spring-mvc, spring-security, spring-java-config", 
         "android, android-fragments, android-fragmentmanager", 
         "scala, scala-collections", 
         "python-2.7, python-3.x, matplotlib, plot")) 
) 
## coerce 
trans3 <- as(a_df, "transactions") 
rules <- apriori(trans3, parameter = list(sup = 0.1, conf = 0.5, target="rules",minlen=1)) 
rules_output <- as(rules,"data.frame") 
## Result: 0 rules 
###################################################################################################### 
# Option 2 - reading from a CSV file, which contains exactly the same data 
# above without the header and the quotes 
###################################################################################################### 
file = "Test.csv" 
trans3 = read.transactions(file = file, sep = ",", format = c("basket")) 
rules <- apriori(trans3, parameter = list(sup = 0.1, conf = 0.5, target="rules",minlen=1)) 
rules_output <- as(rules,"data.frame") 
## Result: 198 rules 

Option 1 - result = 0 rules

Option 2 - result = 198 rules


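Question:

In my current task and environment I cannot afford to save the data.frame column to a file (CSV or any other format) and then read it back with read.transactions (that is, turn Option 1 into Option 2). How can I convert the data.frame column to the correct format so that the arules apriori algorithm works properly?

Answer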

Have a look at the examples in ?transactions. You need a list containing vectors of items (item labels), not a data.frame:
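To illustrate the difference, here is a minimal base-R sketch using the tag strings from the question. The variable names tags and items are chosen here for illustration only. As a factor column, each row is one opaque level, so (as far as I understand the data.frame coercion) every transaction ends up holding a single item that occurs nowhere else; splitting on the separator recovers the individual tags.

```r
# Same tag strings as in the question (one comma-separated string per row).
tags <- c("scala",
          "ios, button, swift3, compiler-errors, null",
          "c#, pass-by-reference, unsafe-pointers",
          "spring, maven, spring-mvc, spring-security, spring-java-config",
          "android, android-fragments, android-fragmentmanager",
          "scala, scala-collections",
          "python-2.7, python-3.x, matplotlib, plot")

# As a factor, every row is a single opaque level: 7 rows, 7 distinct values.
nlevels(as.factor(tags))        # 7

# Split on ", " to get one character vector of item labels per row.
items <- strsplit(tags, ", ", fixed = TRUE)

# Number of distinct item labels across all rows.
length(unique(unlist(items)))   # 22
```

The 22 distinct labels agree with the "22 item(s), 7 transaction(s)" line that apriori prints for the list-based transactions.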

items <- strsplit(as.character(a_df$Tags), ", ") 
trans3 <- as(items, "transactions") 

rules <- apriori(trans3, parameter = list(sup = 0.1, conf = 0.5, target="rules",minlen=1)) 
Apriori 

Parameter specification: 
confidence minval smax arem aval originalSupport maxtime support minlen maxlen 
     0.5 0.1 1 none FALSE   TRUE  5  0.1  1  10 
target ext 
    rules FALSE 

Algorithmic control: 
filter tree heap memopt load sort verbose 
    0.1 TRUE TRUE FALSE TRUE 2 TRUE 

Absolute minimum support count: 0 

set item appearances ...[0 item(s)] done [0.00s]. 
set transactions ...[22 item(s), 7 transaction(s)] done [0.00s]. 
sorting and recoding items ... [22 item(s)] done [0.00s]. 
creating transaction tree ... done [0.00s]. 
checking subsets of size 1 2 3 4 5 done [0.00s]. 
writing ... [198 rule(s)] done [0.00s]. 
creating S4 object ... done [0.00s]. 
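
If the real column is less tidy than the sample (inconsistent spacing around commas, or the same tag repeated within a row), a more defensive split may help. The sketch below is an assumption about what such data could look like; split_tags is a hypothetical helper written for this example, not a function from arules. The de-duplication step is there because, to my knowledge, arules rejects a list in which a transaction contains the same item twice.

```r
# Hypothetical helper (not part of arules): turn comma-separated tag strings
# into a list of clean, de-duplicated item vectors.
split_tags <- function(x) {
  parts <- strsplit(as.character(x), ",", fixed = TRUE)
  lapply(parts, function(p) {
    p <- trimws(p)          # drop spaces around each tag
    unique(p[nzchar(p)])    # drop empty strings and repeated tags
  })
}

# Made-up input with uneven spacing and a repeated tag.
messy <- c("scala,  scala-collections", "python-2.7,python-3.x, plot, plot")
items <- split_tags(messy)
str(items)

# The resulting list can then be coerced as in the answer:
# trans3 <- as(items, "transactions")
```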

Thanks a lot, Michael, this is exactly what I needed. – UncleDo