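Best way to implement Apriori in Python pandas

What is the best way to implement the Apriori algorithm in pandas? So far I am stuck on the transformation step, where I use for loops to extract the patterns. Everything from the for loop onward does not work. Is there a vectorized way to do this in pandas?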

import pandas as pd 
import numpy as np 

trans=pd.read_table('output.txt', header=None,index_col=0) 

def apriori(trans, support=4): 
    ts=pd.get_dummies(trans.unstack().dropna()).groupby(level=1).sum() 
    #user input 

    collen, rowlen =ts.shape 

    #max length of items 
    tssum=ts.sum(axis=1) 
    maxlen=tssum.loc[tssum.idxmax()] 

    items=list(ts.columns) 

    results=[] 
    #loop through items 
    for c in range(1, maxlen): 
     #generate patterns 
     pattern=[] 
     for n in len(pattern): 
      #calculate support 
      pattern=['supp']=pattern.sum/rowlen 
      #filter by support level 
      Condit=pattern['supp']> support 
      pattern=pattern[Condit] 
      results.append(pattern) 
    return results 

results =apriori(trans) 
print results

When I feed in this data with a support of 3

        a  b  c  d  e
0
11      1  1  1  0  0
666     1  0  0  1  1
10101   0  1  1  1  0
1010    1  1  1  1  0
414147  0  1  1  0  0
10101   1  1  0  1  0
1242    0  0  0  1  1
101     1  1  1  1  0
411     0  0  1  1  1
444     1  1  1  0  0

it should output something like

Pattern  support
a        6
b        7
c        7
d        7
e        3
a,b      5
a,c      4
a,d      4

Source

2013-12-13 user3084006

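Your return is in the wrong place, and for n in len(pattern) is wrong too... –

@AndyHayden The first one is from a paste error. When I did it by hand, the pattern length did not work because I have not figured out how to generate the pattern combinations, such as a,b; a,c; or a,b,c – user3084006

How do you define support? I have a guess, but it doesn't match your a,d value (I think it would be 4, but you say it's 3.) – DSM

Assuming I understand what you're after, maybe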
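One reading of "support" that matches the expected output in the question is the number of transactions (rows) that contain every item in the pattern. The snippet below is a minimal sketch under that assumption; the helper name `support` is hypothetical, and the transaction IDs are left out because only the item columns matter.

```python
import pandas as pd

# Sample transactions from the question: one row per transaction, 1 = item present.
df = pd.DataFrame(
    [[1, 1, 1, 0, 0],
     [1, 0, 0, 1, 1],
     [0, 1, 1, 1, 0],
     [1, 1, 1, 1, 0],
     [0, 1, 1, 0, 0],
     [1, 1, 0, 1, 0],
     [0, 0, 0, 1, 1],
     [1, 1, 1, 1, 0],
     [0, 0, 1, 1, 1],
     [1, 1, 1, 0, 0]],
    columns=list("abcde"),
)

def support(df, items):
    """Count the rows in which every item in `items` is present (assumed definition)."""
    return int(df[list(items)].all(axis=1).sum())

print(support(df, ["a"]))       # 6
print(support(df, ["a", "d"]))  # 4
```

Under this definition the pair a,d has support 4 on the sample data.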

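from itertools import combinations
import pandas as pd

def get_support(df):
    pp = []
    # Try every combination size, from single items up to all columns
    for cnum in range(1, len(df.columns) + 1):
        for cols in combinations(df, cnum):
            # Support = number of rows in which every item in the combination is present
            s = df[list(cols)].all(axis=1).sum()
            pp.append([",".join(cols), s])
    sdf = pd.DataFrame(pp, columns=["Pattern", "Support"])
    return sdf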

will get you started:

>>> s = get_support(df)
>>> s[s.Support >= 3]
   Pattern  Support
0        a        6
1        b        7
2        c        7
3        d        7
4        e        3
5      a,b        5
6      a,c        4
7      a,d        4
9      b,c        6
10     b,d        4
12     c,d        4
14     d,e        3
15   a,b,c        4
16   a,b,d        3
21   b,c,d        3

[15 rows x 2 columns]
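The function above counts every column combination, so the work grows as 2^n in the number of items. A level-wise variant that uses the Apriori property (an itemset can only be frequent if all of its subsets are frequent) might look like the sketch below. The name `apriori_support` and the `min_support` parameter are hypothetical, and it assumes a 0/1 DataFrame with string column names such as the sample data.

```python
from itertools import combinations
import pandas as pd

# Sample transactions from the question: one row per transaction, 1 = item present.
df = pd.DataFrame(
    [[1, 1, 1, 0, 0], [1, 0, 0, 1, 1], [0, 1, 1, 1, 0], [1, 1, 1, 1, 0],
     [0, 1, 1, 0, 0], [1, 1, 0, 1, 0], [0, 0, 0, 1, 1], [1, 1, 1, 1, 0],
     [0, 0, 1, 1, 1], [1, 1, 1, 0, 0]],
    columns=list("abcde"),
)

def apriori_support(df, min_support=3):
    """Level-wise search: only count itemsets whose (k-1)-subsets are all frequent."""
    data = df.astype(bool)
    # Level 1: frequent single items.
    frequent = {(c,): int(n) for c, n in data.sum().items() if n >= min_support}
    results = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Only items that survive in some frequent itemset can appear at level k.
        items = sorted({i for itemset in prev for i in itemset})
        frequent = {}
        for cand in combinations(items, k):
            # Apriori pruning: skip candidates with an infrequent (k-1)-subset.
            if all(sub in prev for sub in combinations(cand, k - 1)):
                n = int(data[list(cand)].all(axis=1).sum())
                if n >= min_support:
                    frequent[cand] = n
        results.update(frequent)
        k += 1
    rows = [(",".join(p), s) for p, s in results.items()]
    return pd.DataFrame(rows, columns=["Pattern", "Support"])

print(apriori_support(df, min_support=3))
```

On the sample data this returns the same 15 patterns as filtering the full enumeration with Support >= 3, but it never counts candidates such as a,b,c,d, because the subset a,c,d is already infrequent.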

Source

2013-12-13 04:21:43 DSM

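Yes, that's it. But is there a way to do it with pandas only? – user3084006

@user3084006: I'm not sure, but unfortunately I don't have time to spend on this problem right now. Hope someone else can help you! – DSM

Thanks, you solved the basic problem. I should post another question. – user3084006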
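On the question of avoiding explicit loops: for single items and pairs, one possible vectorized approach is a co-occurrence matrix, computed as the product of the transposed 0/1 table with itself. This is only a sketch, not a full Apriori implementation. It uses NumPy to build a mask, and the variable names `cooc` and `pairs` are illustrative. Itemsets of three or more items would still need something like the combination-based code above.

```python
import numpy as np
import pandas as pd

# Sample transactions from the question: one row per transaction, 1 = item present.
df = pd.DataFrame(
    [[1, 1, 1, 0, 0], [1, 0, 0, 1, 1], [0, 1, 1, 1, 0], [1, 1, 1, 1, 0],
     [0, 1, 1, 0, 0], [1, 1, 0, 1, 0], [0, 0, 0, 1, 1], [1, 1, 1, 1, 0],
     [0, 0, 1, 1, 1], [1, 1, 1, 0, 0]],
    columns=list("abcde"),
)

# Entry (i, j) counts the rows containing both item i and item j;
# the diagonal holds the single-item supports.
cooc = df.T.dot(df)

# Keep the strict upper triangle so each pair appears once, then filter by support.
upper = np.triu(np.ones(cooc.shape, dtype=bool), k=1)
pairs = cooc.where(upper).stack()
pairs = pairs[pairs >= 3].astype(int)

print(cooc)
print(pairs)
```

The diagonal of `cooc` gives the supports of a through e, and `pairs` holds the seven pairs with support of at least 3, indexed by (item, item).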
