在python中使用樹來獲取值

所以我正在嘗試使用Python創建一個樹，以便能夠嘗試讀取一個文本文件，該文件在文件中有重複的數量，並嘗試從這些值中創建一個樹並返回具有前3個值的句子（下面更詳細地解釋）。在python中使用樹來獲取值

首先，我在wikipedia上搜索了一棵樹是如何創建的，並且還看到了像以前的例子：This one。和This one。不過，我只能根據代碼執行此操作：

import fileinput 

setPhrasesTree = 0 


class Branch(): 
    def __init__(self, value): 
     self.left = None 
     self.right = None 
     self.value = value 

class Tree(): 
    def __init__(self): 
     self.root = None 
     self.found = False 

    #lessThan function needed to compare strings 
    def lessThan(self, a, b):  
     if len(a) < len(b): 
      loopCount = len(a) 
     else: 
      loopCount = len(b)   
     for pos in range(0, loopCount): 
      if a[pos] > b[pos]: 
       return False   
     return True 

    def insert(self, value): 
     self.root = self.insertAtBranch(self.root, value) 

    def exists(self, value): 
     #set the class variable found to False to assume it is not there  
     self.found = False 
     self.findAtBranch(self.root, value) 
     return self.found 

    #Used to fine a value in a tree 
    def findAtBranch(self, branch, value):   
     if branch == None: 
      pass 
     else: 
      if branch.value == value: 
       self.found = True     
      else: 
       self.findAtBranch(branch.left, value) 
       self.findAtBranch(branch.right, value)   

    def insertAtBranch(self, branch, value): 
     if branch == None: 
      return Branch(value) 
     else: 
      if self.lessThan(branch.value, value): 
       branch.right = self.insertAtBranch(branch.right, value)    
      else: 
       branch.left = self.insertAtBranch(branch.left, value) 
      return branch 

def loadTree(filename, treeType): 

    if treeType == setPhrasesTree: 
     for sentence in fileinput.input("setPhrases.txt"): 
      print(sentence) 
      setPhrases.insert(sentence[:-1]) 


def findSentenceType(sentence): 

    if sentence.exists(sentence): 
     return setPhrasesTree

以下是文本文件的樣子。記住裸，這是有意佈置這樣，而不是用量值旁邊（文件名= setPhrases.txt）：

Hi my name is Dave. 
Thank-You. 
What is your name? 
I have done all my homework. 
What time is dinner? 
What is your name? 
Thank-You. 
Hi my name is Dave. 
What is your name? 
I have done all my homework. 
What is your name? 
Can you bring me a drink Please? 
Can you bring me a drink Please? 
What is your name? 
Hi my name is Dave. 
What is your name? 
Can you bring me a drink Please?

這裏就是我試圖讓我的代碼來執行。我需要它來認識到文件中的第一句話是起始節點。然後它需要統計所有其他相同的句子，併爲該句子添加一個值，並使用該樹來完成此操作。（我原來做過這種以另一種方式，但是我需要用一棵樹能夠吻合起來，做所有其他的東西），這是我的意思是： enter image description here

然後我希望能夠返回具有最高頻率的前3個Phrases。所以在這種情況下，系統將返回句子（按此順序）：

What is your name? 
Hi my name is Dave. 
Can you bring me a drink please?

任何幫助，非常感謝。也感謝你的時間。

來源

2014-02-05 PythonNovice

我理解正確嗎，你只是想計算每行在文件中出現的頻率？你幾乎不需要一棵樹。 – pentadecagon

@五角大樓正如我前面提到的，我已經能夠做到這一點。然而，我需要使用樹來做到這一點，我不知道下一步該做什麼。 – PythonNovice

需要樹嗎？所以這是一個練習？可以肯定的是，因爲您知道，通過使用字典而不是樹，可以在大約20行代碼中更高效地解決此問題。如果你真的想要一棵樹，那麼爲了讓樹有用，它可能應該是某種[自平衡樹]（http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree），最流行的這裏是[紅黑樹]（http://en.wikipedia.org/wiki/Red-black_tree）。這是您自己實施的很多工作。 – pentadecagon

在這裏，一個使用字典的實現。這是你想要的嗎？

import collections 
def count_lines(): 
    d = collections.defaultdict(int) 
    for line in open("phrases.txt"): 
     d[ line.strip() ] += 1 

    # we use the negative count as sort key, so the biggest ends up first 
    a = sorted(d.items(), key=lambda x : -x[1]) 
    for n, u in enumerate(a[:3]): 
     print(u[0], "# count=", u[1]) 

count_lines()

來源

2014-02-07 14:32:54 pentadecagon

在python中使用樹來獲取值

回答

相關問題