如何使最長節點的遞歸搜索更高效？

我正在試圖找到定向非循環圖中最長的路徑。目前，我的代碼似乎運行時間複雜度爲O（n ）。如何使最長節點的遞歸搜索更高效？

圖爲輸入{0: [1,2], 1: [2,3], 3: [4,5] }

#Input: dictionary: graph, int: start, list: path 
#Output: List: the longest path in the graph (Recurrance) 
# This is a modification of a depth first search 
def find_longest_path(graph, start, path=[]): 
    path = path + [start] 
    paths = path 
    for node in graph[start]: 
     if node not in path: 
      newpaths = find_longest_path(graph, node, path) 
      #Only take the new path if its length is greater than the current path 
      if(len(newpaths) > len(paths)): 
       paths = newpaths 

    return paths

它在形式返回節點的列表的例如[0,1,3,5]

我怎樣才能讓這個比爲O（n ）更有效率？遞歸是解決這個問題的正確方法，還是應該使用不同的循環？

來源

2015-10-18 AucklandInvader

你可以運行這個通過探查器來看看性能的痛點是什麼？ – Makoto

注意：[使用空列表作爲默認參數並不能滿足許多新的Python程序員最初的期望]（http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-默認參數） – chucksmash

首先，修改默認參數爲chucksmash建議。其次，添加代碼來跟蹤每個訪問節點的最長已知路徑，而不僅僅是當前路徑。即使當前路徑（可能）被認爲較差，您當前的算法也會查找圖中的每條路徑。 – Prune

您可以在O（n + e）（即節點數+邊的線性）中解決此問題。

這個想法是，你首先創建一個拓撲排序（我是Tarjan's algorithm的粉絲）和一組反向邊緣。如果您可以分解問題以利用現有解決方案，它總是有幫助的。

然後，您將向後推送到每個父節點的拓撲排序其子節點距離+ 1（如果存在多個路徑，則保持最大值）。跟蹤迄今爲止所見最大距離的節點。

當你用完全的距離註釋完所有的節點後，你可以從距離最遠的節點開始，這個距離將是你最長的路徑根，然後選擇剛好小於一個數的子節點當前節點（因爲它們位於關鍵路徑上）。

一般來說，當試圖找到最佳複雜度算法時，不要害怕依次運行多個階段。五爲O（n）算法運行順序仍然是爲O（n），仍然是優於爲O（n ）從複雜的角度（雖然它可能取決於不變成本會更糟實際運行時間/因素和尺寸n）。

ETA：我剛剛注意到你有一個開始節點。這使得它只是一個深度優先搜索，並保持迄今爲止所見最長的解決方案，無論如何這只是O（n + e）。遞歸是好的，或者你可以保留訪問節點的列表/堆棧（每當你回溯時找到下一個孩子時你必須小心）。

當您從深度初始搜索回溯時，您需要存儲從該節點到葉的最長路徑，以便不重新處理任何子樹。這也將用作visited標誌（即，除了在遞歸之前做node not in path測試還具有node not in subpath_cache測試）。除了存儲子路徑之外，您可以存儲長度，然後基於上面討論的順序值（關鍵路徑）完成一次完成的路徑。

ETA2：下面是一個解決方案。

def find_longest_path_rec(graph, parent, cache): 
    maxlen = 0 
    for node in graph[parent]: 
     if node in cache: 
      pass 
     elif node not in graph: 
      cache[node] = 1 
     else: 
      cache[node] = find_longest_path_rec(graph, node, cache) 

     maxlen = max(maxlen, cache[node]) 

    return maxlen + 1 

def find_longest_path(graph, start): 
    cache = {} 
    maxlen = find_longest_path_rec(graph, start, cache) 
    path = [start] 
    for i in range(maxlen-1, 0, -1): 
     for node in graph[path[-1]]: 
      if cache[node] == i: 
       path.append(node) 
       break 
     else: 
      assert(0) 
    return path

請注意，我已經刪除了node not in path測試，因爲我假設你如權利實際上是提供一個DAG。如果你想要檢查你應該提出一個錯誤，而不是忽略它。另請注意，我已將斷言添加到for的else子句中，以記錄我們必須始終在路徑中找到有效的下一個（順序）節點。

ETA3：最後for循環有點混亂。我們正在考慮的是，在關鍵路徑中，所有節點距離必須是連續的。考慮節點0是距離4，節點1是距離3，節點2是距離1.如果我們的路徑開始於[0, 2, ...]，我們有一個矛盾，因爲節點0不是離一個葉子比2更遠。

來源

2015-10-19 00:36:11 Parakleta

感謝您 - 我真的很感激您花時間幫助我。所以只需確認我應該運行一個完整的DFS並從起始節點中找到最長的長度，然後遍歷其餘部分以確認它們都較小？這很明顯，比我的實現快得多，但爲什麼我需要以這種方式回溯 – AucklandInvader

@AucklandInvader不，你沒有找到最長的時間，然後檢查其他人更短。你正在做一個DFS在圖中生成註釋，然後提取最長的路徑。我總是發現圖算法最容易這樣做，首先是註釋掃描，然後是結果提取。 – Parakleta

有幾個非算法改進，我建議（這些都與Python代碼質量）：

def find_longest_path_from(graph, start, path=None): 
    """ 
    Returns the longest path in the graph from a given start node 
    """ 
    if path is None: 
     path = [] 
    path = path + [start] 
    max_path = path 
    nodes = graph.get(start, []) 
    for node in nodes: 
     if node not in path: 
      candidate_path = find_longest_path_from(graph, node, path) 
      if len(candidate_path) > len(max_path): 
       max_path = candidate_path 
    return max_path 

def find_longest_path(graph): 
    """ 
    Returns the longest path in a graph 
    """ 
    max_path = [] 
    for node in graph: 
     candidate_path = find_longest_path_from(graph, node) 
     if len(candidate_path) > len(max_path): 
      max_path = candidate_path 
    return max_path

變化解釋說：

def find_longest_path_from(graph, start, path=None): 
    if path is None: 
     path = []

我已將find_longest_path更名爲find_longest_path_from以更好地解釋它的功能。
更改path參數的默認參數值爲None而不是[]。除非你知道你會從中受益，否則你想避免在Python中使用可變對象作爲默認參數。這意味着您通常應該默認將path設置爲None，然後在調用該函數時，檢查是否path is None並相應地創建一個空列表。

max_path = path 
... 
candidate_path = find_longest_path_from(graph, node, path) 
...

我你的變量的名稱更新從paths到max_path和newpaths到candidate_path。這些令人困惑的變量名稱是因爲它們提到了複數路徑 - 暗示它們存儲的值包含多條路徑 - 實際上它們都只是一條路徑。我試圖給他們更多描述性的名字。您例如輸入

nodes = graph.get(start, []) 
for node in nodes:

代碼中的錯誤，因爲圖形的葉節點不在dict所以graph[start]鍵將提高KeyError時start是2，例如。這通過返回一個空列表來處理start不是中的鍵的情況。

def find_longest_path(graph): 
    """ 
    Returns the longest path in a graph 
    """ 
    max_path = [] 
    for node in graph: 
     candidate_path = find_longest_path_from(graph, node) 
     if len(candidate_path) > len(max_path): 
      max_path = candidate_path 
    return max_path

的方法，發現在圖的最長路徑的鑰匙迭代。這與find_longest_path_from的算法分析完全分離，但我想包含它。

來源

2015-10-19 00:39:21 chucksmash

如何使最長節點的遞歸搜索更高效？

回答

相關問題