2016-01-06 169 views

Calculating probabilities in a probability tree

I have the following dict:

dict = {1000021: [[0.6, [1000024, 1, -2]], [0.4, [1000022, 21]]], 
     1000024: [[0.7, [1000022, 11, -12]], [0.3, [1000022, 2, -1]]]} 

It corresponds to the following probability tree:

[image: probability tree]

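Starting from 1000021, I now need to calculate the probability and the list of numbers I end up with for every possible end point. Whenever a number has a dictionary entry, I need to follow that path. The dictionary can have an arbitrary number of entries and an arbitrary number of sublists. The desired output: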

[0.4, [1000022, 21]], 
[0.42, [1000022, 11, -12, 1, -2]], 
[0.18, [1000022, 2, -1, 1, -2]]

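I tried to do this with a recursive function, but to no avail. Any help is appreciated.

Edit:

My first example was not clear, because it could lead to the assumption that only the first element of a sublist can have a dictionary entry, whereas in fact any of them can have one. The answer given by Copperfield works for the example above, but it does not work for, e.g.,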

mydata = {1: [[.9, [2,3]], [.1, [4,5]]], 
      4: [[.2, [6,7]], [.5, [8,9]], [.3, [10,11,12]]], 
      5: [[.4, [13,14]], [.6, [15,16]]]} 

where I expect the output to be:

[0.9, [2, 3]], 
[0.008, [6, 7, 13, 14]], 
[0.012, [6, 7, 15, 16]], 
[0.02, [8, 9, 13, 14]], 
[0.03, [8, 9, 15, 16]], 
[0.012, [10, 11, 12, 13, 14]], 
[0.018, [10, 11, 12, 15, 16]] 
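
The expected values above are products of the branch probabilities along each path. As a minimal sketch (the names `p_root`, `branches_4`, `branches_5` and `combined` are illustrative and not from the original post), the six entries that come from the `[4, 5]` branch can be reproduced by combining every branch of `4` with every branch of `5`:

```python
import itertools

# Probability of taking the [4, 5] branch at node 1.
p_root = 0.1
# Branches of nodes 4 and 5, copied from mydata above.
branches_4 = [(0.2, [6, 7]), (0.5, [8, 9]), (0.3, [10, 11, 12])]
branches_5 = [(0.4, [13, 14]), (0.6, [15, 16])]

combined = []
for (p4, nums4), (p5, nums5) in itertools.product(branches_4, branches_5):
    # Multiply along the path and concatenate the number lists.
    # Rounding only hides floating-point noise such as 0.008000000000000002.
    combined.append([round(p_root * p4 * p5, 6), nums4 + nums5])

for row in combined:
    print(row)
```

The remaining entry, `[0.9, [2, 3]]`, needs no combination because neither `2` nor `3` has a dictionary entry.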
+0

That doesn't look like a tree, I... – Copperfield

+1

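There is a simpler way than the answer, if I had the time to write demo code: Fenwick trees (binary indexed trees). They are useful for prefix sums, which is exactly what is wanted here. Also, inserting a new value into a Fenwick tree and updating the corresponding prefix sums is ~4 lines of code. If I have time later today, I'll try to post some code for you. –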

+0

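@ScottM I looked at Fenwick trees, and they may be what I'm looking for. I don't understand them well enough to be sure. If you have time to show me a simple example, I would appreciate it. – pfnuesel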
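
For readers unfamiliar with the structure mentioned in these comments, here is a minimal sketch of a Fenwick tree (binary indexed tree) supporting point updates and prefix sums. The class and method names (`FenwickTree`, `add`, `prefix_sum`) are illustrative and not from the thread. The sketch only shows the data structure itself; the comments do not spell out how it would be applied to the probability-tree problem.

```python
class FenwickTree:
    """Binary indexed tree over positions 1..n (1-based)."""

    def __init__(self, n):
        self.n = n
        self.tree = [0.0] * (n + 1)

    def add(self, i, delta):
        """Add delta to the value stored at position i."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next node whose range covers i

    def prefix_sum(self, i):
        """Return the sum of the values at positions 1..i."""
        total = 0.0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


ft = FenwickTree(4)
for pos, value in enumerate([1.0, 2.0, 3.0, 4.0], start=1):
    ft.add(pos, value)
print(ft.prefix_sum(3))  # sum of the first three values
```

Both operations touch O(log n) nodes, and each is only a few lines long.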

Answers


This is far from a tree, but how about this:

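import copy

mydata = {1000024: [[0.7, [1000022, 11, -12]], [0.3, [1000022, 2, -1]]],
          1000021: [[0.6, [1000024, 1, -2]], [0.4, [1000022, 21]]]}

def prob_tree(data, ini, prob=1):
    # Work on a copy so the caller's dict is not mutated by pop/extend.
    data = copy.deepcopy(data)
    val = data.pop(ini, None)
    if val:
        for lst in val:
            if lst[1][0] in data:
                # The first number has its own entry: append the remaining
                # numbers to each of its branches and follow that path.
                extra = lst[1][1:]
                for x in data[lst[1][0]]:
                    x[1].extend(extra)
                # Pass on the accumulated probability, not just this branch's.
                prob_tree(data, lst[1][0], prob * lst[0])
            else:
                print(prob * lst[0], lst[1])

prob_tree(mydata, 1000021)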

Output

0.42 [1000022, 11, -12, 1, -2] 
0.18 [1000022, 2, -1, 1, -2] 
0.4 [1000022, 21] 

Edit

In a burst of inspiration, and using a somewhat functional style, here is a new version:

import itertools, functools

def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
    # Direct from the recipes in itertools documentation
    t1, t2 = itertools.tee(iterable)
    return itertools.filterfalse(pred, t1), filter(pred, t2)


def prob_tree(data, ini) -> (float, tuple):
    """Generator of all end points of the probability tree contained
    in data, starting with ini"""
    for prob, path in data[ini]:
        # no_more: numbers without a dict entry; more: numbers to expand
        no_more, more = map(tuple, partition(lambda x: x in data, path))
        if more:
            # combine every end point of each sub-tree with all the others
            for node in itertools.product(*[prob_tree(data, x) for x in more]):
                new_prob, new_path = functools.reduce(
                    lambda acum, new: (acum[0] * new[0], acum[1] + new[1]),
                    node, (prob, tuple()))
                yield new_prob, no_more + new_path
        else:
            yield prob, no_more

mydata = {1: [[.9, [2,3]], [.1, [4,5]]], 
      4: [[.2, [6,7]], [.5, [8,9]], [.3, [10,11,12]]], 
      5: [[.4, [13,14]], [.6, [15,16]]] 
      } 

mydata2 = {1: [[.8, [2,3]], [.1, [4,5]],[.05, [2,4]],[.05,[5,6]] ], 
      4: [[.2, [6,7]], [.5, [8,9]], [.3, [10,11,12]]], 
      5: [[.4, [13,14]], [.6, [15,16]]] 
      } 

mydata3 = {1: [[.8, [2,3]], [.1, [4,5]],[.05, [2,4]],[.05,[5,6]] ], 
      4: [[.2, [6,7]], [.5, [8,9]], [.3, [10,11,12]]], 
      5: [[.4, [13,14]], [.6, [15,16]]], 
      13:[[.58,[23,32]],[.42,[42]] ], 
      16:[ [.9,[17,18]], [.1,[20,21]] ], 
      } 

Output

>>> for x in prob_tree(mydata,1): 
    print(x) 


(0.9, (2, 3)) 
(0.008000000000000002, (6, 7, 13, 14)) 
(0.012000000000000002, (6, 7, 15, 16)) 
(0.020000000000000004, (8, 9, 13, 14)) 
(0.03, (8, 9, 15, 16)) 
(0.012, (10, 11, 12, 13, 14)) 
(0.018, (10, 11, 12, 15, 16)) 
>>> 
>>> 
>>> for x in prob_tree(mydata2,1): 
    print(x) 


(0.8, (2, 3)) 
(0.008000000000000002, (6, 7, 13, 14)) 
(0.012000000000000002, (6, 7, 15, 16)) 
(0.020000000000000004, (8, 9, 13, 14)) 
(0.03, (8, 9, 15, 16)) 
(0.012, (10, 11, 12, 13, 14)) 
(0.018, (10, 11, 12, 15, 16)) 
(0.010000000000000002, (2, 6, 7)) 
(0.025, (2, 8, 9)) 
(0.015, (2, 10, 11, 12)) 
(0.020000000000000004, (6, 13, 14)) 
(0.03, (6, 15, 16)) 
>>> 
>>> 
>>> 
>>> for x in prob_tree(mydata3,1): 
    print(x) 


(0.8, (2, 3)) 
(0.004640000000000001, (6, 7, 14, 23, 32)) 
(0.003360000000000001, (6, 7, 14, 42)) 
(0.010800000000000002, (6, 7, 15, 17, 18)) 
(0.0012000000000000001, (6, 7, 15, 20, 21)) 
(0.0116, (8, 9, 14, 23, 32)) 
(0.008400000000000001, (8, 9, 14, 42)) 
(0.027000000000000003, (8, 9, 15, 17, 18)) 
(0.003, (8, 9, 15, 20, 21)) 
(0.006959999999999999, (10, 11, 12, 14, 23, 32)) 
(0.00504, (10, 11, 12, 14, 42)) 
(0.0162, (10, 11, 12, 15, 17, 18)) 
(0.0018, (10, 11, 12, 15, 20, 21)) 
(0.010000000000000002, (2, 6, 7)) 
(0.025, (2, 8, 9)) 
(0.015, (2, 10, 11, 12)) 
(0.0116, (6, 14, 23, 32)) 
(0.008400000000000001, (6, 14, 42)) 
(0.027000000000000003, (6, 15, 17, 18)) 
(0.003, (6, 15, 20, 21)) 
>>> 
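
One way to sanity-check output like this is that the end-point probabilities of a well-formed tree should sum to 1, up to floating-point error. The sketch below assumes the same input format; `end_points` is a hypothetical, list-returning re-implementation written only for this check, not the generator from the answer.

```python
import itertools
import math


def end_points(data, start):
    """Return [(probability, numbers), ...] for every end point below start."""
    results = []
    for prob, path in data[start]:
        # Numbers with no dict entry are final; the others are expanded.
        final = tuple(x for x in path if x not in data)
        expand = [x for x in path if x in data]
        # product() over zero sub-trees yields one empty combination,
        # so a branch with nothing to expand is handled by the same loop.
        for combo in itertools.product(*(end_points(data, x) for x in expand)):
            p, nums = prob, final
            for sub_p, sub_nums in combo:
                p *= sub_p
                nums += sub_nums
            results.append((p, nums))
    return results


mydata = {1: [[.9, [2, 3]], [.1, [4, 5]]],
          4: [[.2, [6, 7]], [.5, [8, 9]], [.3, [10, 11, 12]]],
          5: [[.4, [13, 14]], [.6, [15, 16]]]}

points = end_points(mydata, 1)
total = sum(p for p, _ in points)
print(len(points), math.isclose(total, 1.0))
```

This sketch does no cycle detection, so it assumes the data contains no circular references.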

EDIT 2: this version checks for circular references

def prob_tree_with_check(data, ini, visited=frozenset()):
    """Generator of all end points of the probability tree contained
    in data, starting with ini. Check if a previously visited branch
    of the tree is visited again and raise RuntimeError in that case"""
    if ini in visited:
        raise RuntimeError("Branch already visited: %r" % ini)
    # frozenset.union returns a new set, so sibling branches do not
    # see each other's visited nodes
    visited = visited.union((ini,))
    for prob, path in data[ini]:
        no_more, more = map(tuple, partition(lambda x: x in data, path))
        if more:
            for node in itertools.product(*[prob_tree_with_check(data, x, visited) for x in more]):
                new_prob, new_path = functools.reduce(
                    lambda acum, new: (acum[0] * new[0], acum[1] + new[1]),
                    node, (prob, tuple()))
                yield new_prob, no_more + new_path
        else:
            yield prob, no_more

mydata_bad = {1: [[.9, [2,3]], [.1, [4,5]]], 
      4: [[.2, [6,7]], [.5, [8,9]], [.3, [10,11,12]]], 
      5: [[.4, [13,14]], [.6, [15,16,1]]] # <-- try to go back to 1 
      } 

Output

>>> for x in prob_tree_with_check(mydata_bad,1): 
    x 


(0.9, (2, 3)) 
Traceback (most recent call last): 
    File "<pyshell#35>", line 1, in <module> 
    for x in prob_tree_with_check(mydata_bad,1): 
    File "C:\Users\David\Documents\Python Scripts\stackoverflow_test.py", line 137, in prob_tree_with_check 
    for node in itertools.product(*[prob_tree_with_check(data,x,visited) for x in more]): 
    File "C:\Users\David\Documents\Python Scripts\stackoverflow_test.py", line 137, in prob_tree_with_check 
    for node in itertools.product(*[prob_tree_with_check(data,x,visited) for x in more]): 
    File "C:\Users\David\Documents\Python Scripts\stackoverflow_test.py", line 132, in prob_tree_with_check 
    raise RuntimeError("Branch already visited: %r"%ini) 
RuntimeError: Branch already visited: 1 
>>>   
+0

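Thanks a lot. I'll have to test it with more data, but I will probably accept your answer. What do you mean by it not looking like a tree? The input data? When I talk about a tree, I mean a probability tree; maybe I'm using the wrong term here. – pfnuesel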

+0

I mean what I just added in my edit – Copperfield

+0

I just realized that this does not work when there are additional dictionary entries, e.g. for '11'. – pfnuesel