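Select particular elements in a list and compute the conditional probability
I need to compute the probability of the bigrams in a list whose corresponding unigrams are also present in that list. For example, in the list below, 'pretty girl', 'pretty', and 'girl' are all present. So, using the values in list P, the probability is (0.0017) % (0.003 * 0.002) = 5.999999999999987e-06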
S = ['girl', 'pretty', 'pretty girl', 'our', 'our world', 'wide', 'word', 'yes', 'yike', 'yummy']
P = [0.003, 0.002, 0.0017, 0.003, 0.006, 0.004, 0.002, 0.012, 0.006, 0.003]
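I have the following code. It doesn't seem to give me a usable result, so I can't go on to compute the probabilities. What I'm trying to do with this code is select the bigrams in the list and find their corresponding unigrams. I then plan to look up their probabilities in P.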
In [60]: import re
In [61]: M = []
In [62]: for i in range(len(S)):
   ....:     s_split = S[i].split()
   ....:     s_split_len = len(S[i].split())
   ....:     if s_split_len == 2:
   ....:         m = []
   ....:         a = re.match(s_split[0], S[i])
   ....:         b = re.match(s_split[1], S[i])
   ....:         m.append(a)
   ....:         m.append(b)
   ....:         M.append(m)
   ....:         print M
[[<_sre.SRE_Match object at 0x10447b988>, None], [<_sre.SRE_Match object at 0x10447b8b8>, None], [<_sre.SRE_Match object at 0x10447b920>, None], [<_sre.SRE_Match object at 0x10447b9f0>, None], [<_sre.SRE_Match object at 0x10447bac0>, None], [<_sre.SRE_Match object at 0x10447bb90>, None], [<_sre.SRE_Match object at 0x10447bbf8>, None], [<_sre.SRE_Match object at 0x10447bc60>, None], [<_sre.SRE_Match object at 0x10447bcc8>, None], [<_sre.SRE_Match object at 0x10447bd30>, None], [<_sre.SRE_Match object at 0x10447bd98>, None], [<_sre.SRE_Match object at 0x10447be00>, None], [<_sre.SRE_Match object at 0x10447be68>, None], [<_sre.SRE_Match object at 0x10447bed0>, None], [<_sre.SRE_Match object at 0x10447bf38>, None], [<_sre.SRE_Match object at 0x1044a8030>, None], [<_sre.SRE_Match object at 0x1044a8098>, None], [<_sre.SRE_Match object at 0x1044a8100>, None], [<_sre.SRE_Match object at 0x1044a8168>, None], [<_sre.SRE_Match object at 0x1044a81d0>, None], [<_sre.SRE_Match object at 0x1044a8238>, None], [<_sre.SRE_Match object at 0x1044a82a0>, None], [<_sre.SRE_Match object at 0x1044a8308>, None], [<_sre.SRE_Match object at 0x1044a8370>, None], [<_sre.SRE_Match object at 0x1044a83d8>, None], [<_sre.SRE_Match object at 0x1044a8440>, None], [<_sre.SRE_Match object at 0x1044a84a8>, None], [<_sre.SRE_Match object at 0x1044a8510>, None], [<_sre.SRE_Match object at 0x1044a8578>, None], [<_sre.SRE_Match object at 0x1044a85e0>, None], [<_sre.SRE_Match object at 0x1044a8648>, None], [<_sre.SRE_Match object at 0x1044a86b0>, None], [<_sre.SRE_Match object at 0x1044a8718>, None]]
[[<_sre.SRE_Match object at 0x10447b988>, None], [<_sre.SRE_Match object at 0x10447b8b8>, None], [<_sre.SRE_Match object at 0x10447b920>, None], [<_sre.SRE_Match object at 0x10447b9f0>, None], [<_sre.SRE_Match object at 0x10447bac0>, None], [<_sre.SRE_Match object at 0x10447bb90>, None], [<_sre.SRE_Match object at 0x10447bbf8>, None], [<_sre.SRE_Match object at 0x10447bc60>, None], [<_sre.SRE_Match object at 0x10447bcc8>, None], [<_sre.SRE_Match object at 0x10447bd30>, None], [<_sre.SRE_Match object at 0x10447bd98>, None], [<_sre.SRE_Match object at 0x10447be00>, None], [<_sre.SRE_Match object at 0x10447be68>, None], [<_sre.SRE_Match object at 0x10447bed0>, None], [<_sre.SRE_Match object at 0x10447bf38>, None], [<_sre.SRE_Match object at 0x1044a8030>, None], [<_sre.SRE_Match object at 0x1044a8098>, None], [<_sre.SRE_Match object at 0x1044a8100>, None], [<_sre.SRE_Match object at 0x1044a8168>, None], [<_sre.SRE_Match object at 0x1044a81d0>, None], [<_sre.SRE_Match object at 0x1044a8238>, None], [<_sre.SRE_Match object at 0x1044a82a0>, None], [<_sre.SRE_Match object at 0x1044a8308>, None], [<_sre.SRE_Match object at 0x1044a8370>, None], [<_sre.SRE_Match object at 0x1044a83d8>, None], [<_sre.SRE_Match object at 0x1044a8440>, None], [<_sre.SRE_Match object at 0x1044a84a8>, None], [<_sre.SRE_Match object at 0x1044a8510>, None], [<_sre.SRE_Match object at 0x1044a8578>, None], [<_sre.SRE_Match object at 0x1044a85e0>, None], [<_sre.SRE_Match object at 0x1044a8648>, None], [<_sre.SRE_Match object at 0x1044a86b0>, None], [<_sre.SRE_Match object at 0x1044a8718>, None], [<_sre.SRE_Match object at 0x1044a8780>, None]]
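The `None` in the second slot of every pair above is probably explained by how `re.match` works: it only succeeds when the pattern matches at the very start of the string, so the second word of a bigram never matches the bigram itself. The short check below (written for Python 3, unlike the Python 2 session above) illustrates this with the 'pretty girl' example. The variable names are only for illustration.

```python
import re

bigram = "pretty girl"
first, second = bigram.split()

# re.match anchors at position 0 of the string being searched.
match_first = re.match(first, bigram)    # 'pretty' starts at position 0
match_second = re.match(second, bigram)  # 'girl' starts at position 7

# re.search scans the whole string instead of anchoring at the start.
search_second = re.search(second, bigram)

print(match_first is not None)    # True
print(match_second is not None)   # False
print(search_second is not None)  # True
```

Note also that both calls test each word against the bigram string `S[i]` itself, not against the other entries of `S`, so neither tells you whether the unigram is present in the list.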
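Thanks, @Chris_Rands. The example is given in my description (the first four lines of the post). The sample data are the lists S and P. The output of the code is the list of objects in the last part of the post. – achimneyswallow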
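One possible way to get from S and P to the numbers described in the question, without regular expressions, is to pair each entry with its probability in a dictionary and look the unigrams up directly. This is only a sketch under two assumptions: the code is Python 3, and the `%` in the example was meant as division (in Python, `%` is the modulo operator). The helper name `bigram_ratios` is hypothetical.

```python
S = ['girl', 'pretty', 'pretty girl', 'our', 'our world',
     'wide', 'word', 'yes', 'yike', 'yummy']
P = [0.003, 0.002, 0.0017, 0.003, 0.006, 0.004, 0.002, 0.012, 0.006, 0.003]

# Map every entry of S to its probability in P.
prob = dict(zip(S, P))


def bigram_ratios(prob):
    """Return P(bigram) / (P(word1) * P(word2)) for each bigram
    whose two unigrams are both present in `prob`."""
    ratios = {}
    for phrase, p_bigram in prob.items():
        words = phrase.split()
        if len(words) != 2:
            continue  # not a bigram
        first, second = words
        if first in prob and second in prob:
            # ASSUMPTION: division is intended here, not the modulo (%)
            # that appears in the question's example.
            ratios[phrase] = p_bigram / (prob[first] * prob[second])
    return ratios


ratios = bigram_ratios(prob)
print(ratios)
```

With the sample data only 'pretty girl' qualifies: 'our world' is skipped because 'world' is not in S (the list contains 'word'). The value computed this way is a ratio, so it can be larger than 1.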