How can I build a dictionary of dictionaries with these functions?
I have a dictionary like this:
dict = {'in': [0.01, -0.07, 0.09, -0.02], 'and': [0.2, 0.3, 0.5, 0.6], 'to': [0.87, 0.98, 0.54, 0.4]}
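I have a cosine similarity function that takes two vectors, and I want to compute the cosine similarity between every pair of words. First it should produce the value for 'in' and 'and', then for 'in' and 'to', and so on.
I want the results stored in another dictionary, where 'in' is a key and its value is a dictionary holding each cosine similarity computed against that key. The output I am after looks like this: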
{in : {and : 0.4321, to : 0.218}, and : {in : 0.1245, to : 0.9876}, to : { in : 0.8764, and : 0.123}}
Here is the code that does all of this:
import math

def cosine_similarity(vec1, vec2):
    # Accumulate both squared norms and the dot product in a single pass.
    sum11, sum12, sum22 = 0, 0, 0
    for i in range(len(vec1)):
        x = vec1[i]; y = vec2[i]
        sum11 += x*x
        sum22 += y*y
        sum12 += x*y
    return sum12/math.sqrt(sum11*sum22)

def resultInDict(result, name, value, keyC):
    new_dict = {}
    new_dict[keyC] = value
    if name in result:
        result[name] = new_dict
    else:
        result[name] = new_dict

def extract():
    result = {}
    res = {}
    # Each line of file.txt holds a word followed by its vector components.
    with open('file.txt') as text:
        for line in text:
            record = line.split()
            key = record[0]
            values = [float(value) for value in record[1:]]
            res[key] = values
    # Compare every word's vector with every other word's vector.
    for key, value in res.items():
        temp = 0
        for keyC, valueC in res.items():
            if keyC == key:
                continue
            temp = cosine_similarity(value, valueC)
            resultInDict(result, key, temp, keyC)
    print(result)
However, the result it gives looks like this:
{'and': {'in': 0.12241083209661485}, 'to': {'in': -0.0654517869126785}, 'from': {'in': -0.5324142931780856}, 'in': {'from': -0.5324142931780856}}
I want it to look like this:
{in : {and : 0.4321, to : 0.218}, and : {in : 0.1245, to : 0.9876}, to : { in : 0.8764, and : 0.123}}
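I think this happens because resultInDict defines a new dictionary, new_dict, to hold the key-value pairs of the inner dictionary, but every call to resultInDict resets it on the line new_dict={}, so the inner dictionary only ever ends up with a single key-value pair.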
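To check that suspicion, here is a minimal sketch that isolates the assignment in resultInDict. The similarity values 0.1 and 0.2 are made-up placeholders, not real results:

```python
def resultInDict(result, name, value, keyC):
    new_dict = {}
    new_dict[keyC] = value
    # Both branches of the original if/else do this same assignment,
    # so any inner dict already stored under `name` is replaced.
    result[name] = new_dict

result = {}
resultInDict(result, 'in', 0.1, 'and')  # placeholder value
resultInDict(result, 'in', 0.2, 'to')   # placeholder value
print(result)  # {'in': {'to': 0.2}}
```

The entry for 'and' is gone after the second call, which matches the single-pair inner dictionaries in the output above.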
How can I fix this?
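One direction that might work, offered as a sketch rather than a definitive answer: reuse the inner dictionary when it already exists instead of replacing it, for example with dict.setdefault. The sketch below assumes the sample vectors from the top of the post stand in for file.txt; the names vectors and pairwise_similarities are hypothetical and only used for illustration.

```python
import math

def cosine_similarity(vec1, vec2):
    sum11, sum12, sum22 = 0, 0, 0
    for x, y in zip(vec1, vec2):
        sum11 += x * x
        sum22 += y * y
        sum12 += x * y
    return sum12 / math.sqrt(sum11 * sum22)

def resultInDict(result, name, value, keyC):
    # setdefault returns the inner dict already stored under `name`,
    # or stores a new empty one and returns it, so earlier pairs are kept.
    result.setdefault(name, {})[keyC] = value

def pairwise_similarities(vectors):
    # Hypothetical helper: same double loop as extract(), minus the file reading.
    result = {}
    for key, value in vectors.items():
        for keyC, valueC in vectors.items():
            if keyC == key:
                continue
            resultInDict(result, key, cosine_similarity(value, valueC), keyC)
    return result

# Sample data from the top of the post, used here in place of file.txt.
vectors = {
    'in': [0.01, -0.07, 0.09, -0.02],
    'and': [0.2, 0.3, 0.5, 0.6],
    'to': [0.87, 0.98, 0.54, 0.4],
}
result = pairwise_similarities(vectors)
print(result)
```

With this change each outer key should end up with one inner entry per other word, which is the shape of the desired output.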