How to insert a dictionary as a value into a dictionary in Python using a loop

I'm currently facing a problem turning my CSV data into a dictionary.

The file has 3 columns that I want to use:

userID, placeID, rating 
U1000, 12222, 3 
U1000, 13333, 2 
U1001, 13333, 4

The result I want looks like this:

{'U1000': {'12222': 3, '13333': 2}, 
'U1001': {'13333': 4}}

In other words, I want my data structure to look like:

sample = {} 
sample["U1000"] = {} 
sample["U1001"] = {} 
sample["U1000"]["12222"] = 3 
sample["U1000"]["13333"] = 2 
sample["U1001"]["13333"] = 4

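But I have a lot of data to be processed. I want to get the result with a loop, but I have tried for 2 hours and failed..

---The following code may confuse you---

My result currently looks like this: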

{'U1000': ['12222', 3], 
'U1001': ['13333', 4]}

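The dictionary's values are lists instead of dictionaries
User 'U1000' appears multiple times in the data, but only once in my result

I think my code has many mistakes.. if you don't mind, please take a look: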

reader = np.array(pd.read_csv("rating_final.csv")) 
included_cols = [0, 1, 2] 

sample= {} 
target=[] 
target1 =[] 
for row in reader: 
     content = list(row[i] for i in included_cols) 
     target.append(content[0]) 
     target1.append(content[1:3]) 

sample = dict(zip(target, target1))

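How can I improve the code? I have looked through Stack Overflow, but because of my own lack of ability I couldn't work it out. Could anyone please help me?

Thanks a lot!

Source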

2016-03-02 Leigh Tsai

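It seems you want dictionaries as _values_, not _keys_. Perhaps correct the title to match? – ShadowRanger

Thanks for the reminder. I've corrected the title as well as the content! –

Also, your example has {'U1000': {'12222': 3}, {'1333': 2}, 'U1001': {'13333': 4}}, but that has the keys 'U1000' and 'U1001' and no key (or no value) associated with {'1333': 2}. You could have {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} or {'U1000': [{'12222': 3}, {'1333': 2}], 'U1001': [{'13333': 4}]}, but not what you provided. – ShadowRanger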

This should do what you want:

import collections 

reader = ... 
sample = collections.defaultdict(dict) 

for user_id, place_id, rating in reader: 
    rating = int(rating) 
    sample[user_id][place_id] = rating 

print(dict(sample))  # convert first so the output reads as a plain dict
# -> {'U1000': {'12222': 3, '13333': 2}, 'U1001': {'13333': 4}}

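defaultdict is a convenient tool that supplies a default value whenever you try to access a key that is not in the dictionary. If you don't like that (for example, because you want sample['non-existent-user-id'] to fail with a KeyError), use: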

reader = ... 
sample = {} 

for user_id, place_id, rating in reader: 
    rating = int(rating) 
    if user_id not in sample: 
        sample[user_id] = {}
    sample[user_id][place_id] = rating
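To see why the original dict(zip(...)) attempt kept only one entry per user while the loop above keeps them all, here is a minimal self-contained sketch. It assumes the rows come from the standard csv module (so every field is a string) and uses an in-memory string, CSV_TEXT, as a hypothetical stand-in for rating_final.csv; load_rows is an illustrative helper, not part of the answer.

```python
import csv
import io

# Hypothetical in-memory stand-in for rating_final.csv (the question's sample rows).
CSV_TEXT = """userID,placeID,rating
U1000,12222,3
U1000,13333,2
U1001,13333,4
"""


def load_rows(text):
    """Return the data rows (header skipped) as lists of strings."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return list(reader)


rows = load_rows(CSV_TEXT)

# dict(zip(...)) overwrites earlier values whenever a key repeats,
# so only the last row for each user survives.
flat = dict(zip((r[0] for r in rows), (r[1:3] for r in rows)))

# The explicit loop keeps every (user, place) pair in a nested dict.
nested = {}
for user_id, place_id, rating in rows:
    if user_id not in nested:
        nested[user_id] = {}
    nested[user_id][place_id] = int(rating)

print(flat)
print(nested)
```

If the rows come from pandas instead, placeID and rating will likely already be numeric, so the keys would be integers and the int() call would be redundant.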

Source

2016-03-02 18:16:35

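Thanks for your clarification, this really helps! –

The expected output in your example is impossible, because {'1333': 2} would not be associated with a key. You can get {'U1000': {'12222': 3, '1333': 2}, 'U1001': {'13333': 4}} though, with a dict of dicts: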

sample = {} 
for row in reader: 
    userID, placeID, rating = row[:3] 
    sample.setdefault(userID, {})[placeID] = rating # Possibly int(rating)?

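Alternatively, use collections.defaultdict(dict) to avoid the need for setdefault (or for the other approaches, which involve a try/except KeyError or an if userID in sample: check and give up the atomicity of setdefault in exchange for not creating empty dicts unnecessarily):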

import collections 

sample = collections.defaultdict(dict) 
for row in reader: 
    userID, placeID, rating = row[:3] 
    sample[userID][placeID] = rating 

# Optional conversion back to plain dict 
sample = dict(sample)

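Converting back to a plain dict ensures that future lookups don't autovivify keys and instead raise KeyError as normal, and that it looks like a normal dict if you print it.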
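A small sketch of that autovivification behaviour, in case it helps. The user IDs 'U9999' and 'U8888' are made up for the demonstration and do not appear in the question's data.

```python
from collections import defaultdict

sample = defaultdict(dict)
sample["U1000"]["12222"] = 3

# Merely looking up a missing key inserts an empty dict for it.
_ = sample["U9999"]
keys_after_lookup = sorted(sample)

# A plain dict built from the defaultdict stops doing that.
plain = dict(sample)
del plain["U9999"]  # drop the accidental entry for this demo

try:
    plain["U8888"]
    raised = False
except KeyError:
    raised = True

print(keys_after_lookup)
print(raised, "U8888" in plain)
```

Note that dict(sample) is a shallow copy: the inner dicts are shared with the defaultdict, which is usually fine when the defaultdict is discarded right after the conversion.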

If included_cols matters (because names or column indices might change), you can use operator.itemgetter to speed up and simplify extracting all the desired columns at once:

from collections import defaultdict 
from operator import itemgetter 

included_cols = (0, 1, 2) 
# If columns in data were actually: 
# rating, foo, bar, userID, placeID 
# we'd do this instead, itemgetter will handle all the rest: 
# included_cols = (3, 4, 0) 
get_cols = itemgetter(*included_cols) # Create function to get needed indices at once 

sample = defaultdict(dict) 
# map(get_cols, ...) efficiently converts each row to a tuple of just 
# the three desired values as it goes, which also lets us unpack directly 
# in the for loop, simplifying code even more by naming all variables directly 
for userID, placeID, rating in map(get_cols, reader): 
    sample[userID][placeID] = rating # Possibly int(rating)?
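As a quick check of the reordered-column case mentioned in the comments above, here is a runnable sketch. The rows list and its filler values 'x' and 'y' are hypothetical; they only stand in for a file laid out as rating, foo, bar, userID, placeID.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical rows laid out as: rating, foo, bar, userID, placeID
rows = [
    ("3", "x", "y", "U1000", "12222"),
    ("2", "x", "y", "U1000", "13333"),
    ("4", "x", "y", "U1001", "13333"),
]

# Pick userID, placeID, rating in that order, whatever their positions are.
included_cols = (3, 4, 0)
get_cols = itemgetter(*included_cols)

sample = defaultdict(dict)
for user_id, place_id, rating in map(get_cols, rows):
    sample[user_id][place_id] = int(rating)

sample = dict(sample)  # optional conversion back to a plain dict
print(sample)
```

Only included_cols has to change when the column layout changes; the loop body stays the same.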

Source

2016-03-02 18:22:06 ShadowRanger

Thanks for your answer, this really helps! –
