How do I fix the following problem when writing a dictionary to CSV?


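Hello, I am using sklearn with k-means for natural language processing. I use KMeans to create clusters from comments, and then build a dictionary with the cluster number as the key and the list of associated comments as the value, as follows: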

dict_clusters = {}
for i in range(len(kmeans.labels_)):
    #print(kmeans.labels_[i])
    #print(listComments[i])
    if kmeans.labels_[i] not in dict_clusters:
        dict_clusters[kmeans.labels_[i]] = []
    dict_clusters[kmeans.labels_[i]].append(listComments[i])
print("dictionary constructed")

I would also like to write this dictionary to a file. I tried csv:

Out = open("dictionary.csv", "wb") 
w = csv.DictWriter(Out,dict_clusters.keys()) 
w.writerows(dict_clusters) 
Out.close()

But I don't know what is wrong. I get the error below, and I don't know whether it is related to numpy, since kmeans.labels_ contains multiple values:

Traceback (most recent call last): 
    File "C:/Users/CleanFile.py", line 133, in <module> 
    w.writerows(dict_clusters) 
    File "C:\Program Files\Anaconda3\lib\csv.py", line 156, in writerows 
    return self.writer.writerows(map(self._dict_to_list, rowdicts)) 
    File "C:\Program Files\Anaconda3\lib\csv.py", line 146, in _dict_to_list 
    wrong_fields = [k for k in rowdict if k not in self.fieldnames] 
TypeError: 'numpy.int32' object is not iterable

I would appreciate support with this. I would like to get a csv from my dictionary as follows:

key1, value 
key2, value 
. 
. 
. 
keyN, value

After the feedback here, I tried:

with open("dictionary.csv", mode="wb") as out_file: 
    writer = csv.DictWriter(out_file, headers=dict_clusters.keys()) 
    writer.writerow(dict_clusters)

I got:

Traceback (most recent call last): 
    File "C:/Users/CleanFile.py", line 129, in <module> 
    writer = csv.DictWriter(out_file, headers=dict_clusters.keys()) 
TypeError: __init__() missing 1 required positional argument: 'fieldnames'

Attempt 2:

Out = open("dictionary.csv", "wb") 
w = csv.DictWriter(Out,dict_clusters.keys()) 
w.writerows([dict_clusters]) 
Out.close()

Output:

Traceback (most recent call last): 
    File "C:/Users/CleanFile.py", line 130, in <module> 
    w.writerows([dict_clusters]) 
    File "C:\Program Files\Anaconda3\lib\csv.py", line 156, in writerows 
    return self.writer.writerows(map(self._dict_to_list, rowdicts)) 
TypeError: a bytes-like object is required, not 'str'

Attempt 3; this attempt takes a long time to compute the output:

Out = open("dictionary.csv", "wb") 
w = csv.DictWriter(Out,dict_clusters.keys()) 
w.writerow(dict_clusters) 
Out.close()

The Python version I am using is:

3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] 
3.5.2

After many attempts, I decided to use a better way to build my dictionary, as follows:

from collections import defaultdict 
pairs = zip(y_pred, listComments) 

dict_clusters2 = defaultdict(list) 

for num, comment in pairs: 
    dict_clusters2[num].append(comment)

However, it seems that some characters are making the creation of the CSV file fail, as follows:

with open('dict.csv', 'w') as csv_file: 
    writer = csv.writer(csv_file) 
    for key, value in dict_clusters2.items(): 
     writer.writerow([key, value])

Output:

Traceback (most recent call last): 
    File "C:/Users/CleanFile.py", line 146, in <module> 
    writer.writerow([key, value]) 
    File "C:\Program Files\Anaconda3\lib\encodings\cp1252.py", line 19, in encode 
    return codecs.charmap_encode(input,self.errors,encoding_table)[0] 
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f609' in position 6056: character maps to <undefined>

To make this clearer, I ran:

for k,v in dict_clusters2.items(): 
    print(k, v)

And I got this:

1 ['hello this is','the car is red',....'performing test'] 
2 ['we already have','another comment',...'strings strings'] 
. 
. 
19 ['we have',' comment music',...'strings strings dance']

My dictionary has a key and, as its value, a list of several comments. I would like a CSV like the following:

1,'hello this is','the car is red',....'performing test' 
2,'we already have','another comment',...'strings strings' 
. 
. 
19,'we have',' comment music',...'strings strings dance'

But it seems that some characters are not handled well and everything fails. I would appreciate any support; thanks.

Source

2016-12-17 neo33

Unrelated to the question: you may want to take a look at [`enumerate`](https://docs.python.org/3.5/library/functions.html#enumerate) and [`dict.setdefault`](https://docs.python.org/3.5/library/stdtypes.html#dict.setdefault). The first code block could be written as something like `for i, label in enumerate(kmeans.labels_): dict_clusters.setdefault(label, []).append(listComments[i])` (though it would be better split over a few lines).

Even better than `enumerate`: in this case you might want to check out [`zip`](https://docs.python.org/3.5/library/functions.html#zip) to loop over `listComments` and `kmeans.labels_` at the same time. More on looping with indexes: http://treyhunner.com/2016/04/how-to-loop-with-indexes-in-python/

As an alternative to `dict.setdefault`, [`collections.defaultdict(list)`](https://docs.python.org/3.6/library/collections.html#defaultdict-examples) can be used. I usually prefer `defaultdict` over `dict.setdefault`, but they both serve the same purpose.
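The suggestions in these comments can be combined into one short loop. The following is a minimal sketch under stated assumptions: `labels` and `comments` are hypothetical stand-ins for `kmeans.labels_` and `listComments`, which the question does not show in full.

```python
from collections import defaultdict

# Hypothetical stand-ins for kmeans.labels_ and listComments.
labels = [1, 0, 1, 2, 0]
comments = ["hello this is", "we already have", "the car is red",
            "we have", "another comment"]

# zip pairs each label with its comment; defaultdict(list) creates the
# empty list the first time a label is seen.
clusters = defaultdict(list)
for label, comment in zip(labels, comments):
    clusters[label].append(comment)

# The same grouping with a plain dict and setdefault.
clusters_plain = {}
for label, comment in zip(labels, comments):
    clusters_plain.setdefault(label, []).append(comment)

print(dict(clusters))
```

Both variants produce the same mapping; which one reads better is largely a matter of taste.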

Your special character, in a Py3 IPython session, renders as:

In [31]: '\U0001f609'
Out[31]: '😉'

Give us a small sample of your dictionary, or better yet, the values you use to build it.

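I haven't worked with csv much, and csv.DictWriter even less. numpy users often use np.savetxt to write csv files. That is easy to use when writing a purely numeric array. If you want to write a mix of character and numeric columns, it is trickier and requires a structured array.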

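Another option is to write a text file directly. Just open it and use f.write(...) to write formatted lines to the file. In fact that is essentially what np.savetxt does: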

with open(filename, 'w') as f: 
    for row in myArray: 
     f.write(fmt % tuple(row))

savetxt builds a fmt string such as %s, %d, %f\n. It also works with byte strings, which requires wb mode, so your special characters may cause it more problems.
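The direct-write idea can be sketched without NumPy. In this example `rows`, `fmt` and the file name are illustrative assumptions, and the file is opened in text mode with an explicit UTF-8 encoding rather than `wb`; that choice is an assumption about what suits Python 3, not a description of how `savetxt` opens its file.

```python
import os
import tempfile

# Hypothetical rows mixing a string, an int and a float.
rows = [("alpha", 1, 0.5), ("beta", 2, 1.25)]
fmt = "%s, %d, %f\n"  # one conversion per column

path = os.path.join(tempfile.mkdtemp(), "rows.txt")
with open(path, "w", encoding="utf-8") as f:
    for row in rows:
        # Format the whole row in one step, then write it as a line.
        f.write(fmt % tuple(row))

# Read the file back to inspect what was written.
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

This does no quoting, so a value that itself contains a comma would be ambiguous when read back; the `csv` module handles that case.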

It might help to focus on printing your dictionary, one key at a time, e.g.

for k in mydict.keys():
    print('%s, %s' % (k, mydict[k]))

as a start. Once you have the print format right, it is easy to convert that to writing to a file.

===============

I can write a hypothetical dictionary with your code:

In [58]: adict = {1: '\U0001f609'}
In [59]: with open('test.txt', 'w') as f:
    ...:     writer = csv.writer(f)
    ...:     for k, v in adict.items():
    ...:         writer.writerow([k, v])
    ...:
In [60]: cat test.txt
1,😉
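The traceback in the question passes through `encodings\cp1252.py`, which suggests the file was opened with the Windows default encoding, and that encoding has no mapping for '\U0001f609'. A minimal sketch of one way around it, assuming UTF-8 output is acceptable: pass `encoding="utf-8"` to `open`, together with `newline=""` as the `csv` module documentation recommends. The dictionary and file name below are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical clusters; one comment contains the emoji from the traceback.
clusters = {1: ["hello this is", "wink \U0001f609"], 2: ["we already have"]}

path = os.path.join(tempfile.mkdtemp(), "dict.csv")
with open(path, "w", encoding="utf-8", newline="") as csv_file:
    writer = csv.writer(csv_file)
    for key, comments in clusters.items():
        # Key first, then each comment in its own column.
        writer.writerow([key] + list(comments))

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8", newline="") as csv_file:
    rows_back = list(csv.reader(csv_file))
```

Spreading the list into separate columns gives the `1,'hello this is','the car is red',...` layout asked for in the question. If the file is meant to be opened in Excel, `encoding="utf-8-sig"` may display better, though that depends on the Excel version.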

Source

2016-12-17 22:57:55 hpaulj

Thanks. This is what breaks the generation of the csv file; do you know how I can avoid it? Many thanks for your support – neo33

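Thanks for the support. I have added more details about how my dictionary is built; if you need any other details in order to help me, please let me know. Many thanks for helping me get past this – neo33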

The writerows method must take a list of dictionaries:

Out = open("dictionary.csv", "w", newline="")  # text mode: csv writes str on Python 3
w = csv.DictWriter(Out, dict_clusters.keys())
w.writerows([dict_clusters])
Out.close()

You are probably looking for writerow, which accepts a single dictionary object:

Out = open("dictionary.csv", "w", newline="")  # text mode: csv writes str on Python 3
w = csv.DictWriter(Out, dict_clusters.keys())
w.writerow(dict_clusters)
Out.close()

Aside: you may also want to consider using open as a context manager (in a with block) to ensure the file is closed properly:

with open("dictionary.csv", mode="w", newline="") as out_file:
    # The keyword argument is fieldnames, not headers.
    writer = csv.DictWriter(out_file, fieldnames=dict_clusters.keys())
    writer.writerow(dict_clusters)
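For reference, here is a sketch of how `csv.DictWriter` behaves on Python 3, under the assumption that plain `int` keys can stand in for the `numpy.int32` labels. The keyword is `fieldnames`, the target must be a text stream, and `writerows` expects an iterable of dicts. Iterating a dict yields its keys, so passing the dict itself makes each key get treated as a row, which is consistent with the `'numpy.int32' object is not iterable` error in the question. The example writes to an in-memory `io.StringIO` buffer instead of a file.

```python
import csv
import io

# Hypothetical clusters with int keys standing in for numpy.int32 labels.
clusters = {0: ["a", "b"], 1: ["c"]}

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=list(clusters.keys()))
writer.writeheader()
writer.writerow(clusters)  # one row; each cell is the str() of a list

# Parse what was written so far.
wide = list(csv.reader(io.StringIO(buffer.getvalue(), newline="")))

# Passing the dict itself makes writerows treat each key as a row dict.
# The exact exception type differs between Python versions.
try:
    writer.writerows(clusters)
    failed = False
except (TypeError, AttributeError):
    failed = True
```

Note that `writerow(clusters)` yields a single wide row with one column per cluster, not the one-row-per-key layout requested in the question; for that layout a plain `csv.writer` loop over `clusters.items()` is likely the simpler fit.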

Source

2016-12-17 19:49:39

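@Trey Hunner, I tried 3 times and could not get the desired csv. I am not sure what is going on. I would appreciate support; thanks a lot for your attention. My question has been updated with the new attempts. @Tadhg McDonald-Jensen – neo33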
