2015-08-20 96 views

json.dumps TypeError: keys must be a string (with a dict)

I am trying to solve a problem similar to "Pandas remove null values when to_json".

My approach is to:

  1. drop the NaN values while converting the DataFrame to a dict, and then
  2. convert the dict to JSON using json.dumps()

Here is my code and the error:

In [9]:df 

Out[9]: 
    101 102 
    a 123 NaN 
    b 234 234 
    c NaN 456 

In [10]:def to_dict_dropna(data): 
      return dict((k, v.dropna().to_dict()) for k, v in compat.iteritems(data)) 

In [47]:k2 = to_dict_dropna(df) 
In [48]:k2 
Out[48]:{101: {'a': 123.0, 'b': 234.0}, 102: {'b': 234.0, 'c': 456.0}} 
In [49]:json.dumps(k2) 
--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
<ipython-input-76-f0159cf5a097> in <module>() 
----> 1 json.dumps(k2) 

C:\Python27\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, sort_keys, **kw) 
    241   cls is None and indent is None and separators is None and 
    242   encoding == 'utf-8' and default is None and not sort_keys and not kw): 
--> 243   return _default_encoder.encode(obj) 
    244  if cls is None: 
    245   cls = JSONEncoder 

C:\Python27\lib\json\encoder.pyc in encode(self, o) 
    205   # exceptions aren't as detailed. The list call should be roughly 
    206   # equivalent to the PySequence_Fast that ''.join() would do. 
--> 207   chunks = self.iterencode(o, _one_shot=True) 
    208   if not isinstance(chunks, (list, tuple)): 
    209    chunks = list(chunks) 

C:\Python27\lib\json\encoder.pyc in iterencode(self, o, _one_shot) 
    268     self.key_separator, self.item_separator, self.sort_keys, 
    269     self.skipkeys, _one_shot) 
--> 270   return _iterencode(o, 0) 
    271 
    272 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr, 

TypeError: keys must be a string 

But it works if I initialize the dict directly:

In [65]:k = {101: {'a': 123.0, 'b': 234.0}, 102: { 'b': 234.0, 'c': 456.0}} 
In [66]:k == k2 
Out[66]:True 
In [63]:json.dumps(k) 
Out[63]:'{"101": {"a": 123.0, "b": 234.0}, "102": {"c": 456.0, "b": 234.0}}' 

What is wrong with my code?


Interesting, I would have expected both dictionaries to fail. The workaround is to use 'dict((str(k), v.dropna().to_dict()) for k, v in compat.iteritems(data))' (or '{str(k): v.dropna().to_dict() for k, v in compat.iteritems(data)}' using dict comprehension notation).


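The JSON C source code explicitly tests for 'int', 'long', 'float' and 'bool' keys and converts all of those to strings. This means your keys are not really integers; they only *mimic* integers (their representation is the same and they test equal, but 'isinstance(key, int)' fails).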

Answer


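The "integers" in your pandas DataFrame are not really Python integers. The values are float64 objects (see the Pandas Gotchas documentation), and the column labels that end up as dictionary keys are NumPy integer scalars rather than plain int objects, which is why json.dumps() rejects them.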
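To see how a key can print and compare like an integer and still be rejected, here is a minimal sketch that needs only the standard library. The class 'IntLike' is hypothetical: it is a stand-in for a NumPy integer scalar, not anything pandas or NumPy provides, and the sketch assumes Python 3, where json.dumps() raises TypeError for such a key.

```python
import json


class IntLike(object):
    """Hypothetical stand-in for a NumPy integer scalar.

    It prints like an int, compares equal to an int and hashes like
    an int, but it is not an instance of int.
    """

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return repr(self.value)

    def __eq__(self, other):
        return self.value == getattr(other, "value", other)

    def __hash__(self):
        return hash(self.value)


plain = {101: {"a": 123.0}}           # real int key
mimic = {IntLike(101): {"a": 123.0}}  # key that only looks like an int

same = plain == mimic                 # the two dicts compare equal
is_int = isinstance(next(iter(mimic)), int)
plain_json = json.dumps(plain)        # int keys are converted to strings

try:
    json.dumps(mimic)
    failed = False
except TypeError:
    # The encoder checks the key's type, not its value or its repr.
    failed = True

print(mimic, same, is_int, plain_json, failed)
```

The equality check passing while the encoder fails matches the 'k == k2' result in the question: equality says nothing about the type of the keys.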

You have to convert them back to int() objects, or convert them directly to strings:

def to_dict_dropna(data): 
    return {int(k): v.dropna().astype(int).to_dict() for k, v in compat.iteritems(data)} 
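The other route mentioned above, converting the keys to strings, can also be applied after the dict has been built. The following is only a sketch under stated assumptions: 'stringify_keys' is a hypothetical helper (not part of pandas or the json module), and 'fractions.Fraction' is used purely as a standard-library stand-in for a key that equals an integer without being an int.

```python
import json
from fractions import Fraction


def stringify_keys(obj):
    """Return a copy of obj with every dict key passed through str().

    Hypothetical helper: walks nested dicts and lists so that
    json.dumps() only ever sees string keys.
    """
    if isinstance(obj, dict):
        return {str(key): stringify_keys(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(item) for item in obj]
    return obj


# Fraction(101) == 101, but it is not an int, so json.dumps() would
# reject it as a key. It stands in for the NumPy column labels here.
k2 = {
    Fraction(101): {"a": 123.0, "b": 234.0},
    Fraction(102): {"b": 234.0, "c": 456.0},
}

encoded = json.dumps(stringify_keys(k2))
print(encoded)
```

Note that the 'default' argument of json.dumps() is consulted for unsupported values only, not for keys, so the keys have to be converted before encoding.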



Thanks, Martijin. That answers my question. – cssmlulu
