2015-12-18 49 views
6

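Is there any way to make yaml.load raise an exception whenever a given key appears more than once in the same dictionary? How can I prevent keys from being redefined in YAML?

For example, parsing the following YAML would raise an exception, because some_key appears twice: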

{ 
    some_key: 0, 
    another_key: 1, 
    some_key: 1 
} 

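In fact, the behavior described above corresponds to the simplest policy regarding key redefinition. A more elaborate policy could, for example, specify that only redefinitions that change the value assigned to a key result in an exception, or could allow the severity of a key redefinition to be set to "warning" rather than "error", and so on. An ideal answer to this question would be able to support such variants.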

+2

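Huh, it looks like PyYAML should already do this (duplicate keys are not allowed by the YAML spec), and the fact that it does not is [a bug that has been open for seven years](http://pyyaml.org/ticket/128). –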

+1

That ticket has been migrated [here](https://github.com/yaml/pyyaml/issues/41). Still open. sadpanda.jpg – wim

Answers

1

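If you want the loader to throw an error, you should define your own loader, with a constructor that checks whether the key is already in the mapping¹: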

try:
    from collections.abc import Hashable
except ImportError:  # Python 2
    from collections import Hashable

import ruamel.yaml as yaml

from ruamel.yaml.reader import Reader
from ruamel.yaml.scanner import Scanner
# parser_ is the module name in the ruamel.yaml release this was written for
from ruamel.yaml.parser_ import Parser
from ruamel.yaml.composer import Composer
from ruamel.yaml.constructor import Constructor, ConstructorError
from ruamel.yaml.resolver import Resolver
from ruamel.yaml.nodes import MappingNode
from ruamel.yaml.compat import PY2


class MyConstructor(Constructor):
    def construct_mapping(self, node, deep=False):
        if not isinstance(node, MappingNode):
            raise ConstructorError(
                None, None,
                "expected a mapping node, but found %s" % node.id,
                node.start_mark)
        mapping = {}
        for key_node, value_node in node.value:
            # keys can be list -> deep
            key = self.construct_object(key_node, deep=True)
            # lists are not hashable, but tuples are
            if not isinstance(key, Hashable):
                if isinstance(key, list):
                    key = tuple(key)
            if PY2:
                try:
                    hash(key)
                except TypeError as exc:
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unacceptable key (%s)" %
                        exc, key_node.start_mark)
            else:
                if not isinstance(key, Hashable):
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unhashable key", key_node.start_mark)

            value = self.construct_object(value_node, deep=deep)
            # next two lines differ from original
            if key in mapping:
                raise KeyError
            mapping[key] = value
        return mapping


class MyLoader(Reader, Scanner, Parser, Composer, MyConstructor, Resolver):
    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        MyConstructor.__init__(self)
        Resolver.__init__(self)


# block-style mapping: no braces and no trailing commas needed
yaml_str = """\
some_key: 0
another_key: 1
some_key: 1
"""

data = yaml.load(yaml_str, Loader=MyLoader)
print(data)

and that throws a KeyError.

Note that the curly braces you use in your example are unnecessary.

I am not sure whether this will work with merge keys.
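For comparison, here is a minimal sketch of the same check written against PyYAML's `SafeLoader`, with one possible way of treating merge keys: entries whose key is `<<` are skipped and left for the base class to expand, so a key that overrides a merged-in value is not counted as a duplicate. The names `UniqueKeyLoader` and `load_unique` are made up for this illustration, and skipping `<<` is a design assumption rather than something the YAML spec or either library prescribes. It raises `ConstructorError` instead of `KeyError` so that the message carries the position of the offending key.

```python
from collections.abc import Hashable

import yaml
from yaml.constructor import ConstructorError

# Tag that PyYAML's resolver assigns to a plain `<<` key.
MERGE_TAG = "tag:yaml.org,2002:merge"


class UniqueKeyLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader that rejects keys written twice in one mapping."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        # Non-mapping nodes are left for the base class to reject.
        pairs = node.value if isinstance(node, yaml.MappingNode) else []
        for key_node, _value_node in pairs:
            if key_node.tag == MERGE_TAG:
                continue  # let the base class expand `<<` afterwards
            key = self.construct_object(key_node, deep=True)
            if not isinstance(key, Hashable):
                continue  # the base class reports unhashable keys
            if key in seen:
                raise ConstructorError(
                    "while constructing a mapping", node.start_mark,
                    "found duplicate key %r" % (key,), key_node.start_mark)
            seen.add(key)
        # Key objects built above are cached by the loader and reused here.
        return super().construct_mapping(node, deep=deep)


def load_unique(text):
    """Load YAML text, failing on duplicate keys (illustrative helper)."""
    return yaml.load(text, Loader=UniqueKeyLoader)
```

Calling `load_unique` on the document from the question should raise a `ConstructorError` that names `some_key`, while documents without repeated keys load as usual.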


¹ This was done using ruamel.yaml, of which I am the author. ruamel.yaml is an enhanced version of PyYAML, and the latter's loader code should be similar.

+1

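Beautiful stuff. I'm glad to see that YAML has not been forgotten. YAML took a very good idea (JSON) and made it 10 times better. I use YAML for configuration files, and I find that it handles every possible need in this problem domain. Thanks for your answer and, above all, thank you for `ruamel.yaml`. – kjo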

+0

Also, thanks for the tip about the superfluous curly braces. – kjo

+0

This code is outdated on the current version (0.15.34): `ModuleNotFoundError: No module named 'ruamel.yaml.parser_'`, and after fixing that, `TypeError: __init__() got an unexpected keyword argument 'preserve_quotes'`. – wim

1

Here is the equivalent of the code from Anthon's answer, if you are using pyyaml:

import sys

try:
    from collections.abc import Hashable
except ImportError:  # Python 2
    from collections import Hashable

import yaml

from yaml.reader import Reader
from yaml.scanner import Scanner
from yaml.parser import Parser
from yaml.composer import Composer
from yaml.constructor import Constructor, ConstructorError
from yaml.resolver import Resolver
from yaml.nodes import MappingNode


class NoDuplicateConstructor(Constructor):
    def construct_mapping(self, node, deep=False):
        if not isinstance(node, MappingNode):
            raise ConstructorError(
                None, None,
                "expected a mapping node, but found %s" % node.id,
                node.start_mark)
        mapping = {}
        for key_node, value_node in node.value:
            # keys can be list -> deep
            key = self.construct_object(key_node, deep=True)
            # lists are not hashable, but tuples are
            if not isinstance(key, Hashable):
                if isinstance(key, list):
                    key = tuple(key)

            if sys.version_info.major == 2:
                try:
                    hash(key)
                except TypeError as exc:
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unacceptable key (%s)" %
                        exc, key_node.start_mark)
            else:
                if not isinstance(key, Hashable):
                    raise ConstructorError(
                        "while constructing a mapping", node.start_mark,
                        "found unhashable key", key_node.start_mark)

            value = self.construct_object(value_node, deep=deep)

            # Actually do the check.
            if key in mapping:
                raise KeyError("Got duplicate key: {!r}".format(key))

            mapping[key] = value
        return mapping


class NoDuplicateLoader(Reader, Scanner, Parser, Composer, NoDuplicateConstructor, Resolver):
    def __init__(self, stream):
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        NoDuplicateConstructor.__init__(self)
        Resolver.__init__(self)


# This document contains no duplicate keys.
yaml_str = """\
some_key: 0
another_key:
    x: 1
"""

data = yaml.load(yaml_str, Loader=NoDuplicateLoader)
print(data)
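The question also asks for softer policies: only objecting when a redefinition changes the value, or reporting a warning instead of an error. One way this might be sketched for PyYAML is a small factory that returns a `SafeLoader` subclass configured with the chosen policy. The factory name `make_policy_loader` and its parameters `severity` and `only_if_changed` are invented for this illustration and are not part of PyYAML. Values are constructed eagerly so they can be compared, which means this sketch would not cope with self-referencing (recursive) anchors.

```python
import warnings
from collections.abc import Hashable

import yaml
from yaml.constructor import ConstructorError

MERGE_TAG = "tag:yaml.org,2002:merge"  # tag of a plain `<<` key


def make_policy_loader(severity="error", only_if_changed=False):
    """Return a SafeLoader subclass enforcing a duplicate-key policy.

    severity: "error" raises ConstructorError, "warning" emits a UserWarning.
    only_if_changed: when True, repeating a key with an equal value is allowed.
    Both parameters are hypothetical names chosen for this sketch.
    """
    if severity not in ("error", "warning"):
        raise ValueError("severity must be 'error' or 'warning'")

    class PolicyLoader(yaml.SafeLoader):
        def construct_mapping(self, node, deep=False):
            seen = {}
            pairs = node.value if isinstance(node, yaml.MappingNode) else []
            for key_node, value_node in pairs:
                if key_node.tag == MERGE_TAG:
                    continue  # merge keys are left to the base class
                key = self.construct_object(key_node, deep=True)
                if not isinstance(key, Hashable):
                    continue  # the base class reports unhashable keys
                # Build the value fully so that == compares real content.
                value = self.construct_object(value_node, deep=True)
                same_value = key in seen and seen[key] == value
                if key in seen and not (only_if_changed and same_value):
                    problem = "found duplicate key %r" % (key,)
                    if severity == "error":
                        raise ConstructorError(
                            "while constructing a mapping", node.start_mark,
                            problem, key_node.start_mark)
                    warnings.warn(
                        "%s on line %d" % (problem, key_node.start_mark.line + 1))
                seen[key] = value
            return super().construct_mapping(node, deep=deep)

    return PolicyLoader
```

With `severity="warning"` the document still loads and the last value for a key wins, as in stock PyYAML; with `only_if_changed=True` a repeated `a: 1` passes, while `a: 1` followed by `a: 2` is rejected.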