2017-08-16

Parsing YAML, get line numbers even in ordered maps

I need to get the line numbers of certain keys in a YAML file.

Please note that this answer does not solve the problem: I do use ruamel.yaml, and that answer does not work for ordered maps.

#!/usr/bin/env python3 
# -*- coding: utf-8 -*- 

from ruamel import yaml 

data = yaml.round_trip_load(""" 
key1: !!omap
    - key2: item2
    - key3: item3
    - key4: !!omap
        - key5: item5
        - key6: item6
""") 

print(data) 

As a result I get this:

CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))]) 

This does not give access to the line numbers, except for the keys whose values are !!omap:

print(data['key1'].lc.line) # output: 1 
print(data['key1']['key4'].lc.line) # output: 4 

But:

print(data['key1']['key2'].lc.line) # output: AttributeError: 'str' object has no attribute 'lc' 

Indeed, data['key1']['key2'] is a str.

I have found a workaround:

#!/usr/bin/env python3 
# -*- coding: utf-8 -*- 

from ruamel import yaml 

DATA = yaml.round_trip_load(""" 
key1: !!omap
    - key2: item2
    - key3: item3
    - key4: !!omap
        - key5: item5
        - key6: item6
""")


def get_line_nb(data):
    # Only mappings carry an .lc attribute; scalars are plain str objects.
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                # Assumes every entry takes exactly one line after the parent.
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))


get_line_nb(DATA) 

Output:

key2|item2 found in line 2 

key3|item3 found in line 3 

key5|item5 found in line 5 

key6|item6 found in line 6 

But this looks a bit "dirty". Is there a more proper way to do it?

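EDIT: this workaround is not only dirty, it also only works for simple cases like the one above, and gives wrong results as soon as there are nested lists in the way.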

Answer

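The issue is not that you are using !!omap and that it fails to give you line numbers the way "normal" mappings do. That should be clear from the fact that you get 4 from print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !!omap).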

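As this answer indicates, "you can access the property lc on collection items".

The value for data['key1']['key4'] is a collection item (another !!omap), so it has the lc property. The value for data['key1']['key2'] is not a collection item but a built-in Python string, which has no slot to store an lc attribute.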
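The difference can be seen without ruamel.yaml at all. The following is a minimal sketch in plain Python; the class name StrWithLC and the (2, 10) position are hypothetical and serve only as illustration:

```python
# A built-in str instance has no __dict__, so it cannot hold extra attributes.
plain = 'item2'
try:
    plain.lc = (2, 10)
    plain_accepts_lc = True
except AttributeError:
    plain_accepts_lc = False


class StrWithLC(str):
    """Hypothetical str subclass with one slot reserved for position info."""
    __slots__ = ('lc',)


tagged = StrWithLC('item2')
tagged.lc = (2, 10)  # illustrative (line, column) pair, not from a real parse

print(plain_accepts_lc)   # False
print(tagged == 'item2')  # True
print(tagged.lc)          # (2, 10)
```

The subclass still compares equal to the plain string, so code that consumes the loaded data keeps working, while the extra slot gives the constructor a place to put the position.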

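To get an .lc attribute on a non-collection such as a string, you have to subclass RoundTripConstructor so that it uses something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line information available on the node to that attribute by setting the line and column: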

import ruamel.yaml

yaml_str = """
key1: !!omap
    - key2: item2
    - key3: item3
    - key4: !!omap
        - key5: 'item5'
        - key6: |
            item6
"""


# String types that mirror the ones in scalarstring.py, each with a slot for lc.
class Str(ruamel.yaml.scalarstring.ScalarString):
    __slots__ = ('lc',)

    style = ""

    def __new__(cls, value):
        return ruamel.yaml.scalarstring.ScalarString.__new__(cls, value)


class MyPreservedScalarString(ruamel.yaml.scalarstring.PreservedScalarString):
    __slots__ = ('lc',)


class MyDoubleQuotedScalarString(ruamel.yaml.scalarstring.DoubleQuotedScalarString):
    __slots__ = ('lc',)


class MySingleQuotedScalarString(ruamel.yaml.scalarstring.SingleQuotedScalarString):
    __slots__ = ('lc',)


class MyConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    def construct_scalar(self, node):
        if not isinstance(node, ruamel.yaml.nodes.ScalarNode):
            raise ruamel.yaml.constructor.ConstructorError(
                None, None,
                "expected a scalar node, but found %s" % node.id,
                node.start_mark)

        # Pick the string type that matches the scalar's style in the source.
        if node.style == '|' and isinstance(node.value, str):
            ret_val = MyPreservedScalarString(node.value)
        elif bool(self._preserve_quotes) and isinstance(node.value, str):
            if node.style == "'":
                ret_val = MySingleQuotedScalarString(node.value)
            elif node.style == '"':
                ret_val = MyDoubleQuotedScalarString(node.value)
            else:
                ret_val = Str(node.value)
        else:
            ret_val = Str(node.value)
        # Copy the position of the node onto the constructed string.
        ret_val.lc = ruamel.yaml.comments.LineCol()
        ret_val.lc.line = node.start_mark.line
        ret_val.lc.col = node.start_mark.column
        return ret_val


yaml = ruamel.yaml.YAML()
yaml.Constructor = MyConstructor

data = yaml.load(yaml_str)
print(data['key1']['key4'].lc.line)
print(data['key1']['key2'].lc.line)
print(data['key1']['key4']['key6'].lc.line)

Please note that the output of the last call to print is 6, because the literal scalar string starts at the |.

If you also want to dump data, you will need to make a Representer aware of those My.... types.

OK, so it looks pretty complicated; maybe I'll stick with my workaround. Thanks! – zezollo

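I updated the answer to get the line numbers. I'll leave the dumping (if necessary) to you, as well as handling numbers, booleans and other scalars. – Anthon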

Excellent! This is much better than my workaround, which stops working correctly as soon as there are other nested lists in the way. – zezollo