2015-08-13 31 views
2

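Interpreting a Python string as a conditional statement?

I am processing some JSON-formatted log files in Python. Coding some conditional queries is very simple, for example: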

line=[1,'runtime',{'elapsed':12.3,'jobname':'high38853'}] # read from json 

# split the record and see what jobs take over 30 seconds 
key,category,details=line 
if category == 'runtime' and details['elapsed'] > 30: 
    print(details)

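Is there a way to safely interpret a string as part of a conditional expression, so that I can accept a condition on the command line and make it part of my query?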

search 'details["elapsed"] > 30' 

so that in the code I could do something like this?

if *something involving sys.argv[1]*: 
    print(line)
+1

By "safe", do you mean "without `eval`"? If so, see this question: http://stackoverflow.com/questions/3513292/python-make-eval-safe –

+0

Would you be okay with a more "limited" syntax? E.g. `search 'elapsed > 30'`, or in general, `search 'attribute conditional_operator value'`. – Cyphase

+0

Also, do you need to check against static or dynamic values? Meaning, do you know them all ahead of time? – Cyphase

Answers

3

This should do what you want:

from __future__ import print_function 

import ast 
import operator 
import sys 

OPERATORS = { 
    '<': operator.lt, 
    '<=': operator.le, 
    '>': operator.gt, 
    '>=': operator.ge, 
    '==': operator.eq, 
    '!=': operator.ne, 
    # 'in' is using a lambda because of the opposite operator order 
    # 'in': (lambda item, container: operator.contains(container, item)),
    'in': (lambda item, container: item in container), 
    'contains': operator.contains, 
    } 


def process_conditionals(conditional_strings, variables):
    for conditional_string in conditional_strings:
        # Everything after first and op is part of second
        first, op, second = conditional_string.split(None, 2)

        resolved_operands = []
        for raw_operand in (first, second):
            try:
                resolved_operand = ast.literal_eval(raw_operand)
            except ValueError:  # If the operand is not a valid literal
                # Check if the operand is a known value; if it is not,
                # re-raise the original ValueError. A bare raise works
                # on both Python 2 and Python 3.
                if raw_operand not in variables:
                    raise
                resolved_operand = variables[raw_operand]

            resolved_operands.append(resolved_operand)

        yield (op, tuple(resolved_operands))


def main(lines, *conditional_strings):
    for line in lines:
        key, category, details = line

        variables = {
            'key': key,
            'category': category,
            'elapsed': details['elapsed'],
            'jobname': details['jobname'],
            }

        conditionals = process_conditionals(conditional_strings, variables)

        try:
            # You could check each conditional separately to determine
            # which ones have errors.
            condition = all(OPERATORS[op](*operands)
                            for op, operands in conditionals)
        except TypeError:
            print("A literal in one of your conditionals is the wrong type. "
                  "If you can't see it, try running each one separately.",
                  file=sys.stderr)
            break
        except ValueError:
            print("An operand in one of your conditionals is neither a known "
                  "variable nor a valid literal. If you can't see it, try "
                  "running each one separately.", file=sys.stderr)
            break
        else:
            if condition:
                print(line)


if __name__ == '__main__': 
    lines = [
        [1, 'runtime', {'elapsed': 12.3, 'jobname': 'high38853'}],
        [2, 'runtime', {'elapsed': 45.6, 'jobname': 'high38854'}],
        [3, 'runtime', {'elapsed': 78.9, 'jobname': 'high38855'}],
        [4, 'runtime', {'elapsed': 14.7, 'jobname': 'high38856'}],
        [5, 'runtime', {'elapsed': 25.8, 'jobname': 'high38857'}],
        [6, 'runtime', {'elapsed': 36.9, 'jobname': 'high38858'}],
        [7, 'runtime', {'elapsed': 75.3, 'jobname': 'high38859'}],
        ]

    conditional_strings = sys.argv[1:] 

    main(lines, *conditional_strings) 

Examples:

$ ./SO_31999444.py 'elapsed > 30' 
[2, 'runtime', {'jobname': 'high38854', 'elapsed': 45.6}] 
[3, 'runtime', {'jobname': 'high38855', 'elapsed': 78.9}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 
[7, 'runtime', {'jobname': 'high38859', 'elapsed': 75.3}] 


$ ./SO_31999444.py 'elapsed > 20' 'elapsed < 50' 
[2, 'runtime', {'jobname': 'high38854', 'elapsed': 45.6}] 
[5, 'runtime', {'jobname': 'high38857', 'elapsed': 25.8}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 


$ ./SO_31999444.py 'elapsed > 20' 'elapsed < 50' 'key >= 5' 
[5, 'runtime', {'jobname': 'high38857', 'elapsed': 25.8}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 


$ ./SO_31999444.py "'9' in jobname" 
[7, 'runtime', {'jobname': 'high38859', 'elapsed': 75.3}] 


$ ./SO_31999444.py "jobname contains '9'" 
[7, 'runtime', {'jobname': 'high38859', 'elapsed': 75.3}] 


$ ./SO_31999444.py "jobname in ['high38857', 'high38858']" 
[5, 'runtime', {'jobname': 'high38857', 'elapsed': 25.8}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 


$ ./SO_31999444.py "9 in jobname" 
A literal in one of your conditionals is the wrong type. If you can't see it, try running each one separately. 


$ ./SO_31999444.py "notakey == 'something'" 
An operand in one of your conditionals is neither a known variable nor a valid literal. If you can't see it, try running each one separately. 


$ ./SO_31999444.py "2 == 2" 
[1, 'runtime', {'jobname': 'high38853', 'elapsed': 12.3}] 
[2, 'runtime', {'jobname': 'high38854', 'elapsed': 45.6}] 
[3, 'runtime', {'jobname': 'high38855', 'elapsed': 78.9}] 
[4, 'runtime', {'jobname': 'high38856', 'elapsed': 14.7}] 
[5, 'runtime', {'jobname': 'high38857', 'elapsed': 25.8}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 
[7, 'runtime', {'jobname': 'high38859', 'elapsed': 75.3}] 


$ ./SO_31999444.py 
[1, 'runtime', {'jobname': 'high38853', 'elapsed': 12.3}] 
[2, 'runtime', {'jobname': 'high38854', 'elapsed': 45.6}] 
[3, 'runtime', {'jobname': 'high38855', 'elapsed': 78.9}] 
[4, 'runtime', {'jobname': 'high38856', 'elapsed': 14.7}] 
[5, 'runtime', {'jobname': 'high38857', 'elapsed': 25.8}] 
[6, 'runtime', {'jobname': 'high38858', 'elapsed': 36.9}] 
[7, 'runtime', {'jobname': 'high38859', 'elapsed': 75.3}] 

This was a fun little project :).
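
The operand handling above relies on `ast.literal_eval`, which accepts only Python literals and raises `ValueError` for a bare name; that is what lets a name fall through to the `variables` dictionary. Here is a minimal sketch of that one step in isolation. The helper name `resolve_operand` and the sample `variables` dictionary are hypothetical, introduced only for illustration:

```python
import ast


def resolve_operand(raw_operand, variables):
    """Return a literal's value, or look the operand up as a variable name."""
    try:
        # literal_eval only accepts literals (numbers, strings, lists, ...);
        # it never calls functions or looks up names.
        return ast.literal_eval(raw_operand)
    except ValueError:
        # A bare name such as elapsed is not a literal, so treat it as a key.
        # Unknown names re-raise the original ValueError.
        if raw_operand not in variables:
            raise
        return variables[raw_operand]


# Hypothetical sample record, mirroring the fields used in the answer.
variables = {'elapsed': 45.6, 'jobname': 'high38854'}

print(resolve_operand('30', variables))       # a numeric literal
print(resolve_operand("'9'", variables))      # a quoted string literal
print(resolve_operand('elapsed', variables))  # a known variable name
```

Because nothing here ever reaches `eval`, a string such as `__import__('os')` is rejected as an unknown operand instead of being executed.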

+0

Indeed, a fun project to look into... very nice! –

+0

@MarkHarrison, thanks :). Let me know if you have any questions. – Cyphase

0

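There really isn't a safe way to do this. For basic conditionals, you can parse an input string in a specific format. If the input has the form "var > 5", you could parse it like this: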

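import sys

var, op, num = sys.argv[1].split()
var = getattr(sys.modules[__name__], var)  # Get a reference to the data
num = int(num)
r = False  # Default when the operator is not recognized
if op == ">":
    r = var > num
elif op == "<":
    r = var < num
# ... other operators go here

if r:
    pass  # do stuff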

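To support more complex statements, you would need to improve the parser. If you don't trust your input, you should wrap the getattr and int calls in try/except blocks. Supporting an int, a float, or another variable as the value takes quite a bit more logic.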
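
The hardening described above might be sketched as follows. This is only one possible shape, and it assumes the data lives in a dictionary rather than in module attributes. The names `DATA`, `OPS`, `parse_number`, and `evaluate` are hypothetical, introduced for illustration:

```python
import operator

# Hypothetical data to query; the snippet above looks names up on the module.
DATA = {'elapsed': 45.6, 'limit': 30}

OPS = {'>': operator.gt, '<': operator.lt, '>=': operator.ge,
       '<=': operator.le, '==': operator.eq, '!=': operator.ne}


def parse_number(text):
    """Try int first, then float; raise ValueError if neither fits."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def evaluate(condition, data=DATA):
    """Evaluate 'var op value', where value is a number or another variable."""
    parts = condition.split()
    if len(parts) != 3:
        raise ValueError('expected "var op value", got %r' % condition)
    name, op, raw = parts
    if name not in data:
        raise ValueError('unknown variable %r' % name)
    if op not in OPS:
        raise ValueError('unsupported operator %r' % op)
    try:
        value = parse_number(raw)
    except ValueError:
        # Not a number, so fall back to treating it as another variable.
        if raw not in data:
            raise ValueError('%r is neither a number nor a variable' % raw)
        value = data[raw]
    return OPS[op](data[name], value)


print(evaluate('elapsed > 30'))     # int on the right-hand side
print(evaluate('elapsed < 12.5'))   # float on the right-hand side
print(evaluate('elapsed > limit'))  # another variable on the right-hand side
```

Every failure mode surfaces as a `ValueError` with a message, so a caller can catch one exception type and report the bad condition back to the user.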