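Match a pattern on multiple lines and return placeholders

I find it hard to summarize my problem, so I'll start with an example. I have a textarea in which every line must validate against the following pattern: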

{new_field} is {func} of {field}[,{field}]

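Here is and of are fixed terms, {new_field} and {field} are variable terms that need to be returned somehow, and whatever sits between [ and ] is optional. I need this to return a list of dictionaries, one per line of the textarea, each containing the variable terms extracted from that line.

So, for example, if I have the following input: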

name is concat of first_name, last_name 
price is sum of product, taxes, shipping

I need this output:

[{'new_field': 'name', 'func': 'concat', 'fields': ['first_name', 'last_name']}, 
{'new_field': 'price', 'func': 'sum', 'fields': ['product', 'taxes', 'shipping']}]

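For now, I thought of splitting the whole line and using indexes to match the terms, but if I ever need to customize what the placeholders look like, that will be hard to do. Then I thought of using a regular expression, but unfortunately I don't know how to get started with the re module. Any help and tips would be greatly appreciated!

Source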

2014-04-10 linkyndy

Something like:

import re

s = """name is concat of first_name, last_name
price is sum of product, taxes, shipping"""

out = []

for line in s.splitlines():
    # Capture the new field, the function name, and the raw field list.
    new_field, func, fields = re.match(r'(\w+) is (\w+) of (.*)', line).groups()
    out.append({'new_field': new_field,
                'func': func,
                'fields': fields.split(',')})

Output:

out 
Out[20]: 
[{'fields': ['first_name', ' last_name'], 
    'func': 'concat', 
    'new_field': 'name'}, 
{'fields': ['product', ' taxes', ' shipping'], 
    'func': 'sum', 
    'new_field': 'price'}]

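Note that the above is fine as demonstration code, but it is terse and not very robust. At the very least you'll want to check that match is not None, and probably do some more careful parsing on fields to make sure it matches the grammar you specified. A la:

for line in s.splitlines():
    match = re.match(r'(\w+) is (\w+) of (.*)', line)
    if match:  # skip lines that do not fit the pattern
        new_field, func, fields = match.groups()
        out.append({'new_field': new_field,
                    'func': func,
                    # some_processing_func is a placeholder for your own parsing of the field list
                    'fields': some_processing_func(fields)})

Source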
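The answer leaves some_processing_func undefined. One way it might look, as a minimal sketch: split on commas, strip whitespace, and reject anything that is not a plain \w+ name. The validation rule and the FIELD_RE name are assumptions for illustration, not part of the original answer.

```python
import re

# Assumption: a valid field name is one or more word characters.
FIELD_RE = re.compile(r'\w+')

def some_processing_func(fields):
    """Split a comma-separated field list and validate each name."""
    names = [name.strip() for name in fields.split(',')]
    for name in names:
        # fullmatch requires the whole name to fit, so '' or 'a b' is rejected.
        if not FIELD_RE.fullmatch(name):
            raise ValueError('invalid field name: %r' % name)
    return names

print(some_processing_func('first_name, last_name'))
```

Raising on a bad name is one design choice; returning None and skipping the line would work just as well if invalid lines should be ignored silently.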

2014-04-10 12:33:47 roippi

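A simple approach:

import re

text = ['name is concat of first_name, last_name',
        'price is sum of product, taxes, shipping']

# Raw string so the backslashes reach the regex engine unchanged;
# the last group captures the whole comma-separated field list.
pattern = r"(\w+)\s+is\s+(\w+)\s+of\s+(.+)"

res = []
for line in text:
    m = re.match(pattern, line)
    if m is None:
        continue  # skip lines that do not fit the pattern
    res.append({
        'new_field': m.group(1),
        'func': m.group(2),
        'fields': [x.strip() for x in m.group(3).split(',') if x.strip()]
    })
print(res)

Source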
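Since the question asks for placeholders to come back as a dictionary, named groups may be a closer fit. The sketch below is a variation rather than the answer's own method: it scans the whole text in one pass with re.MULTILINE and builds each dictionary from groupdict(). The names LINE_RE and parse are made up for this example, and lines that do not fit the pattern are silently skipped.

```python
import re

# [ \t] is used instead of \s so a match can never run across a newline.
LINE_RE = re.compile(
    r'^(?P<new_field>\w+)[ \t]+is[ \t]+(?P<func>\w+)[ \t]+of[ \t]+'
    r'(?P<fields>\w+(?:[ \t]*,[ \t]*\w+)*)[ \t]*$',
    re.MULTILINE,
)

def parse(text):
    """Return one dict per line that fits the pattern."""
    result = []
    for m in LINE_RE.finditer(text):
        row = m.groupdict()
        # Split the validated field list on commas, dropping surrounding blanks.
        row['fields'] = re.split(r'[ \t]*,[ \t]*', row['fields'])
        result.append(row)
    return result

text = """name is concat of first_name, last_name
price is sum of product, taxes, shipping"""
print(parse(text))
```

Because the field list is validated by the pattern itself, a line such as "name is concat of a,,b" does not match at all instead of producing an empty field.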

2014-04-10 12:54:10 bosnjak
