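There are four keywords: title, blog, tags, and state.
Extra occurrences of a keyword are being stripped out of their respective matches, and this Python regex behavior confuses me.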
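Example:
"blog: blog state title tags and" returns "state title tags and" instead of "blog state title tags and".
The sub function should match .+ once it has seen "blog: ", so I don't understand why it treats blog as an exception to .+.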
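A minimal reproduction may help narrow this down. The sketch below assumes a one-line input taken from the example above and uses illustrative variable names (`pattern`, `line`, `whole`, `stripped`) that are not part of the original code. It suggests that the regex does match the whole line, and that the keyword is lost afterwards, because `str.replace` removes every occurrence of its argument rather than only the leading one:

```python
import re

# Same pattern as in the question.
pattern = r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s).+(\n|$))'
line = "blog: blog state title tags and"

m = re.search(pattern, line)

# The full match covers the entire line, so .+ does consume the second "blog".
whole = m.group(1)
keyword = m.group(3)    # 'blog'
separator = m.group(4)  # ': '

# str.replace removes ALL occurrences of the keyword, including the one
# that belongs to the value, not just the leading "blog: " prefix.
stripped = whole.replace(keyword, '').replace(separator, '')

print(repr(whole))
print(repr(stripped))
```

Here `whole` still contains both occurrences of blog, while `stripped` contains none, which points at the chained `replace` calls in the callback rather than at the pattern.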
Regular expression:
re.sub(r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s).+(\n|$))', matcher, a)
Code:
def n15():
    import re
    a = """blog: blog: fooblog
state: private
title: this is atitle bun
and text"""
    kwargs = {}
    def matcher(string):
        # Strip the leading delimiter, keyword, separator, and line ending
        # from the full match; what remains is stored as the value.
        v = string.group(1).replace(string.group(2), '').replace(string.group(3), '').replace(string.group(4), '').replace(string.group(5), '')
        if string.group(3) == 'title':
            kwargs['title'] = v
        elif string.group(3) == 'blog':
            kwargs['blog_url'] = v
        elif string.group(3) == 'tags':
            kwargs['comma_separated_tags'] = v
        elif string.group(3) == 'state':
            kwargs['post_state'] = v
        # Remove the matched keyword line from the body text.
        return ''
    a = re.sub(r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s).+(\n|$))', matcher, a)
    # Whatever is left is the post body; convert it to simple HTML.
    a = a.replace('\n', '<br />')
    a = a.replace('\r', '')
    a = a.replace('"', r'\"')
    a = '<p>' + a + '</p>'
    kwargs['body'] = a
    print(kwargs)
Output:
{'body': '<p>and text</p>', 'post_state': 'private', 'blog_url': 'foo', 'title': 'this is a bun'}
Edit:
Desired output:
{'body': '<p>and text</p>', 'post_state': 'private', 'blog_url': 'fooblog', 'title': 'this is atitle bun'}
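One possible way to reach that output is to capture the value in its own group instead of rebuilding it with `replace`. The sketch below is not the original code and rests on two assumptions: keywords only count at the start of a line (hence `re.MULTILINE` with `^`), and a repeated "keyword: " prefix such as "blog: blog: " should be dropped, which is inferred from the desired `'fooblog'` value. The names `KEYS`, `LINE`, `text`, and `body` are illustrative.

```python
import re

text = """blog: blog: fooblog
state: private
title: this is atitle bun
and text"""

# Map each keyword to the kwargs key used in the question.
KEYS = {
    'title': 'title',
    'blog': 'blog_url',
    'tags': 'comma_separated_tags',
    'state': 'post_state',
}

# Group 1: the keyword. Then any repeats of the same "keyword: " prefix
# (assumption: duplicates are noise). Group 2: the value, captured untouched.
LINE = re.compile(r'^(title|blog|tags|state):\s(?:\1:\s)*(.+)(?:\n|$)',
                  re.MULTILINE)

kwargs = {}

def matcher(m):
    # Store the captured value directly; no replace() on the matched text.
    kwargs[KEYS[m.group(1)]] = m.group(2)
    # Returning '' removes the keyword line from the body.
    return ''

body = LINE.sub(matcher, text)
body = body.replace('\n', '<br />').replace('\r', '').replace('"', r'\"')
kwargs['body'] = '<p>' + body + '</p>'

print(kwargs)
```

Because the value comes from its own capture group, keyword text inside it (as in "atitle" or "fooblog") is left alone.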
It should look for multiple lines or duplicates in each line. – Braj
What is your expected output? – hwnd