Regex: only one match is returned after using (.|\s)*
Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7))
I want to select the contents of every has() that belongs to Person1, i.e. ['1, 1', '2, 2', '3, 3'].
I tried has\((\d, \d)\)(.|\s)*Person2 with the global flag, but it only returns 1, 1.
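A minimal sketch of why the attempt yields a single match. It assumes Python's re module, which the question does not confirm as the engine used. The greedy (.|\s)* consumes everything from the first has(...) up to Person2, and matches never overlap, so the later has(...) items fall inside that one match:

```python
import re

s = '''Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7))'''

# The pattern from the question: one has(...) followed by anything up to Person2
pattern = re.compile(r'has\((\d, \d)\)(.|\s)*Person2')

matches = list(pattern.finditer(s))
first_groups = [m.group(1) for m in matches]

# The single match starts at has(1, 1) and already covers has(2, 2) and has(3, 3),
# so the scan resumes after "Person2" and finds nothing more.
print(len(matches))    # 1
print(first_groups)    # ['1, 1']
```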
A solution using the re.findall() function:
import re
s = '''
Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7))'''
has_items = re.findall(r'(?<!Person2\()has\(([^()]+)\)', s)
print(has_items)
Output:
['1, 1', '2, 2', '3, 3']
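(?<!Person2\() - a negative lookbehind assertion; it ensures that the has substring is not immediately preceded by Person2(
([^()]+) - the first capturing group, which holds the contents of each has item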
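The lookbehind only rejects a has that sits directly after Person2(, which is the limitation raised in the comments. A small sketch, using a hypothetical variant of the input in which Person2 carries a second has item, suggests what happens:

```python
import re

# Hypothetical variant of the sample input: Person2 now has two has(...) items
s = '''Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7) has(8, 8))'''

has_items = re.findall(r'(?<!Person2\()has\(([^()]+)\)', s)

# has(6, 6) is skipped because it directly follows "Person2(",
# but has(8, 8) follows a space, so the lookbehind does not exclude it.
print(has_items)  # ['1, 1', '2, 2', '3, 3', '8, 8']
```

For input like this, restricting the search to one person's block first avoids the leak.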
To grep the has items of one particular Person, use the following unified approach, shown with an extended example:
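def grepPersonItems(s, person):
    person_items = []
    # Lazily match from "<person>(" up to the first "))", which closes that person's block
    person_group = re.search(r'(' + re.escape(person) + r'\(.*?\)\))', s, re.DOTALL)
    if person_group:
        # Collect the contents of every has(...) inside the matched block
        person_items = re.findall(r'has\(([^()]+)\)', person_group.group())
    return person_items
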
s = '''
Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7), has(8,8)) Person3(has(2, 6) had(7, 7), has(9, 9))'''

person1_items = grepPersonItems(s, 'Person1')
person2_items = grepPersonItems(s, 'Person2')
person3_items = grepPersonItems(s, 'Person3')
print('Person1:', person1_items)
print('Person2:', person2_items)
print('Person3:', person3_items)
Output:
Person1: ['1, 1', '2, 2', '3, 3']
Person2: ['6, 6', '8,8']
Person3: ['2, 6', '9, 9']
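Why not parse it completely, so you can then pick out whatever you might need? You need two patterns: one to grab each person together with its contents, and another to grab the individual parts inside it. You can also add more parsing to extract the single elements and convert them to native Python types. For example: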
import collections
import re

persons = re.compile(r"(Person\d+)\(((?:.*?\(.*?\)\s*)+)\)")
contents = re.compile(r"(\w+)\((.*?)\)")

def parse_input(data, parse_inner=True, map_inner=str):
    result = {}  # store for our parsed data
    for match in persons.finditer(data):  # loop through our `Persons`
        person = match.group(1)  # grab the first group to get our Person
        elements = collections.defaultdict(list)  # store for the parsed inner elements
        for element in contents.finditer(match.group(2)):  # loop through the has/had/etc.
            element_name = element.group(1)  # the first group holds the name
            element_data = element.group(2)  # this is the inner content of each has/had/etc.
            if parse_inner:  # if we want to parse the inner elements...
                element_data = [map_inner(x.strip()) for x in element_data.split(",")]
            elements[element_name].append(element_data)  # add our inner results
        result[person] = elements  # add persons to our result
    return result  # the parsed data, keyed by person

You can then parse everything and access whatever you like. The simplest example:
test = """Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7))"""
parsed = parse_input(test, False) # basic string grab
print(parsed["Person1"]["has"]) # ['1, 1', '2, 2', '3, 3']
print(parsed["Person2"]["has"]) # ['6, 6']
print(parsed["Person2"]["had"]) # ['7, 7']
But you can do much more: you can add more persons and have the contents converted into actual Python structures:
test = """Person1(has(1, 1) has(2, 2)
has(3, 3)
had(4, 4) had(5, 5))
Person2(has(6, 6) had(7, 7))
Person3(has(1, 2) has(3, 4) has(4, 5) foo(6, 7))"""
parsed = parse_input(test, True, int) # parses everything and auto-converts to int
print(parsed["Person3"]["has"]) # [[1, 2], [3, 4], [4, 5]]
print(parsed["Person3"]["has"][1]) # [3, 4]
print(sum(parsed["Person3"]["foo"][0])) # 13
print(parsed["Person1"]["has"][1] + parsed["Person2"]["has"][0]) # [2, 2, 6, 6]
# etc.
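I think you could try this approach; I believe it is dynamic and simple, and works for any number of persons. It splits and parses the string, and stores each required array in a dictionary keyed by Person.
Sample source: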
import re

regex = r"has\(\s*(\d+)\s*,\s*(\d+)\s*\)"
result = {}  # maps each Person name to its list of has items
test_str = ("Person1(has(1, 1) has(2, 2)\n"
            " has(3, 3) \n"
            " had(4, 4) had(5, 5))\n"
            "Person2(had(6, 6) has(7, 7))\n"
            "Person3(had(6, 6) has(8, 8))")
# Splitting on a captured group keeps the Person names in the resulting list
res = re.split(r"(Person\d+)", test_str)
current_key = ""
for rs in res:
    if "Person" in rs:
        current_key = rs  # remember which Person the next chunk belongs to
    elif current_key != "":
        matches = re.finditer(regex, rs, re.DOTALL)
        ar = []
        for match in matches:
            ar.append(match.group(1) + "," + match.group(2))
        result[current_key] = ar
print(result)
Output:
{'Person1': ['1,1', '2,2', '3,3'], 'Person2': ['7,7'], 'Person3': ['8,8']}
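Can I select only the has items of 'Person1' or only those of 'Person2'? If 'Person2' has more than one has, the ones after the first will also be selected. Thanks. – Harrison
@Harrison, please elaborate on your question: do you want to grep all the has items of 'Person1' and 'Person2', or of any possible person? – RomanPerekhrest
My original question was to get all has items for 'Person1', but it would be nice if you could also provide another regex for 'Person2'. I simplified 'Person2' in the question; it can also have multiple has items and span multiple lines. – Harrison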