2016-06-26 83 views
2

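Specific regex search in Python

This is my first post. Whenever I have a coding question, I come to this forum looking for answers.

I have been trying to understand regular expressions in Python, but I am finding it a bit hard.

My text looks like this: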

Name: Clash1 
Distance: -1.341m 
Image Location: Test 1_navis_files\cd000001.jpg 
HardStatus: New 
Clash Point: 3.884m, -2.474m, 2.659m 
Date Created: 2016/6/2422:45:09 

Item 1 
GUID: 6efaec51-b699-4d5a-b947-505a69c31d52 
Path: File ->Colisiones_v2015.dwfx ->Segment ->Pipes (1) ->Pipe Types (1) ->Default (1) ->Pipe Types [2463] ->Shell 
Item Name: Pipe Types [2463] 
Item Type: Shell 

Item 2 
GUID: 6efaec51-b699-4d5a-b947-505a69c31dea 
Path: File ->Colisiones_v2015.dwfx ->Segment ->Walls (4) ->Basic Wall (4) ->Wall 1 (4) ->Basic Wall [2343] ->Shell 
Item Name: Basic Wall [2343] 
Item Type: Shell 

------------------ 


Name: Clash2 
Distance: -1.341m 
Image Location: Test 1_navis_files\cd000002.jpg 
HardStatus: New 
Clash Point: 3.884m, 3.533m, 2.659m 
Date Created: 2016/6/2422:45:09 

Item 1 
GUID: 6efaec51-b699-4d5a-b947-505a69c31d52 
Path: File ->Colisiones_v2015.dwfx ->Segment ->Pipes (1) ->Pipe Types (1) ->Default (1) ->Pipe Types [2463] ->Shell 
Item Name: Pipe Types [2463] 
Item Type: Shell 

Item 2 
GUID: 6efaec51-b699-4d5a-b947-505a69c31de8 
Path: File ->Colisiones_v2015.dwfx ->Segment ->Walls (4) ->Basic Wall (4) ->Wall 1 (4) ->Basic Wall [2341] ->Shell 
Item Name: Basic Wall [2341] 
Item Type: Shell 

------------------ 

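What I need to do is build a list with one string per block of text (the blocks are separated by ------------------), where each string contains the clash name and the clash point.

For example: Clash 1 3.884, 3.533, 2.659

I am really new to Python and do not know much about regular expressions.

Can anyone give me some pointers on using a regex to extract these values from the text?

I tried something like this: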

exp = r'(?<=Clash Point\s)(?<=Point\s)([0-9]*)' 
match = re.findall(exp, html) 

if match: 
    OUT.append(match) 
else: 
    OUT = 'fail' 

but I know I am far from my goal.

Answers

0
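import re

# `s` is assumed to hold the full report text shown in the question
lines = s.split('\n')

names = []
points = []

for line in lines:
    # clash name: the word that follows "Name:" at the start of a line
    result = re.search(r'^Name:\s*(\w+)', line)
    if result:
        names.append(result.group(1))

    # clash point: digits, signs, unit suffixes and commas after "Clash Point:"
    result = re.search(r'^Clash Point:\s*([-0-9m., ]+)', line)
    if result:
        points.append(result.group(1))

print(names)
print(points)

# for nicer output, pair each name with its point using zip()
for name, point in zip(names, points):
    print(name, point)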

You can find useful information about regular expressions at regexr.com. I also use it for quick tests and as a reference.
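The question asks for one string per block, with the name and the point together and without the "m" unit. A minimal sketch of that, assuming the blocks are always separated by a line of dashes; SAMPLE and summarize_clashes are made-up names for illustration, and the sample is a shortened copy of the report layout:

```python
import re

# Hypothetical, shortened sample in the same layout as the question's report.
SAMPLE = """\
Name: Clash1
Distance: -1.341m
Clash Point: 3.884m, -2.474m, 2.659m

------------------

Name: Clash2
Distance: -1.341m
Clash Point: 3.884m, 3.533m, 2.659m

------------------
"""


def summarize_clashes(text):
    """Return one 'name x, y, z' string per dash-separated block."""
    summaries = []
    # Split on lines that consist only of dashes (plus optional trailing blanks).
    for block in re.split(r"^-{3,}[ \t]*$", text, flags=re.MULTILINE):
        name = re.search(r"^Name:\s*(\w+)", block, re.MULTILINE)
        point = re.search(r"^Clash Point:\s*([-0-9m., ]+)", block, re.MULTILINE)
        if name and point:
            # Drop the unit suffix "m" and any surrounding whitespace.
            coords = point.group(1).replace("m", "").strip()
            summaries.append(f"{name.group(1)} {coords}")
    return summaries


print(summarize_clashes(SAMPLE))
```

Searching inside each block, rather than collecting two separate lists, keeps a name from being paired with the wrong point if one block happens to lack a "Clash Point" line.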

1

If you are looking for a regex solution, you could come up with:

^Name:\s*           # look for "Name:" at the beginning of a line,
                    # followed by optional whitespace
(?P<name>.+)        # capture the rest of the line
                    # in a group called "name"
[\s\S]+?            # lazily match anything that follows
^Clash\ Point:\s*   # same construct as above
(?P<point>.+)       # same as the other group

a demo on regex101.com


Translated into Python code, this would be:

import re

rx = re.compile(r"""
    ^Name:\s*
    (?P<name>.+)
    [\s\S]+?
    ^Clash\ Point:\s*
    (?P<point>.+)""", re.VERBOSE | re.MULTILINE)

# your_string_here stands for the report text from the question
for match in rx.finditer(your_string_here):
    print(match.group('name'))
    print(match.group('point'))

This will output:

Clash1 
3.884m, -2.474m, 2.659m 
Clash2 
3.884m, 3.533m, 2.659m 

a working demo on ideone.com
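If the coordinates are needed as numbers rather than text, the named groups can be post-processed. A rough sketch, assuming every coordinate is a decimal number followed by "m"; CLASH_RX, parse_clashes and the shortened SAMPLE are hypothetical names used only for this illustration:

```python
import re

# Same idea as the pattern above, but the name group uses \S+ so that
# trailing spaces at the end of the line are not captured.
CLASH_RX = re.compile(r"""
    ^Name:\s*(?P<name>\S+)          # clash name
    [\s\S]+?                        # lazily skip the lines in between
    ^Clash\ Point:\s*(?P<point>.+)  # rest of the "Clash Point" line
    """, re.VERBOSE | re.MULTILINE)

# Hypothetical, shortened sample in the same layout as the question's report.
SAMPLE = """\
Name: Clash1
HardStatus: New
Clash Point: 3.884m, -2.474m, 2.659m
------------------
Name: Clash2
HardStatus: New
Clash Point: 3.884m, 3.533m, 2.659m
------------------
"""


def parse_clashes(text):
    """Return a list of (name, (x, y, z)) tuples with float coordinates."""
    results = []
    for match in CLASH_RX.finditer(text):
        # Split "3.884m, -2.474m, 2.659m" on commas, then drop blanks and the unit.
        coords = tuple(
            float(part.strip().rstrip("m"))
            for part in match.group("point").split(",")
        )
        results.append((match.group("name"), coords))
    return results


for name, coords in parse_clashes(SAMPLE):
    print(name, coords)
```

One caveat with the lazy `[\s\S]+?`: if a block has no "Clash Point" line, the match runs on into the next block and pairs the name with that block's point, so this approach relies on every block being complete.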