Matching unique patterns in strings - Python

I have a list of strings named txtFreeForm:

['Add roth Sweep non vested money after 5 years of termination',
 'Add roth in-plan to the 401k plan.']

I need to check whether just 'Add roth' is present in a sentence. To do this, I used:

import re

for each_line in txtFreeForm:
    # Case-insensitive search for the phrase anywhere in the line
    match = re.search('add roth', each_line.lower())
    if match is not None:
        print(each_line)

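But this obviously returns both strings in my list, since both contain 'add roth'. Is there a way to search specifically for 'add roth' in a sentence? I have a bunch of patterns like this to search for in the strings.

Thanks for your help!

Source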

2017-01-29 dinesh patro

Why shouldn't it return both strings, if they both contain 'Add roth'? –

`if 'add roth' in each_line.lower(): ...` is a cheaper way to solve this. No need for `re`. – DyZ

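I agree that `in` is a cheaper approach. @AndreiSavin I know it will return both if the phrase is found in the text. But I am looking for a way to distinguish sentences that contain only 'add roth' from those that contain 'add roth in plan' –

Could you solve this by using the length of the string? I'm not an experienced Python programmer, but here is how I think it should work: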

for each_line in txtFreeForm:
    match = re.search('add roth', each_line.lower())
    # Compare the length of the current line, not of the whole list
    if (match is not None) and (len(each_line) == len("Add Roth")):
        print(each_line)

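Basically, if the phrase is in the string and the string is exactly as long as 'Add Roth', then the string can only contain 'Add Roth'.

I hope this helps.

Edit:

I misunderstood what you were asking. You want to print the sentences that contain 'Add Roth', but not the ones that contain 'Add Roth in plan'. Is that correct?

How about this code?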

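for each_line in txtFreeForm:
    match_AR = re.search('add roth', each_line.lower())
    # [- ] accepts both 'in plan' and the hyphenated 'in-plan' from the sample data
    match_ARIP = re.search('add roth in[- ]plan', each_line.lower())
    # re.search returns a match object or None, never True
    if (match_AR is not None) and (match_ARIP is None):
        print(each_line)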

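That seems like it should solve the problem. You can exclude any other phrase (such as 'in plan') by searching for it as well and adding it to the comparison.

Source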
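When there are several phrases to exclude, the same idea can be written once and driven by a list. The following is only a sketch: the helper name `matches_only_base` and the contents of `excluded_patterns` are assumptions made for illustration, not part of the original answer.

```python
import re

# Sample data from the question.
txtFreeForm = [
    'Add roth Sweep non vested money after 5 years of termination',
    'Add roth in-plan to the 401k plan.',
]

# Hypothetical list of longer phrases that should disqualify a line.
excluded_patterns = [r'add roth in[- ]plan']


def matches_only_base(line, base=r'add roth', excluded=excluded_patterns):
    """Return True if line contains base but none of the excluded patterns."""
    lowered = line.lower()
    if re.search(base, lowered) is None:
        return False
    return not any(re.search(p, lowered) for p in excluded)


selected = [line for line in txtFreeForm if matches_only_base(line)]
print(selected)
```

Adding another unwanted phrase then only means appending one more pattern to the list.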

2017-01-29 05:49:55

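Then why not just say `each_line.lower() == 'add roth'`? –

@Michael Kemp Your code will again return both strings in the list –

@AndreiSavin What you suggest is an exact match. It does not handle 'add roth' as part of a sentence. –

You're close :) Give this a shot: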

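for each_line in txtFreeForm:
    # (?!...) is a negative lookahead: match 'add roth ' only when
    # it is not immediately followed by 'in-plan'
    match = re.search(r'add roth (?!in[-]plan)', each_line.lower())
    if match is not None:
        # Print the text that follows the matched phrase
        print(each_line[match.end():])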
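The negative lookahead can also carry several alternatives at once, which may be enough when the list of unwanted continuations is short. This is a sketch under assumptions: the second entry in `bad_words` is invented for illustration, and the whitespace is placed inside the lookahead so that extra spaces cannot slip past it.

```python
import re

# Sample data from the question.
txtFreeForm = [
    'Add roth Sweep non vested money after 5 years of termination',
    'Add roth in-plan to the 401k plan.',
]

# Hypothetical continuations that should disqualify a match.
bad_words = [r'in[- ]plan', r'conversion']

# Builds: add roth\b(?!\s*(?:in[- ]plan|conversion))
pattern = re.compile(
    r'add roth\b(?!\s*(?:' + '|'.join(bad_words) + r'))',
    re.IGNORECASE,
)

kept = [line for line in txtFreeForm if pattern.search(line)]
print(kept)
```

Compiling with `re.IGNORECASE` avoids lowering each line first, so the original text can be searched directly.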

Edit: Ahh, I misunderstood... you have quite a few of these. This calls for some more aggressive magic.

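import re
from functools import partial, reduce  # reduce is not a builtin in Python 3

txtFreeForm = ['Add roth Sweep non vested money after 5 years of termination',
               'Add roth in-plan to the 401k plan.']


def roths(rows):
    # Yield (row, rest) for every row containing 'add roth',
    # where rest is the text that follows the match.
    for row in rows:
        match = re.search(r'add roth\s*', row.lower())
        if match:
            yield row, row[match.end():]


def filter_pattern(pattern):
    # Bind the pattern so the resulting filter only needs the rows.
    return partial(lazy_filter_out, pattern)


def lazy_filter(pattern):
    # Unused below; as written it only wraps itself.
    return partial(lazy_filter, pattern)


def lazy_filter_out(pattern, rows):
    # Drop rows whose remaining text starts with the unwanted pattern.
    for row, rest in rows:
        if not re.match(pattern, rest, re.IGNORECASE):
            yield row, rest


def magical_transducer(bad_words, nice_rows):
    # Pass the rows through roths, then through one filter per bad word.
    # map returns an iterator in Python 3, so convert it to a list first.
    steps = [roths] + list(map(filter_pattern, bad_words))
    magical_sentences = reduce(lambda x, y: y(x), steps, nice_rows)
    for row, _ in magical_sentences:
        yield row


def main():
    magic = magical_transducer(['in[-]plan'], txtFreeForm)
    print(list(magic))


if __name__ == '__main__':
    main()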

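To explain a bit about what is going on here: you mentioned that you have a lot of these words to deal with. The traditional approach for comparing two groups of items is nested for loops. So,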

# words, patterns and do_something are placeholders
results = []
for word in words:
    for pattern in patterns:
        data = do_something(word, pattern)
        results.append(data)
for item in results:
    for thing in item:
        ...  # and so on...
        ...  # and so forth...

I used a few different techniques to try to get a "flatter" implementation that avoids nested loops. I'll do my best to describe them.

**Function compositions**
# foo, bar, baz and a are placeholders
from functools import reduce  # needed in Python 3

# You will often see patterns that look like this:
x = foo(a)
y = bar(x)
z = baz(y)

# You may also see patterns that look like this:
z = baz(bar(foo(a)))

# An alternative way to do this is to use a functional composition.
# The technique works like this:
z = reduce(lambda x, y: y(x), [foo, bar, baz], a)

Source
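To make the composition idea concrete, here is a small runnable sketch. The helper `compose_left` and the three arithmetic stand-ins for `foo`, `bar` and `baz` are assumptions chosen only so the result can be checked by hand.

```python
from functools import reduce


def compose_left(*funcs):
    """Return a function that applies funcs to a value from left to right."""
    return lambda value: reduce(lambda acc, f: f(acc), funcs, value)


# Hypothetical stand-ins for foo, bar and baz.
def foo(x):
    return x + 1


def bar(x):
    return x * 2


def baz(x):
    return x - 3


pipeline = compose_left(foo, bar, baz)

nested = baz(bar(foo(5)))   # the nested-call form
result = pipeline(5)        # the reduce-based form
print(result, nested)       # both are 9
```

The reduce-based form becomes useful when the list of steps is built at run time, as with the filters generated from `bad_words` earlier.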

2017-01-29 09:51:41

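Wow. I don't even understand these functions. Let me do some reading and get back to you. Thanks in advance –

Oh wow, sorry! I should have been clearer in my post. I'll see if I can provide some extra detail on what this is doing. –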
