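Python: test whether a string matches a template value

I am trying to iterate over a list of strings and keep only those that match a naming template I specify. I want to accept any list entry that matches the template exactly, except that it has an integer in the variable <SCENARIO> field.

The check needs to be general. Specifically, the string structure may change, so there is no guarantee that <SCENARIO> always appears at character X (for use with, e.g., a list comprehension).

The code below shows an approach using split, but there must be a better way to do this string comparison. Could I use a regular expression here?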

template = 'name_is_here_<SCENARIO>_20131204.txt'

testList = ['name_is_here_100_20131204.txt',      # should accept
            'name_is_here_100_20131204.txt.NEW',  # should reject
            'other_name.txt']                     # should reject

acceptList = []

for name in testList:
    print(name)
    acceptFlag = True
    splitTemplate = template.split('_')
    splitName = name.split('_')
    # if lengths do not match, name cannot possibly match template
    if len(splitTemplate) == len(splitName):
        print(list(zip(splitTemplate, splitName)))
        # compare records in the split
        for t, n in zip(splitTemplate, splitName):
            if t != n and t != '<SCENARIO>':
                # reject if any of the "other" fields are not identical
                # (would also check that '<SCENARIO>' field is numeric - not shown here)
                print('reject: ' + name)
                acceptFlag = False
    else:
        acceptFlag = False

    # keep name if it passed checks
    if acceptFlag:
        acceptList.append(name)

print(acceptList)
# correctly prints --> ['name_is_here_100_20131204.txt']

2013-12-09 Roberto

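Yes, you can use a regular expression here. Do you have a regex so far? –

@SimeonVisser - Sorry, no regex yet. I know regular expressions exist, but I am not familiar with the implementation details. I wanted to make sure this was a worthwhile approach before going too far with it. Thanks for confirming. – Roberto

Try a regular expression from Python's re module: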

import re

# The dot before "txt" is escaped so that it matches only a literal "."
template = re.compile(r'^name_is_here_(\d+)_20131204\.txt$')

testList = ['name_is_here_100_20131204.txt',       # accepted
            'name_is_here_100_20131204.txt.NEW',   # rejected!
            'name_is_here_aabs2352_20131204.txt',  # rejected!
            'other_name.txt']                      # rejected!

acceptList = [item for item in testList if template.match(item)]
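
Because the pattern wraps \d+ in a capturing group, the same match can also hand back the scenario number. A minimal sketch, assuming Python 3; the name `scenarios` is hypothetical and not part of the answer:

```python
import re

# Same pattern as in the answer, with the literal dot escaped.
template = re.compile(r'^name_is_here_(\d+)_20131204\.txt$')

testList = ['name_is_here_100_20131204.txt',
            'name_is_here_100_20131204.txt.NEW',
            'name_is_here_aabs2352_20131204.txt',
            'other_name.txt']

# Map each accepted name to the integer found in the <SCENARIO> field.
scenarios = {}
for item in testList:
    m = template.match(item)
    if m:
        scenarios[item] = int(m.group(1))

print(scenarios)
```

Only the first entry matches, so the dictionary ends up with a single key whose value is the integer 100.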

2013-12-09 18:29:46

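This looks like just what I want. The only issue, I think, is that I would like 'template' to be variable, without having to type out the regex details by hand. In other words, I want to enter the template in the format I specified and have the code convert it into a regular expression automatically. I am sure there is a way to parse my template into the format you showed, and I will look into that next. Thanks for the guidance. – Roberto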

I used concatenation to build the generalized pattern string: feedNameRegex = '^' + feedName.replace('<SCENARIO>', r'(\d+)') + '$'. Do you see any problems with this approach? – Roberto

@Roberto Well, that seems fine to me, but you should test it in your own context. Sorry for the delay! –
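
One thing to watch with plain concatenation is that any regex metacharacter in the template, such as the "." before "txt", keeps its special meaning. A sketch of one way around that, assuming the placeholder is the literal text <SCENARIO>; `template_to_regex` is a hypothetical helper, not something from the thread:

```python
import re

def template_to_regex(template, placeholder='<SCENARIO>'):
    """Compile a template into a regex that accepts digits at the placeholder."""
    # Escape every literal piece so characters like '.' match only themselves.
    literal_parts = [re.escape(part) for part in template.split(placeholder)]
    # Rejoin the pieces with a digit group standing in for each placeholder.
    return re.compile('^' + r'(\d+)'.join(literal_parts) + '$')

pattern = template_to_regex('name_is_here_<SCENARIO>_20131204.txt')

testList = ['name_is_here_100_20131204.txt',
            'name_is_here_100_20131204.txt.NEW',
            'name_is_here_100_20131204xtxt',   # would slip through an unescaped '.'
            'other_name.txt']

acceptList = [name for name in testList if pattern.match(name)]
print(acceptList)
```

Splitting on the placeholder first and escaping each piece separately means the template can contain other special characters without the caller having to know any regex syntax.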

This should do it. As I understand it, name_is_here is just a placeholder for alphanumeric characters?

import re

testList = ['name_is_here_100_20131204.txt',      # should accept
            'name_is_here_100_20131204.txt.NEW',  # should reject
            'other_name.txt',
            'name_is_44ere_100_20131204.txt',
            'name_is_here_100_2013120499.txt',
            'name_is_here_100_something_2013120499.txt',
            'name_is_here_100_something_20131204.txt']


def find(scenario):
    # any combination of lowercase letters and underscores, followed by 100_
    begin = r'[a-z_]+100_'
    # exactly eight digits followed by a literal .txt at the end
    end = r'_[0-9]{8}\.txt$'
    pattern = re.compile("".join([begin, scenario, end]))
    result = []
    for word in testList:
        if pattern.match(word):
            result.append(word)

    return result

find('something')  # returns ['name_is_here_100_something_20131204.txt']

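Edit: the scenario is now a separate variable. The regex now matches only letters and underscores, followed by 100, then the scenario, then eight digits followed by .txt.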

2013-12-09 18:49:47

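This may be too general. I don't think it ensures that the naming is identical everywhere outside the required variable part. For example, 'name_is_44ere_100_20131204.txt' would pass. – Roberto

OK, so name_is_here can only be letters separated by underscores. Which part do you want to keep variable? Will name_is_here_100 be a constant string? Will the number of digits after the scenario be fixed? If needed, you can build the regex from variables. –

Edited my solution, hope it works! –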
