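First find all the matches, then remove them from the list separately. The firstFindtheMatching function begins by using the re library to find the names that match the format: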
import re

def firstFindtheMatching(listoffiles):
    """
    :listoffiles: space-separated string of file names to check against the format
    :final_string: every name that matches the format 01-01-17.pdf (MM-DD-YY.pdf), joined into one str. The input string is also returned so the whole output can be seen in one place, although you won't really need it.
    """
    # raw string so the backslashes reach the regex engine unchanged
    matchednames = re.findall(r"\d{1,2}-\d{1,2}-\d{1,2}\.pdf", listoffiles)
    # join all matches into one string for simpler handling with sets
    final_string = ' '.join(matchednames)
    return final_string, listoffiles
Here is the output:
('05-08-17.pdf 04-08-17.pdf 08-09-16.pdf', '05-08-17.pdf Test.pdf 04-08-17.pdf 08-09-16.pdf 08-09-2016.pdf some-all-letters.pdf')
set(['08-09-2016.pdf', 'some-all-letters.pdf', 'Test.pdf'])
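As a cross-check on that output, the same split can be sketched one file name at a time with re.fullmatch, which accepts a name only when the whole string fits the pattern. The names DATE_PDF and split_by_format are hypothetical and not part of the answer's code; this is a minimal sketch, assuming the names are separated by whitespace.

```python
import re

# Hypothetical constant: MM-DD-YY.pdf with one or two digits per field.
DATE_PDF = re.compile(r"\d{1,2}-\d{1,2}-\d{1,2}\.pdf")


def split_by_format(filenames):
    """Return (matching, non_matching) lists, preserving input order."""
    matching, non_matching = [], []
    for name in filenames.split():
        # fullmatch requires the entire name to fit the pattern,
        # so a four-digit year such as "08-09-2016.pdf" is rejected.
        if DATE_PDF.fullmatch(name):
            matching.append(name)
        else:
            non_matching.append(name)
    return matching, non_matching


matching, non_matching = split_by_format(
    "05-08-17.pdf Test.pdf 04-08-17.pdf 08-09-16.pdf 08-09-2016.pdf some-all-letters.pdf"
)
print(non_matching)  # ['Test.pdf', '08-09-2016.pdf', 'some-all-letters.pdf']
```

Unlike the set difference, this keeps the non-matching names in their original order.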
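I used the main() below, in case you want to reproduce the results. The advantage of doing it this way is that you can add more regular expressions to firstFindtheMatching(), and the matching step stays separate from the set difference that removes the matches.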
def main():
    filenames = "05-08-17.pdf Test.pdf 04-08-17.pdf 08-09-16.pdf 08-09-2016.pdf some-all-letters.pdf"
    matchednames, alllist = firstFindtheMatching(filenames)
    print(matchednames, alllist)
    # names present in the input but absent from the matches
    notcommon = set(filenames.split()) - set(matchednames.split())
    print(notcommon)

if __name__ == '__main__':
    main()
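The remark about adding more regular expressions could look something like the sketch below. The names PATTERNS and find_matching_any are hypothetical, and the second pattern (a four-digit year) is only an assumed extra format for illustration; whether such names should count as valid depends on your requirements.

```python
import re

# Hypothetical list of accepted formats; extend it as needed.
PATTERNS = [
    r"\d{1,2}-\d{1,2}-\d{1,2}\.pdf",  # MM-DD-YY.pdf, as in the answer
    r"\d{1,2}-\d{1,2}-\d{4}\.pdf",    # MM-DD-YYYY.pdf, an assumed extra format
]


def find_matching_any(listoffiles, patterns=PATTERNS):
    """Return the set of names that fully match at least one pattern."""
    compiled = [re.compile(p) for p in patterns]
    return {
        name
        for name in listoffiles.split()
        if any(c.fullmatch(name) for c in compiled)
    }


filenames = "05-08-17.pdf Test.pdf 04-08-17.pdf 08-09-16.pdf 08-09-2016.pdf some-all-letters.pdf"
# Same set difference as in main(), but against several patterns at once.
notcommon = set(filenames.split()) - find_matching_any(filenames)
print(sorted(notcommon))  # ['Test.pdf', 'some-all-letters.pdf']
```

Keeping the patterns in one list means the set-difference step does not change when a format is added or removed.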
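If your input string is '05-17-17.pdf Test.pdf 05-48-2017.pdf 03-14-17.pdf', what do you want the output string to be? – Ajax1234
The output I want is 'Test.pdf 05-48-2017.pdf'. It should pick up the second date because the year is written as 2017 instead of 17. – JakeIC
@JakeIC Please see my latest edit. – Ajax1234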