2012-11-06
0

I want to match a regular expression against a text file in Python

In [44]: with open(path) as f:
   ....:     for line in f:
   ....:         matched = re.search('^PARTITION BY HASH', line)
   ....:         if matched is not None:
   ....:             print matched.group()
   ....:

The file contains lines like PARTITION BY HASH(SOME_THING), which should match; in between there are other lines with SUBPARTITION BY HASH(SOME_THING), which should not match.

After matching, I want to delete that line. But printing matched.group fails. Why?

+8

Why is `re` needed here? Use `if "PARTITION BY HASH" in line` or `if line.startswith("PARTITION BY HASH"):`

+0

I updated my question to explain why I want to use a regular expression.

Answers

1

Like this:

In [29]: strs1="PARTITION BY HASH(SOME_THING)" 

In [30]: strs2="SUBPARTITION BY HASH(SOME_THING)" 

In [31]: bool(re.match(r"^PARTITION BY HASH",strs1)) 
Out[31]: True 

In [32]: bool(re.match(r"^PARTITION BY HASH",strs2)) 
Out[32]: False 
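
Since the question also asks about deleting the matching lines, the same test can be turned into a filter. The following is only a minimal sketch, written in Python 3 syntax; the helper name `drop_partition_lines` and the sample lines are hypothetical and not taken from the question.

```python
import re

# Pattern from the question; re.match only matches at the start of the string,
# so "SUBPARTITION BY HASH(...)" is not affected.
PARTITION_RE = re.compile(r"PARTITION BY HASH")


def drop_partition_lines(lines):
    """Return the lines that do NOT start with 'PARTITION BY HASH'."""
    return [line for line in lines if not PARTITION_RE.match(line)]


# Hypothetical sample input standing in for the contents of the file.
sample = [
    "CREATE TABLE t (id INT)\n",
    "PARTITION BY HASH(SOME_THING)\n",
    "SUBPARTITION BY HASH(SOME_THING)\n",
]
kept = drop_partition_lines(sample)
print(kept)
```

To apply this to a real file, one option would be to read all lines first (for example with `f.readlines()`), filter them, and then write the kept lines back out; whether to overwrite the original file or write a new one is left open here.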
0

"But printing matched.group fails"

It does exactly what it is supposed to do: it returns the matched text. For example:

>>> import re 
>>> line = "PARTITION BY HASH(something)" 
>>> re.search('^PARTITION BY HASH', line).group() 
'PARTITION BY HASH' 

If you want to print the lines that start with 'PARTITION BY HASH', then, following what Ashwini Chaudhary suggested, you can do this:

with open(path) as f:
    for line in f:
        if line.startswith('PARTITION BY HASH'):
            print line,

Note the trailing comma, which prevents print from appending an extra newline character.

If you insist on using the re package:

import re

with open(path) as f:
    for line in f:
        if re.match('PARTITION BY HASH', line):
            print line,

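Note that re.match is used without the start-of-string anchor ^, because re.match only matches at the beginning of the string (see http://docs.python.org/2/library/re.html#search-vs-match for more information).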
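
To make the difference between re.search and re.match concrete, here is a small sketch in Python 3 syntax. The variable names are only illustrative; the line and pattern are the ones from the question.

```python
import re

line = "SUBPARTITION BY HASH(SOME_THING)"
pattern = "PARTITION BY HASH"

# re.search scans the whole string, so without ^ it finds the pattern
# inside "SUBPARTITION".
unanchored_search = re.search(pattern, line)

# With ^ the pattern must begin at the start of the string, so this fails.
anchored_search = re.search("^" + pattern, line)

# re.match only tries at the start of the string, so no ^ is needed.
match_at_start = re.match(pattern, line)

print(unanchored_search is not None)
print(anchored_search is not None)
print(match_at_start is not None)
```

In other words, for a single line, re.match(pattern, line) and re.search('^' + pattern, line) behave the same way here, while a plain re.search would also hit the SUBPARTITION lines.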