2017-09-14 19 views

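I open a file and try to count the total number of occurrences of certain variables in it, using re.findall. How do I search only for variables that begin a line?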

import re

ATTRS = ['test1', 'test2', 'test3']
with open('_file_name', 'r') as fh:
    contents = fh.read()
    for attr in ATTRS:
        # counts every occurrence of attr, anywhere in the file
        count = len(re.findall(attr, contents))
        print(count)

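This code seems to work fine for finding the matching strings anywhere in the file. However, I only want to count the occurrences at the start of a line.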

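Can you add a sample of the file, along with what you expect to get and what you have tried so far? Also, tag the question with the language; it looks like Python. – tima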

Answer

See if this works for you. It is a simple piece of Python 3 code.

import re

def counter(attrs):
    with open('dummy.txt', 'r') as f:
        contents = f.read()
    for attr in attrs:
        count = 0
        # optional leading whitespace, then attr as a whole word
        rp = re.compile(r'^\s*' + attr + r'\b')
        for line in contents.split('\n'):
            # match() anchors at the start of each line
            if rp.match(line) is not None:
                count += 1
        print(count)

Consider a dummy file like this:

hello 67989 hello 
hello 67989 hello hello 67989 hello 
    hello 67989 hello 
    hello 67989 hello 
pssss hello 

Testing the code with attrs = ['s', 'hello', 'pssss']:

In [12]: attrs = ['s', 'hello', 'pssss'] 
In [13]: counter(attrs) 
     0 
     4 
     1 

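This code looks at the first word of each line, including indented lines. If you want to match strictly at the start of the line, remove \s* from the regular expression.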
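Since the question uses re.findall, here is a minimal sketch of the strict variant that keeps it. It assumes the attributes are literal names (hence re.escape, which the code above does not use), and the helper name count_at_line_start and the sample text are made up for illustration. With the re.MULTILINE flag, ^ matches at the start of every line, so the file contents do not need to be split:

```python
import re

def count_at_line_start(attrs, text):
    """Return {attr: number of lines in text that begin with attr as a whole word}."""
    counts = {}
    for attr in attrs:
        # re.MULTILINE makes ^ match at the start of each line, not only of the string.
        # re.escape is an assumption: it treats attr as literal text, not as a pattern.
        pattern = re.compile(r'^' + re.escape(attr) + r'\b', re.MULTILINE)
        counts[attr] = len(pattern.findall(text))
    return counts

# Hypothetical sample: one plain line, one indented line, one other word, one run-together word.
sample = "hello 1 hello\n    hello 2\npssss hello\nhellohello\n"
print(count_at_line_start(['s', 'hello', 'pssss'], sample))
# {'s': 0, 'hello': 1, 'pssss': 1}
```

The indented line is not counted here because \s* was dropped, and hellohello is rejected by the word boundary.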

Explanation:

^ -> Start of string 
\s* -> 0 or more space like characters (including tabs) 
attr -> Dynamic attribute like `hello` 
\b -> Word boundary (written r'\b' or '\\b' in Python source) to make sure that when you search for `hello`, `hellohello` doesn't match
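To see what each piece of the pattern contributes, here is a small sketch that tries the pattern with and without \s* on a few made-up lines. The variable names and test lines are only for illustration:

```python
import re

with_indent = re.compile(r'^\s*hello\b')  # leading whitespace allowed
strict = re.compile(r'^hello\b')          # must begin at the very start of the line

lines = ['hello world', '    hello world', 'hellohello', 'say hello']
# Each entry is (line, matches with \s*, matches without \s*).
results = [(line, bool(with_indent.match(line)), bool(strict.match(line)))
           for line in lines]

for line, indented_ok, strict_ok in results:
    print(repr(line), indented_ok, strict_ok)
# 'hello world' True True
# '    hello world' True False
# 'hellohello' False False
# 'say hello' False False
```

Only the indented line differs between the two patterns. `hellohello` fails on the word boundary and `say hello` fails because match() only looks at the start of the line.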