Python regular expression to remove comments

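How would I write a regular expression that removes every comment that starts with # and stops at the end of the line, while leaving the first two lines alone? Those two lines are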

#!/usr/bin/python

and

#-*- coding: utf-8 -*-

2011-08-11 captainandcoke

Comments don't slow down your code. Why do you want to remove them? – agf

You don't :). At least, not with a simple regular expression. Consider the following: s = 'not # a # comment!' or this: s = """\n foo #\n bar""" (where '\n' is an actual newline) –
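To make the comment above concrete, here is a small sketch of how a naive pattern goes wrong. The pattern `#.*` and the helper name `naive_strip` are illustrative assumptions, not something proposed in the thread.

```python
import re

def naive_strip(source):
    """Delete everything from '#' to the end of each line (hypothetical, deliberately naive)."""
    return re.sub(r"#.*", "", source)

code = "s = 'not # a # comment!'  # a real comment\n"
stripped = naive_strip(code)
print(repr(stripped))  # the string literal is cut off at its first '#'

# Check whether the stripped text is still valid Python.
try:
    compile(stripped, "<stripped>", "exec")
    still_valid = True
except SyntaxError:
    still_valid = False
print("still valid Python:", still_valid)
```

The regex has no way of knowing that the first # sits inside a string literal, so it truncates the literal and leaves an unclosed quote behind.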

@agf, to make the code harder for the next person to work with! – bgw

You can remove the comments by parsing the Python code with tokenize.generate_tokens. The following is a slightly modified version of this example from the docs:

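import tokenize
import io

def nocomment(s):
    """Return the source string s with all comment tokens removed."""
    result = []
    # generate_tokens expects a readline callable that returns str (Python 3),
    # so wrap the source in io.StringIO rather than io.BytesIO.
    g = tokenize.generate_tokens(io.StringIO(s).readline)
    for toknum, tokval, _, _, _ in g:
        # print(toknum, tokval)
        if toknum != tokenize.COMMENT:
            result.append((toknum, tokval))
    return tokenize.untokenize(result)

with open('script.py', 'r') as f:
    content = f.read()

print(nocomment(content))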

For example:

If script.py contains

def foo(): # Remove this comment 
    ''' But do not remove this #1 docstring 
    ''' 
    # Another comment 
    pass

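then the output of nocomment is (the spacing between tokens may differ slightly, because untokenize only receives token types and strings here)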

def foo(): 
    ''' But do not remove this #1 docstring 
    ''' 

    pass
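The question also asks that the first two lines be left alone. A minimal sketch of how the tokenizer approach might be adapted for that follows; the function name `nocomment_keep_header` and its `header_lines` parameter are hypothetical, and the sketch assumes Python 3.

```python
import io
import tokenize

def nocomment_keep_header(source, header_lines=2):
    """Drop COMMENT tokens, except those on the first `header_lines` lines.

    Hypothetical helper: the name and the parameter are illustrative only.
    """
    kept = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        on_header_line = tok.start[0] <= header_lines  # tok.start is (row, col)
        if tok.type == tokenize.COMMENT and not on_header_line:
            continue
        kept.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize may change spacing between tokens.
    return tokenize.untokenize(kept)

sample = (
    "#!/usr/bin/python\n"
    "#-*- coding: utf-8 -*-\n"
    "x = 1  # remove me\n"
    "s = 'not # a # comment!'\n"
)
cleaned = nocomment_keep_header(sample)
print(cleaned)
```

Because the decision is made per token rather than per character, the # characters inside the string literal survive while the real trailing comment is dropped.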

2011-08-11 20:37:46 unutbu

I'm just curious: how does it handle things like extra whitespace? – bgw

@PiPeep: for an example of how tokenize handles whitespace, see [reindent.py](http://svn.python.org/projects/python/trunk/Tools/scripts/reindent.py). – unutbu

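Actually, I don't think this can be done entirely with a regular expression, because you would need to count quotes to make sure that an instance of # is not inside a string.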

I would look at Python's built-in code parsing modules for help with something like this.
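The answer does not say which parsing module it has in mind. As one possible reading, and purely as an assumption, the sketch below uses the standard `ast` module: `ast.parse` discards comments, and `ast.unparse` (available only in Python 3.9 and later, so not an option when this thread was written) turns the tree back into source text.

```python
import ast

source = (
    "def foo():  # Remove this comment\n"
    "    '''But do not remove this #1 docstring'''\n"
    "    # Another comment\n"
    "    pass\n"
)

# Comments never make it into the syntax tree, so they are absent
# from the regenerated source. Docstrings are ordinary string nodes
# and are kept.
regenerated = ast.unparse(ast.parse(source))
print(regenerated)
```

The trade-off is that this normalizes formatting and also drops the shebang and coding lines, since those are comments too, so they would have to be re-added separately.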

2011-08-11 20:01:58 bgw

sed -e '1,2p' -e '/^\s*#/d' infile

Then wrap it in a subprocess.Popen call.

However, this is no substitute for a real parser! Why does that matter? Well, suppose you have this Python script:

output = """ 
This is 
#1 of 100"""

Boom, any non-parsing solution immediately breaks your script.
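One way to see why a parser gets this case right is to look at how the tokenizer views that script. This is only an illustrative sketch, assuming Python 3; the variable names are made up for the example.

```python
import io
import tokenize

script = 'output = """\nThis is\n#1 of 100"""\n'

# Collect the name of every token type the tokenizer reports.
token_names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(script).readline)
]
print(token_names)
```

The whole triple-quoted literal arrives as a single STRING token and no COMMENT token is produced, so a token-based filter has nothing to remove here.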

2011-08-11 20:02:23 Boldewyn

Why not use Python's 're' package in the example, rather than requiring a platform-dependent tool? – bgw
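For what it's worth, a rough pure-Python counterpart of the sed one-liner could look like the sketch below. The helper name `strip_full_line_comments` and its `keep_first` parameter are hypothetical, and the behavior only approximates the sed command.

```python
import re

FULL_LINE_COMMENT = re.compile(r"^\s*#")

def strip_full_line_comments(text, keep_first=2):
    """Drop comment-only lines, but always keep the first `keep_first` lines.

    Hypothetical helper, named for illustration only.
    """
    kept = []
    for lineno, line in enumerate(text.splitlines(keepends=True), start=1):
        if lineno > keep_first and FULL_LINE_COMMENT.match(line):
            continue
        kept.append(line)
    return "".join(kept)

sample = (
    "#!/usr/bin/python\n"
    "#-*- coding: utf-8 -*-\n"
    "# a full-line comment\n"
    "x = 1  # trailing comments are left alone, as with the sed command\n"
)
result = strip_full_line_comments(sample)
print(result)
```

This avoids the external tool but inherits the same limitation: a line such as "#1 of 100" inside a triple-quoted string would still be deleted.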
