Remove characters at the beginning and end of a line

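I want to use regular expressions to strip some symbols from a string, for example:

== (whether it occurs at the beginning or at the end of a line),

* (only at the beginning of a line).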

def some_func(): 
    clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beg and at the end of a line. 
    clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line.

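What is wrong with my code? It looks like the expressions are wrong. How do I remove a character/symbol if it is at the beginning or the end of a line (with one or more occurrences)?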


2010-11-06 Gusto

If you only want to remove characters from the beginning and the end, you can use the string.strip() method. That gives code like this:

>>> s1 = '== foo bar ==' 
>>> s1.strip('=') 
' foo bar ' 
>>> s2 = '* foo bar' 
>>> s2.lstrip('*') 
' foo bar'

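The strip method removes the characters given as its argument from both the beginning and the end of the string; lstrip removes them from the beginning only, and rstrip from the end only.

If you really want to use regular expressions, they would look something like this:

clean = re.sub(r'(^={2,})|(={2,}$)', '', clean) 
clean = re.sub(r'^\*+', '', clean)

But IMHO, using strip/lstrip/rstrip is the most appropriate choice for what you want to do.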
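For what it's worth, here is a minimal sketch of the regex approach applied to text that holds several lines. It assumes the string may contain newlines, in which case re.MULTILINE is needed so that ^ and $ match at every line boundary instead of only at the ends of the whole string. The function name strip_markers and the sample text are made up for illustration.

```python
import re

def strip_markers(text):
    # Remove runs of two or more '=' at the start or at the end of each line.
    text = re.sub(r'^={2,}|={2,}$', '', text, flags=re.MULTILINE)
    # Remove one or more '*' at the start of each line.
    return re.sub(r'^\*+', '', text, flags=re.MULTILINE)

sample = "== Heading ==\n* item one\na = b == c"
print(strip_markers(sample))
```

The '==' in the middle of the last line is left alone because neither anchor matches there.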

Edit: at Nick's suggestion, here is a solution that does all of this in one line:

clean = clean.lstrip('*').strip('= ')

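(A common mistake is to think that these methods remove the characters in the order they appear in the argument. In fact, the argument is just a set of characters to remove, whatever their order, which is why .strip('= ') removes every '=' and ' ' from the beginning and the end, and not just the string '= '.)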
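A small sketch of that point, using made-up sample strings: the argument to strip is treated as a set of characters, so its order does not matter and any mix of those characters is removed from both ends.

```python
# The argument to strip() is a set of characters, not a literal prefix/suffix.
s = ' =  == foo = bar ==  = '
a = s.strip('= ')
b = s.strip(' =')
print(repr(a))
print(repr(b))

# The one-liner: drop leading '*' first, then '=' and spaces from both ends.
t = '* == foo bar =='
c = t.lstrip('*').strip('= ')
print(repr(c))
```

Characters in the middle of the string (the '=' between foo and bar) are never touched; only the two ends are trimmed.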


2010-11-06 15:41:03 MatToufoutu

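+1 Regular expressions seem a bit overkill for this. You might want to provide a 'complete' solution: s.strip('=').strip('*').strip() – 2010-11-06 15:42:36

I don't know why, but it doesn't work for me (( – Gusto 2010-11-06 16:21:30

@Gusto What do you get instead of the expected result? I just tested it again and it works for me :( – MatToufoutu 2010-11-06 16:29:17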

Your regular expressions contain extra spaces. Even a single space counts as a character.

r'^(?:\*|==)|==$'
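A quick sketch of how that pattern could be used with re.sub. The re.MULTILINE flag and the sample text are assumptions added here, on the guess that the input holds several lines. Note that, as written, the pattern removes exactly one '*' or exactly two '=', not longer runs.

```python
import re

# '*' or '==' at the start of a line, or '==' at the end of a line.
pattern = re.compile(r'^(?:\*|==)|==$', re.MULTILINE)

lines = "== Title ==\n* bullet\nx == y"
cleaned = pattern.sub('', lines)
print(cleaned)

# Only two '=' are consumed, so a run of three leaves one behind.
leftover = pattern.sub('', '=== Title')
print(leftover)
```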


2010-11-06 15:31:17

First, you should watch out for the space before the "{", because the quantifiers in your example apply to that space.
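To illustrate with a made-up sample string: in r'= {2,}' the quantifier binds to the space, so the pattern means one '=' followed by two or more spaces.

```python
import re

text = '== a =   b'

# Quantifier applies to the space: '=' followed by 2+ spaces.
with_space = re.findall(r'= {2,}', text)
# Quantifier applies to '=': a run of 2+ '='.
without_space = re.findall(r'={2,}', text)
print(with_space, without_space)

# Same issue for r'^\* {1,}': it needs at least one space after the '*'.
star_no_space = re.sub(r'^\* {1,}', '', '*foo')
star_fixed = re.sub(r'^\*{1,}', '', '*foo')
print(star_no_space, star_fixed)
```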

To remove "=" (two or more) only at the beginning or at the end, you also need a different regular expression, for example:

clean = re.sub(r'^(==+)?(.*?)(==+)?$', r'\2', s)

If you don't put any "^" or "$" in the expression, it can match anywhere (even in the middle of the string).
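A short sketch contrasting the anchored expression above with an unanchored one, on made-up sample strings. The helper name strip_equals is hypothetical.

```python
import re

def strip_equals(s):
    # Keep only group 2: the text between an optional leading and an
    # optional trailing run of two or more '='.
    return re.sub(r'^(==+)?(.*?)(==+)?$', r'\2', s)

anchored_edges = strip_equals('== foo ==')
anchored_middle = strip_equals('a == b')
# Without anchors the run of '=' is removed wherever it occurs.
unanchored_middle = re.sub(r'={2,}', '', 'a == b')

print(repr(anchored_edges), repr(anchored_middle), repr(unanchored_middle))
```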


2010-11-06 15:38:28 6502

Instead of replacing, how about keeping the part you want?

tu = ('======constellation==' , '==constant=====' , 
     '=flower===' , '===bingo=' , 
     '***seashore***' , '*winter*' , 
     '====***conditions=**' , '=***trees====***' , 
     '***=information***=' , '*=informative***==') 

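import re 
# Skip an optional leading run of '=' (two or more) or of '*', then capture
# everything up to a trailing run of two or more '='.
RE = r'((===*)|\**)?(([^=]|=(?!=+\Z))+)' 
pat = re.compile(RE) 

for ch in tu: 
    print(ch, ' ', pat.match(ch).group(3))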

Result:

======constellation== constellation 
==constant===== constant 
=flower=== =flower 
===bingo= bingo= 
***seashore*** seashore*** 
*winter* winter* 
====***conditions=** ***conditions=** 
=***trees====*** =***trees====*** 
***=information***= =information***= 
*=informative***== =informative***

Or do you actually want

====***conditions=** to give conditions=** ?

***====hundred====*** to give hundred====*** ?

for the beginning?


2010-11-06 16:47:31 eyquem

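I made some mistakes in my code. The result is exactly what I want, but I would like to write it to a file (UTF-8 encoded) instead of printing it. Any suggestions? – Gusto 2010-11-06 17:23:16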

I think the following code will do the job:

tu = ('======constellation==' , '==constant=====' , 
     '=flower===' , '===bingo=' , 
     '***seashore***' , '*winter*' , 
     '====***conditions=**' , '=***trees====***' , 
     '***=information***=' , '*=informative***==') 

import re 

# The built-in open() handles the encoding directly in Python 3.
with open('testu.txt', 'w', encoding='utf-8') as f: 
    pat = re.compile(r'(?:==+|\*+)?(.*?)(?:==+)?\Z') 
    xam = max(map(len, tu)) + 3  # column width for the left-hand side
    res = '\n'.join(ch.ljust(xam) + pat.match(ch).group(1) 
                    for ch in tu) 
    f.write(res) 
    print(res)

Where was my brain when I wrote the RE in my previous post??! The non-greedy quantifier .*? before ==+\Z is the real solution.
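A small sketch of why the non-greedy quantifier matters here. The greedy variant is not from the posts above; it is written out only for comparison, and the sample strings are taken from the tuple tu.

```python
import re

# Greedy '.*' swallows the trailing '=' run before '(?:==+)?' gets a chance.
greedy = re.compile(r'(?:==+|\*+)?(.*)(?:==+)?\Z')
# Non-greedy '.*?' stops as soon as the rest of the pattern can match.
lazy = re.compile(r'(?:==+|\*+)?(.*?)(?:==+)?\Z')

s = '==constant====='
greedy_result = greedy.match(s).group(1)
lazy_result = lazy.match(s).group(1)
print(greedy_result, lazy_result)

# A single trailing '=' is not a run of two or more, so it is kept.
single = lazy.match('===bingo=').group(1)
print(single)
```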


2010-11-06 21:03:07 eyquem
