(Python regex) How to capture the text inside a tag


Python 2.7.6 (Python regex) How to capture the text inside a tag

Sample document

<div id="memo_img"> 
     <table style="table-layout: fixed; width: 100%"> 
     <tbody> 
      <tr> 
       <td>This is just simple sentence 
       </td> 
      </tr> 
     </tbody> 
     </table> 
    </div>

The page has a lot of whitespace between the tags and the strings.

I want to capture only "This is just simple sentence".

My regex

<table style="table-layout: fixed; width: 100%"><tbody><tr><td>(.*)</td>

does not work.

How can I ignore spaces and tabs?

Please help me.

Source

2016-06-10 nontoxice

Why are you using a regex for HTML instead of BeautifulSoup? – shivsn

My environment can only use the standard library. – nontoxice

https://docs.python.org/2/library/htmlparser.html – Kendas
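The standard-library parser linked above can be used roughly as follows. This is only a minimal sketch: the class name `CellTextParser` and its `cells` attribute are made up for illustration, and it assumes the text you want sits directly inside `<td>` elements. The import falls back to the Python 2 module name, since the question targets 2.7.6.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class CellTextParser(HTMLParser):
    """Hypothetical helper: collects the text found inside <td> elements."""

    def __init__(self):
        # Explicit base call instead of super(): HTMLParser is an
        # old-style class on Python 2.
        HTMLParser.__init__(self)
        self.depth = 0      # number of <td> elements currently open
        self.cells = []     # one whitespace-normalized string per outermost <td>
        self._chunks = []   # raw text pieces of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'td' and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse every run of spaces, tabs and newlines to one space.
                self.cells.append(' '.join(''.join(self._chunks).split()))
                self._chunks = []

    def handle_data(self, data):
        if self.depth:
            self._chunks.append(data)


sample = '''<div id="memo_img">
     <table style="table-layout: fixed; width: 100%">
     <tbody>
      <tr>
       <td>This is just simple sentence
       </td>
      </tr>
     </tbody>
     </table>
    </div>'''

parser = CellTextParser()
parser.feed(sample)
parser.close()
cells = parser.cells
print(cells)
```

With the sample document from the question, `cells` should end up holding a single entry with the sentence. Because the parser tracks tags rather than exact characters, the whitespace between the tags does not matter.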

You can also approach it with a regex. I made the string a bit messier so you can see how it works in "hard mode":

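import re

# Deliberately messy input: blank lines, indentation and runs of spaces.
a = '''
    <table style="table-layout: fixed; width: 100%"><tbody><tr><td>

            This is just simple sentence
word
       other   word
number
         22 14  </td></tr></tbody></table>
            </div>
'''
# re.DOTALL lets '.' match newlines as well; '*?' stops at the first </td>.
m = re.search(r'<td>(.*?)</td>', a, re.DOTALL)
text = m.group(1)  # named 'text' so the built-in str is not shadowed
# split() without arguments splits on any whitespace run; join with one space.
print(' '.join(text.split()))  # parenthesized form works on Python 2.7 and 3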

The result will be: This is just simple sentence word other word number 22 14
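As for the literal question of how to ignore spaces and tabs between the tags: one option is to keep the pattern from the question and put `\s*` between the tags, since `\s` matches spaces, tabs and newlines. The following is a sketch under the assumption that the tags always appear in exactly this order with exactly this `style` attribute; the variable names are illustrative only.

```python
import re

sample = '''<div id="memo_img">
     <table style="table-layout: fixed; width: 100%">
     <tbody>
      <tr>
       <td>This is just simple sentence
       </td>
      </tr>
     </tbody>
     </table>
    </div>'''

# \s* between the tags absorbs any spaces, tabs and newlines.
# re.DOTALL lets '.' span the line break inside the cell.
pattern = re.compile(
    r'<table style="table-layout: fixed; width: 100%">\s*'
    r'<tbody>\s*<tr>\s*<td>(.*?)</td>',
    re.DOTALL,
)

match = pattern.search(sample)
# Normalize the captured text: strip the ends, collapse inner whitespace.
sentence = ' '.join(match.group(1).split()) if match else None
print(sentence)
```

A pattern this specific breaks as soon as the markup changes (an extra attribute, a different tag order), which is why the shorter `<td>(.*?)</td>` form or a real parser is usually the more robust choice.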

Source

2016-06-10 10:08:46
