Print all words after a specific word using Python

Suppose I have a file with the following data:

<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60033 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>' 
<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60500 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>'

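In the strings above, how can I retrieve the string that follows title= ("[Title] @Blue: Session_TIMEOUT after 60033 ms") for both lines of HTML, and write that string back on the next line?

I want the output to look like this: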

<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60033 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>' 
&#64;Blue: Session_TIMEOUT after 60033 ms 
<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60500 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>' 
&#64;Blue: Session_TIMEOUT after 60500 ms

Please help me with this. Thanks in advance.


2012-11-28 Surya Gupta

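You can use a regular expression. If you can be sure that the string you are interested in always sits between title=" and a closing ms, that is, it is anchored, then you can do:

import re  # regular expression module
g = re.compile('title="(.*?ms)"').search(line)  # search your string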

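Your string will then be available as g.group(1). You may find it useful to read about regular expressions in the Python documentation; they are a very important programming tool in every language, especially in scripting.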
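To make the idea above concrete, here is a minimal sketch that runs the pattern over the two sample lines from the question and puts the captured text on the following line. The `\[Title\]\s*` prefix in the pattern is my own addition so that the output matches what the question asks for, and the names `TITLE_RE`, `annotate`, and `sample` are made up for illustration.

```python
import re

# Assumption: the wanted text starts right after "[Title] " inside title="..."
# and ends with "ms", as in the sample data from the question.
TITLE_RE = re.compile(r'title="\[Title\]\s*(.*?ms)"')

def annotate(lines):
    """Return the lines with the captured title text added after each match."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        out.append(line)
        match = TITLE_RE.search(line)
        if match:  # lines without a matching title are passed through unchanged
            out.append(match.group(1))
    return out

sample = [
    '<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60033 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>',
    '<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60500 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>',
]

result = annotate(sample)
print("\n".join(result))
```

The lazy `.*?` matters here: it stops at the first `ms"` instead of running on to the end of the line.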

You might also want to add the regex tag to your question.


2012-11-28 09:20:10

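What you mentioned was really useful, thank you. But I have one more question; please take another look at my edited question at the top of the page.

With the Beautiful Soup library you can do this very easily: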

from BeautifulSoup import BeautifulSoup 
myHTML = '<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60033 ms">[Title] &#64;BlueScreen: RCU_PCPU_TIMEOUT after 60033 ms</a></td>' 
html_doc = BeautifulSoup(myHTML) 
print html_doc.td.a.string

Beautiful Soup can be installed with pip or easy_install, or with apt-get if you are on a Debian-based operating system, whichever you prefer:

pip install BeautifulSoup 
easy_install BeautifulSoup 
apt-get install python-beautifulsoup
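If installing a third-party package is not an option, the same "parse the HTML instead of slicing strings" idea can be sketched with the standard library's `html.parser` module in Python 3. This is a swapped-in technique, not Beautiful Soup itself, and the class name `TitleCollector` is hypothetical. Note that the parser decodes character references in attribute values, so `&#64;` comes back as `@`.

```python
from html.parser import HTMLParser

class TitleCollector(HTMLParser):
    """Collects the title attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "title" and value is not None:
                    self.titles.append(value)

my_html = '<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60500 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>'

collector = TitleCollector()
collector.feed(my_html)
collector.close()
titles = collector.titles
print(titles[0])
```

Unlike reading the link text, this reads the title attribute, which is the value that differs between the two sample lines in the question.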


2012-11-28 09:28:02

A simple way:

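# Find where the text after '[Title]' starts inside the title="..." attribute
start = line.index('[Title]') + len('[Title]')
# The attribute value ends at the first '">' after that point
text = line[start:line.index('">', start)].strip()
# Print the original line, then the extracted text on the next line
print(line + '\n' + text)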

That said, a better way to go about this would be to use regular expressions, as suggested by CodeChordsman.
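Since the question is about a file, here is a rough sketch of how the slicing approach could be applied line by line, writing each line followed by the extracted text. The helper names `extract_after_marker` and `annotate_file` and the file names `input.html` and `output.html` are hypothetical, and the demo uses a temporary directory so it does not touch real files. It assumes the title attribute is closed by `">`, as in the sample data.

```python
import os
import tempfile

MARKER = '[Title]'

def extract_after_marker(line):
    """Return the text after MARKER inside the title attribute, or None."""
    start = line.find(MARKER)
    if start == -1:
        return None
    start += len(MARKER)
    end = line.find('">', start)  # assumed end of the title="..." value
    if end == -1:
        return None
    return line[start:end].strip()

def annotate_file(src_path, dst_path):
    """Copy src to dst, adding the extracted text after each matching line."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for raw in src:
            line = raw.rstrip('\n')
            dst.write(line + '\n')
            text = extract_after_marker(line)
            if text is not None:
                dst.write(text + '\n')

sample_line = '<td class="w"><a href="show.cgi?id=120012" title="[Title] &#64;Blue: Session_TIMEOUT after 60500 ms">[Title] &#64;Blue: Session_TIMEOUT after 60033 ms</a></td>'

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'input.html')
    dst_path = os.path.join(tmp, 'output.html')
    with open(src_path, 'w') as f:
        f.write(sample_line + '\n')
    annotate_file(src_path, dst_path)
    with open(dst_path) as f:
        output_lines = f.read().splitlines()

print('\n'.join(output_lines))
```

Using `find` rather than `index` lets lines without the marker pass through unchanged instead of raising `ValueError`.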


2012-11-28 09:38:39 asheeshr

I have changed my question; please take a look and reply.
