Iterating over HTML with BeautifulSoup

I want to iterate over an HTML file with BeautifulSoup and find the tag whose content is "Preferred Name". Here is the tag I am looking for (this is part of the file I want to search):

<td nowrap class="label"> 
    Preferred Name 
    <span class="slot_labels"></span> 
    </td>

I tried this (doc is the name of the HTML file being searched):

soup = BeautifulSoup(doc) 
tags = soup.fetch('td') 
for tag in tags: 
    if tag.contents[0] == 'Preferred Name': 
        return tag

This code does not work. Can anyone help?

Source

2013-03-01 Yishen Chen

The content includes surrounding whitespace, so an exact comparison with 'Preferred Name' fails. Strip it first:

soup = BeautifulSoup(doc) 
tags = soup.fetch('td') 
for tag in tags: 
    if tag.contents[0] and tag.contents[0].strip() == 'Preferred Name': 
        return tag

Source

2013-03-01 00:47:35 isedev

It works! But I had to put the "if" inside a "try .. except", because contents[0] of some tags is NoneType ... Thanks! – 2013-03-01 00:52:32

Edited accordingly ... but without the try ... except. – isedev 2013-03-01 00:54:05
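
For readers who want to avoid the try/except mentioned in the comments, here is a minimal sketch of the same search. It makes several assumptions that are not in the thread: it uses the bs4 package with find_all in place of fetch, it wraps the loop in a hypothetical helper named find_label_cell so that return is valid, and it uses an invented sample document for illustration. Cells with no children, or whose first child is a tag rather than text, are skipped explicitly.

```python
from bs4 import BeautifulSoup, NavigableString


def find_label_cell(html, label="Preferred Name"):
    """Return the first <td> whose leading text, once stripped, equals label.

    find_label_cell is a hypothetical helper name, not part of BeautifulSoup.
    Returns None when no cell matches.
    """
    soup = BeautifulSoup(html, "html.parser")
    for td in soup.find_all("td"):
        # An empty cell has no contents[0]; skip it instead of catching IndexError.
        if not td.contents:
            continue
        first = td.contents[0]
        # Only text nodes support strip(); a leading child tag is skipped.
        if isinstance(first, NavigableString) and first.strip() == label:
            return td
    return None


# Invented sample: an empty cell, a cell starting with a tag, and the target cell.
sample = """
<table><tr>
<td></td>
<td><span>x</span></td>
<td nowrap class="label">
    Preferred Name
    <span class="slot_labels"></span>
    </td>
</tr></table>
"""

cell = find_label_cell(sample)
```

Checking the node type up front keeps the loop free of exception handling and makes it clear which cells are being ignored and why.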
