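How to parse multiple tr tags in Python
I am having trouble parsing all the tr tags that occur in a table. I can parse the first tr tag, but I can't work out how to parse all the subsequent ones. I thought of using a for loop, but it did not work. I have included only the part of the code that deals with the tr tags I want to store in a JSON file.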
Here is my attempt:
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4; the original import was not shown

def parseFacultyPage(br, facultyID):
    # br: a browser object with an open() method (for example mechanize.Browser)
    if br is None:
        return None

    br.open('https://academics.vit.ac.in/student/stud_home.asp')
    response = br.open('https://academics.vit.ac.in/student/class_message_view.asp?sem=' + facultyID)
    html = response.read()
    soup = BeautifulSoup(html, 'html.parser')

    tables = soup.findAll('table')

    # Extracting basic information of the faculty
    infoTable = tables[0].findAll('tr')
    # Only row index 2 is read here, so later rows are never parsed
    cells = infoTable[2].findAll('td')
    name = cells[0].text
    if len(name) == 0:  # compare with ==, not "is"
        return None
    subject = cells[1].text
    msg = cells[2].text
    sent = cells[3].text
    emailmsg = 'Subject: New VIT Email' + msg
Here is a sample of the HTML when there is more than one tr tag.
<table width="79%" border="0" cellpadding="0" cellspacing="0" height="350">
<tr>
<td valign="top" width="1%" bgcolor=#FFFFFF>
</td>
<td valign="top" width="78%" bgcolor=#FFFFFF>
<center><b><u>VIEW CLASS MESSAGE - Winter Semester 2015~16</u></b></center>
<br><br>
<br>
<table cellpadding=4 cellspacing=2 border=0 bordercolor='black' width="100%">
<tr bgcolor=#5A768D>
<td width="25%"><font color=#FFFFFF>From</font></td>
<td width="25%"><font color=#FFFFFF>Course</font></td>
<td><font color=#FFFFFF>Message</font></td>
<td width="10%"><font color=#FFFFFF>Posted On</font></td>
</tr>
<tr bgcolor="#EDEADE" onMouseOut="this.bgColor='#EDEADE'" onMouseOver="this.bgColor='#FFF9EA'">
<td valign="top">RAGHAVAN R (SITE)</td>
<td valign="top">ITE308 - Distributed Systems - TH</td>
<td valign="top">Dear students,
As informed in the class, this is to remind you Today special class from 6 to 6.50 pm at same venue SJT 126.
regards
R. Raghavan
SITE</td>
<td valign="top">11/02/2016 11:42:57</td>
</tr>
<tr bgcolor="#EDEADE" onMouseOut="this.bgColor='#EDEADE'" onMouseOver="this.bgColor='#FFF9EA'">
<td valign="top">SMART (APT) (ACAD)</td>
<td valign="top">STS302 - Soft Skills - SS</td>
<td valign="top">Dear Students,
As 04 Feb 16 to 08 Feb 16 were announced as "No Instruction days", the first assessment that was supposed to happen from 08 Feb 16 to 12 Feb 16 is being postponed to 7th week (15 Feb 16 to 19 Feb 16)
</td>
<td valign="top">10/02/2016 21:48:14</td>
</tr>
<tr bgcolor=#5A768D>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
</table>
<br><br>
</td>
</tr>
</table>
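One way to read every message row, rather than only `infoTable[2]`, is to loop over all `tr` elements of the message table and keep the rows that have four non-empty cells. The following is a minimal sketch, not the original poster's code. It assumes BeautifulSoup 4 (`bs4`) with the built-in `html.parser`; the helper name `parse_message_rows`, the dictionary keys, and the trimmed `SAMPLE_HTML` are made up for illustration.

```python
import json

from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Trimmed copy of the inner message table from the sample page
# (attributes dropped and message bodies shortened for brevity).
SAMPLE_HTML = """
<table>
<tr><td>From</td><td>Course</td><td>Message</td><td>Posted On</td></tr>
<tr>
<td>RAGHAVAN R (SITE)</td>
<td>ITE308 - Distributed Systems - TH</td>
<td>Dear students, ...</td>
<td>11/02/2016 11:42:57</td>
</tr>
<tr>
<td>SMART (APT) (ACAD)</td>
<td>STS302 - Soft Skills - SS</td>
<td>Dear Students, ...</td>
<td>10/02/2016 21:48:14</td>
</tr>
<tr><td> </td><td> </td><td> </td><td> </td></tr>
</table>
"""


def parse_message_rows(table):
    """Return one dict per message row of the given <table> element (hypothetical helper)."""
    messages = []
    rows = table.find_all('tr')
    for row in rows[1:]:  # rows[0] is the header row
        cells = [td.get_text(strip=True) for td in row.find_all('td')]
        # Skip anything that is not a message, such as the empty footer row.
        if len(cells) != 4 or not cells[0]:
            continue
        sender, course, message, posted_on = cells
        messages.append({
            'from': sender,
            'course': course,
            'message': message,
            'posted_on': posted_on,
        })
    return messages


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
messages = parse_message_rows(soup.find('table'))
print(json.dumps(messages, indent=2))
```

To store the result in a file instead of printing it, `json.dump(messages, f)` on an open file object `f` does the same serialization.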
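Your answer is correct, but I can't get it to work on the real HTML page. I only included part of the HTML code in the question, so your answer does not work on the full page; it only stores the message correctly. Could you please look at the HTML code and tell me how to target the table correctly?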
Please check whether `tables[1]` gives you the inner table. I have updated the answer with some explanation. – Obsidian
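The check suggested in this comment can be sketched as follows. Because the message table is nested inside a layout table, a search for all `table` elements returns the outer table first, and searching the outer table for `tr` also descends into the nested rows. This is only an illustration, assuming BeautifulSoup 4 with the built-in `html.parser`; `find_message_table` is a hypothetical helper that looks for the `From` header cell instead of trusting a fixed index, and `NESTED_HTML` is a skeleton of the sample page, not the full page.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Skeleton of the sample page: a layout table wrapping the message table.
NESTED_HTML = """
<table>
<tr>
<td></td>
<td>
<table>
<tr><td>From</td><td>Course</td><td>Message</td><td>Posted On</td></tr>
<tr><td>RAGHAVAN R (SITE)</td><td>ITE308 - Distributed Systems - TH</td>
<td>Dear students, ...</td><td>11/02/2016 11:42:57</td></tr>
</table>
</td>
</tr>
</table>
"""


def find_message_table(soup):
    """Return the innermost table whose first cell reads 'From', or None."""
    for table in soup.find_all('table'):
        if table.find('table') is not None:
            continue  # a layout table that merely wraps another table
        first_cell = table.find('td')
        if first_cell is not None and first_cell.get_text(strip=True) == 'From':
            return table
    return None


soup = BeautifulSoup(NESTED_HTML, 'html.parser')
tables = soup.find_all('table')
inner = find_message_table(soup)

print(len(tables))                    # tables found, outer one first
print(inner is tables[1])             # does tables[1] point at the inner table?
print(len(tables[0].find_all('tr')))  # the outer search also counts nested rows
print(len(inner.find_all('tr')))      # rows that belong to the message table
```

On the real page the index may differ if other tables come before the layout table, which is why matching on the header text can be more robust than a hard-coded `tables[1]`.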
Thank you very much!