Pulling text from multiple <tr>s with Beautiful Soup

My goal is to output a dictionary of course names and their grades, going from this:

<tr> 
<td class="course"><a href="/courses/1292/grades/5610">Modern Europe &amp; the World - Dewey</a></td> 
<td class="percent"> 
    92% 
</td> 
<td style="display: none;"><a href="#" title="Send a Message to the Teacher" class="no-hover"><img alt="Email" src="/images/email.png?1395938788" /></a></td> 
</tr>

to this:

{Modern Europe &amp; the World - Dewey: 92%, the next course name: grade...etc}

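I know how to find just the percent tag or just an href tag, but I'm not sure how to get the text and compile it into a dictionary so that it's more usable. Thanks!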


2014-09-29 A A Ron

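Since each tr contains a series of td elements holding the information you need, you can simply use find_all() to collect them into a list and then extract what you need: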

from bs4 import BeautifulSoup 

soup = BeautifulSoup(""" 
<tr> 
<td class="course"><a href="/courses/1292/grades/5610">Modern Europe &amp; the World - Dewey</a></td> 
<td class="percent"> 
    92% 
</td> 
<td style="display: none;"><a href="#" title="Send a Message to the Teacher" class="no-hover"><img alt="Email" src="/images/email.png?1395938788" /></a></td> 
</tr> 
""", "html.parser")  # name the parser explicitly to avoid bs4's no-parser warning

grades = {} 

# Each row holds the course name in its first cell and the grade in its second.
for tr in soup.find_all("tr"):
    td_text = [td.text.strip() for td in tr.find_all("td")]
    grades[td_text[0]] = td_text[1]

Result:

>>> grades 
{u'Modern Europe & the World - Dewey': u'92%'}
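
To see how the loop above behaves across several rows, here is a minimal sketch. The header row and the second course ("Algebra II - Smith", 88%) are made-up sample data, not taken from the original page, and the `len(cells) < 2` guard is an added assumption so that rows without enough td cells (such as a header row of th cells) are skipped instead of raising an IndexError.

```python
from bs4 import BeautifulSoup

# Sample markup: the first data row comes from the question; the header row
# and the second course are hypothetical, added only for illustration.
html = """
<table>
<tr><th>Course</th><th>Grade</th></tr>
<tr>
<td class="course"><a href="/courses/1292/grades/5610">Modern Europe &amp; the World - Dewey</a></td>
<td class="percent">
    92%
</td>
</tr>
<tr>
<td class="course"><a href="#">Algebra II - Smith</a></td>
<td class="percent"> 88% </td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

grades = {}
for tr in soup.find_all("tr"):
    # Collect the stripped text of every td cell in this row.
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    # Skip rows that do not have both a name cell and a grade cell.
    if len(cells) < 2:
        continue
    grades[cells[0]] = cells[1]

print(grades)
```

Run as-is, `grades` should end up with one entry per data row, with the header row left out.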


2014-09-29 06:06:18

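Try this:
For each tr element, look for the children you need (the ones with the course and percent classes).
If both are present, add an entry to the grades dictionary.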

>>> from bs4 import BeautifulSoup 
>>> html = """ 
... <tr> 
... <td class="course"><a href="/courses/1292/grades/5610">Modern Europe &amp; the World - Dewey</a></td> 
... <td class="percent"> 
...  92% 
... </td> 
... <td style="display: none;"><a href="#" title="Send a Message to the Teacher" class="no-hover"><img alt="Email" src="/images/email.png?1395938788" /></a></td> 
... </tr> 
... """ 
>>> 
>>> soup = BeautifulSoup(html, "html.parser")
>>> grades = {}
>>> for tr in soup.find_all('tr'):
...     td_course = tr.find("td", {"class": "course"})
...     td_percent = tr.find("td", {"class": "percent"})
...     if td_course and td_percent:
...         grades[td_course.text.strip()] = td_percent.text.strip()
... 
>>> 
>>> grades 
{u'Modern Europe & the World - Dewey': u'92%'}
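
The same class-based lookup can also be written with CSS selectors. This is only a sketch of an alternative, assuming a Beautiful Soup version recent enough to provide `select_one()`; it swaps `find()` for `select_one()` and tests the results against `None` explicitly.

```python
from bs4 import BeautifulSoup

html = """
<tr>
<td class="course"><a href="/courses/1292/grades/5610">Modern Europe &amp; the World - Dewey</a></td>
<td class="percent">
    92%
</td>
</tr>
"""

soup = BeautifulSoup(html, "html.parser")

grades = {}
for tr in soup.select("tr"):
    # select_one() returns the first match, or None if nothing matches.
    course = tr.select_one("td.course")
    percent = tr.select_one("td.percent")
    if course is not None and percent is not None:
        grades[course.get_text(strip=True)] = percent.get_text(strip=True)

print(grades)
```

Matching on the class names rather than on cell position keeps the loop working if the columns are reordered or extra cells are added.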


2014-09-29 06:05:36 xecgr
