2011-10-08 159 views

BeautifulSoup problem: printing strings from the findAll function

I am trying to get thread titles from /r/AskReddit. The code below prints None instead of the thread title.

from BeautifulSoup import BeautifulSoup 
import urllib2, json 

site='http://www.reddit.com/r/AskReddit/' 

soup=BeautifulSoup(urllib2.urlopen(site)) 

questions=soup.findAll('p',{"class":"title"}) 


for i in questions: 
     print i.string 
     break 

Answer


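The title is in the string attribute of the a tag, not the p tag. Each p here has more than one child (the a link and the span), so its .string is None. Also, note the trailing space after title in the a tag's class attribute: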

<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p> 

questions=soup.findAll('a',{"class":"title "}) 

The above comes from inspecting the HTML snippet shown earlier in this answer.
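
Matching on the exact string "title " (with the trailing space) is fragile if the markup ever changes. As a rough sketch of the same idea, here is a dependency-free version that uses Python 3's standard html.parser instead of BeautifulSoup and treats the class attribute as a whitespace-separated list. The names SNIPPET, TitleLinkParser and extract_titles are made up for this illustration, and the snippet is the HTML from the answer, not a live page.

```python
from html.parser import HTMLParser

# The HTML fragment quoted in the answer, split across lines for readability.
SNIPPET = (
    '<p class="title"><a class="title " '
    'href="http://www.reddit.com/r/AskReddit/comments/l5157/'
    'whats_the_best_face_you_can_pull_before_and_after/">'
    "What's the best face you can pull? Before and after please.</a> "
    '<span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">'
    'self.AskReddit</a>)</span></p>'
)


class TitleLinkParser(HTMLParser):
    """Collect the text of every <a> whose class list contains "title"."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # text chunks while inside a matching <a>, else None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # split() ignores the trailing space in class="title "
        classes = (dict(attrs).get("class") or "").split()
        if "title" in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


def extract_titles(html):
    """Return the link text of all title anchors found in html."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.titles


print(extract_titles(SNIPPET))
```

The second a tag (the self.AskReddit link) has no class attribute, so it is skipped. Only the thread title link is collected.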