Beautiful Soup find_all('p') not working in a Wikipedia parser

I am using someone else's Wikipedia parser, and I want to grab the next p tag element on a Wikipedia page.

import sys 
import urllib 
import urllib2 
import re 
from bs4 import BeautifulSoup 
article = sys.argv[1] 
while article!="Philosophy" and count<MAX_HOPS: 
    articleURL = urllib.quote(article) 
    #print "Article URL: %s" %(articleURL) 
    opener = urllib2.build_opener() 
    opener.addheaders = [('User-agent', 'Mozilla/5.0')] 
    resource = opener.open("http://en.wikipedia.org/wiki/" + articleURL) 
    data = resource.read() 
    resource.close() 
    soup = BeautifulSoup(data) 
    articleTemp = "" 
    #('div',id="bodyContent").find_all('p') 
    for soupContent in soup.find_all('p')

I get a syntax error when I call find_all('p') on the soup. However,

soupContent = soup.find('div',id="bodyContent").p

gives me the first paragraph. The problem is that I need it to go on to the second paragraph as well.


2014-02-25 PGDJ

Is this what your code actually looks like? Why does your for loop have no colon (':') and no statement in its body? – Totem

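If I remove the 'while' loop (because I don't think 'count' or 'MAX_HOPS' are defined in the code above), unindent that section, and print 'soupContent' inside the 'for' loop, it works fine for me. As Totem says, is your syntax error simply because your for loop is not formed correctly? – elParaguayo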

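As Totem pointed out, the for loop in your code is not properly formed. I don't see a problem with the find_all method itself. For example, the following code runs fine for me: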

import sys 
import urllib 
import urllib2 
import re 
from bs4 import BeautifulSoup 
article = "Stack_Overflow_(website)" 

articleURL = urllib.quote(article) 
#print "Article URL: %s" %(articleURL) 
opener = urllib2.build_opener() 
opener.addheaders = [('User-agent', 'Mozilla/5.0')] 
resource = opener.open("http://en.wikipedia.org/wiki/" + articleURL) 
data = resource.read() 
resource.close() 
soup = BeautifulSoup(data) 
articleTemp = "" 
#('div',id="bodyContent").find_all('p') 
for soupContent in soup.find_all('p'): 
    print soupContent.text
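
To reach the second paragraph as well, one option is to restrict the search to the article body and either loop over every p tag or step from the first one to its next sibling. The following is a minimal sketch under some assumptions: it uses Python 3 syntax rather than the Python 2 of the code above, and SAMPLE_HTML is a made-up stand-in for a downloaded page so that it runs without network access. Only the id "bodyContent" comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded Wikipedia page, so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
<p>Navigation text outside the article body.</p>
<div id="bodyContent">
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
  <p>Third paragraph.</p>
</div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Limit the search to the article body, as in the commented-out line of the question.
body = soup.find("div", id="bodyContent")

# find_all returns every matching tag inside the div.
paragraphs = [p.get_text(strip=True) for p in body.find_all("p")]

# Note the colon that ends the for statement and the indented body below it.
for text in paragraphs:
    print(text)

# .p is shorthand for find("p") and returns only the first match;
# find_next_sibling("p") steps to the next <p> at the same nesting level.
first = body.p
second = first.find_next_sibling("p")
print(second.get_text(strip=True))
```

Looping over find_all('p') suits the case where every paragraph has to be inspected in turn, while find_next_sibling('p') is enough when only the paragraph immediately after the first one is needed.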


2014-02-25 23:13:42 elParaguayo
