2012-06-09 38 views
1

更新:哇,你們都是對的!
至於原因,我還不知道,我需要: 「從BeautifulSoup進口BeautifulSoup」 ,並添加行:Python /解析:BeautifulSoup錯誤「module obj is callable」with Mechanize

response = br.submit() 
print type(response) #new line 
raw = br.response().read()#new line 
print type(raw)#new line 
print type(br.response().read())#new line 
cooked = (br.response().read())#new line 
soup = BeautifulSoup(cooked) 

/更新

嗯,BeautifulSoup和我是不承認的結果br.response()。read()。 我已經導入BeautifulSoup ...

#snippet: 
# Select the first (index zero) form 
br.select_form(nr=0) 
br.form.set_all_readonly(False) 
br['__EVENTTARGET'] = list_of_dates[0] 
br['__EVENTARGUMENT'] = 'calMain' 
br['__VIEWSTATE'] = viewstate 
br['__EVENTVALIDATION'] = eventvalidation 

response = br.submit() 
print br.response().read() #*#this prints the html I'm expecting* 

soup = BeautifulSoup(br.response().read()) #*#but this throws 
#TypeError: 'module' object is not callable. 
#Yet if I call soup = BeautifulSoup("http://page.com"), it's cool.* 

selecttable = soup.find('table',{'id':"tblItems"}) 
#/snippet 

...等等

所以我神交我有錯的那種「對象」,而人,什麼樣的「對象」的呢BeautifulSoup想要,你認爲?

乾杯和感謝!

+0

你用什麼導入語句導入美麗的湯? – Trevor

+0

相關:http://stackoverflow.com/questions/3368231/beautifulsouphtml-not-working-saying-cant-call-module – jfs

+0

您的更新腳本再次做br.response.read()兩次。你需要調用'BeautifulSoup(raw)'。你應該只調用一次response.read。另外,你可能不應該調用模塊對象,而是應該在響應對象上調用它。 –

回答

7

使用

from BeautifulSoup import BeautifulSoup 

而不是

import BeautifulSoup 

否則,我認爲你正在做正確的事!

0

只需確認您的導入是這樣的:

from BeautifulSoup import BeautifulSoup 

或BeautifulSoup4

from bs4 import BeautifulSoup 
1

您寫道:

response = br.submit() 
print br.response().read() #*#this prints the html I'm expecting* 

soup = BeautifulSoup(br.response().read()) 

你爲什麼不嘗試:

response = br.submit() 
soup = BeautifulSoup(response.read()) 

我懷疑它與你在br.response()上撥打.read()的事實有關,在我使用機械化的歷史中,我總是將response()保存到變量中,並從那裏調用.read()。我不知道它會起作用,但它不能解釋爲什麼print br.response().read()可以工作,但是可以試試。

另外,BeautifulSoup的HTML解析器可能不喜歡什麼機械化餵養它。你可以嘗試使用a different parser

+0

同意你的觀點+指定一個解析器可能也有幫助。 –

0

您是否嘗試過只讀取一次對象然後保存結果。

例如:

raw = br.response().read() 

soup = BeautifulSoup(raw) 

與文件對象,你可以一次讀取它們,然後要重新打開它們再次讀取。看起來你正在讀他們兩次。你應該做的另一件事是在閱讀前後打印br.response的類型簽名。

爲了便於調試,試印刷類型簽名:

print type(response) # see the type of response from above 
raw = br.response().read() 
print type(raw) 
print type(br.response().read()) # see what happens the second time :P 

此外,如果您發佈的堆棧跟蹤也將有所幫助。