2013-02-26 35 views

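I'm trying to crawl a website (http://www.dataescolabrasil.inep.gov.br/dataEscolaBrasil/home.seam) using mechanize, but I get an error that I don't understand (and therefore can't fix). This is probably due to my limited knowledge of web development. What does this error mean: ValueError: unknown POST form encoding type '' (and how do I fix it)?

Here is what I'm trying to do: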

import mechanize 

# this is the website I want to crawl 
LINK = "http://www.dataescolabrasil.inep.gov.br/dataEscolaBrasil/home.seam" 

br = mechanize.Browser() 
br.open(LINK) 
request = mechanize.Request(LINK) 
response = mechanize.urlopen(request) 

# there are two forms in the page (output omitted); I want the second one. 
forms = mechanize.ParseResponse(response, backwards_compat=False) 
for form in br.forms(): 
    print "Form name:", form.name 
    print form 

br.select_form(nr=1) 
br.form['codEntidadeDecorate:codEntidadeInput'] = '11024968' 
response2 = br.submit() 

And here is the runtime error I get:

Traceback (most recent call last): 
    File "C:\test.py", line 19, in <module> 
    response2 = br.submit() 
    File "build\bdist.win32\egg\mechanize\_mechanize.py", line 541, in submit 
    File "build\bdist.win32\egg\mechanize\_mechanize.py", line 530, in click 
    File "build\bdist.win32\egg\mechanize\_form.py", line 2999, in click 
    File "build\bdist.win32\egg\mechanize\_form.py", line 3201, in _click 
    File "build\bdist.win32\egg\mechanize\_form.py", line 2350, in _click 
    File "build\bdist.win32\egg\mechanize\_form.py", line 3269, in _switch_click 
    File "build\bdist.win32\egg\mechanize\_form.py", line 3257, in _request_data 
ValueError: unknown POST form encoding type '' 

I have tried tweaking the encoding of the strings I pass to the form, and tried to understand GET vs. POST, but with no success.

Answer

I found this form in the page from your example:

<form id="buscaForm" name="buscaForm" method="post" action="/dataEscolaBrasil/home.seam;jsessionid=EFB3D6270E69EAE71733137219C3026B" enctype=""> 

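I think the problem is the empty enctype attribute: the '' in the error message is that attribute's value. You need to set the attribute to application/x-www-form-urlencoded, or remove it so that the default is used.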
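
To see why an empty enctype ends in this message, here is a simplified model of how a form library might pick a POST encoding from the enctype value. It is an illustration only, not mechanize's actual source: the helper name encode_post_body is hypothetical, and multipart/form-data handling is left out.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

URLENCODED = "application/x-www-form-urlencoded"


def encode_post_body(fields, enctype):
    """Hypothetical helper: encode form fields according to the form's enctype.

    Simplified illustration only; multipart/form-data is not implemented.
    """
    if enctype == URLENCODED:
        # fields is a list of (name, value) pairs; the result is a
        # query-string style body such as "name=value&other=value".
        return urlencode(fields)
    # Any other value, including the empty string from enctype="", is rejected.
    raise ValueError("unknown POST form encoding type '%s'" % enctype)
```

With enctype set to application/x-www-form-urlencoded the fields are encoded normally; with the empty string from the page, the call fails with the same message as in the traceback.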

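Thanks, good catch. I'm not sure how to change the tree, though. I tried with 'BeautifulSoup', but no luck. Any ideas? Of course 'br.form["enctype"] = 'application/x-www-form-urlencoded'' won't do it, because enctype is an attribute, not a control. – djas 2013-02-26 19:20:03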

Try 'br.form.enctype = "application/x-www-form-urlencoded"'. enctype is a public attribute. – 2013-02-26 19:43:21
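
Combining that suggestion with the question's code, a minimal sketch might look like the following. The helper name submit_with_enctype is hypothetical. It relies only on select_form, form and submit as used in the question, plus the enctype attribute mentioned in the comment above, and it has not been run against the live site.

```python
DEFAULT_ENCTYPE = "application/x-www-form-urlencoded"


def submit_with_enctype(br, form_index, field_name, value):
    """Hypothetical helper: select a form, repair an empty enctype, submit.

    br is expected to behave like a mechanize.Browser that has already
    opened the page.
    """
    br.select_form(nr=form_index)
    if not br.form.enctype:
        # The page declares enctype="", so fall back to the HTML default.
        br.form.enctype = DEFAULT_ENCTYPE
    br.form[field_name] = value
    return br.submit()


# Usage with the question's values (assumes mechanize is installed and
# the site is reachable):
#
#   import mechanize
#   br = mechanize.Browser()
#   br.open(LINK)
#   response2 = submit_with_enctype(
#       br, 1, 'codEntidadeDecorate:codEntidadeInput', '11024968')
```

The enctype is overwritten only when it is empty, so a form that already declares a valid encoding is left alone.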
