2010-10-31
1

Python / mechanize - cannot select form - ParseError(exc)

I am getting this error:

>>> from mechanize import Browser
>>> br = Browser()
>>> br.open("http://www.bestforumz.com/forum/")
<response_seek_wrapper at 0x21f9fd0 whose wrapped object = <closeable_response at 0x21f9558 whose fp = <socket._fileobject object at 0x021F5F30>>>
>>> br.select_form(nr=0) 

Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    br.select_form(nr=0)
  File "build\bdist.win32\egg\mechanize\_mechanize.py", line 505, in select_form
    global_form = self._factory.global_form
  File "build\bdist.win32\egg\mechanize\_html.py", line 546, in __getattr__
    self.forms()
  File "build\bdist.win32\egg\mechanize\_html.py", line 559, in forms
    self._forms_factory.forms())
  File "build\bdist.win32\egg\mechanize\_html.py", line 228, in forms
    raise ParseError(exc)
ParseError: <unprintable ParseError object>

Please help me out.

Thanks.

Answers

1

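mechanize is not guaranteed to parse all HTML. You may have to do it by hand (which is not too hard; this is Python, after all).

Are you trying to query the site's search.php page? You can use urllib2 for that.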

import urllib
import urllib2

# Placeholder field names; examine the form for the actual variables
values = dict(foo="hello", bar="world")
try:
    # Supplying encoded data makes urllib2 send a POST request
    req = urllib2.Request("http://example.com/search.php",
                          urllib.urlencode(values))
    response_page = urllib2.urlopen(req).read()
except urllib2.HTTPError as details:
    pass  # do something with the error here...
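
As a rough sketch of what "doing it by hand" could look like, the snippet below pulls the field names out of a form so you know what to put in values. It assumes Python 3 and its standard-library html.parser module (the code in this answer is Python 2), and the FormFieldParser class and the SAMPLE_HTML markup are made up for illustration; they are not taken from the forum's real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldParser(HTMLParser):
    """Hypothetical helper: collect each <form> and its named fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> encountered
        self._current = None  # the form currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:
                # Keep the default value so hidden fields are sent back unchanged
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Made-up, deliberately sloppy markup standing in for the real page
SAMPLE_HTML = """
<form action="search.php" method="POST">
  <input type="text" name="keywords" value="">
  <input type=hidden name=sid value=abc123>
  <p>unclosed paragraph
  <input type="submit" name="submit" value="Search">
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE_HTML)
form = parser.forms[0]

# Start from the form's own defaults, then fill in the search term
values = dict(form["fields"], keywords="hello world")
post_data = urlencode(values)
```

The resulting post_data string plays the same role as urllib.urlencode(values) in the code above: it is the body you would send to the form's action URL.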
3

Let me share a little trick I have always used for parsing HTML (the aim is to force mechanize to parse the HTML anyway):

import mechanize

br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))