2016-07-10 42 views
1

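Python Scrapy: TypeError: to_bytes must receive a unicode, str or bytes object, got int

I can't figure out what is wrong with this code. I am trying to scrape data from 99acres.com and am passing the POST parameters with the request. Here is the code: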

from scrapy import Spider 
from scrapy.http import FormRequest 
from scrapy.selector import HtmlXPathSelector 


class aagSpider(Spider): 
    name = "acre" 
    start_urls = ["http://www.99acres.com"] 

    def parse(self, response): 
        frmdata3 = {"Refine_Localities": "Refine Localities", "action": "/do/quicksearch/search", "bedroom_num": "",
                    "budget_max": "", "budget_min": "", "city": 4,
                    "class": "", "fullSelectedSuggestions": "laxmi nagar, delhi east", "isvoicesearch": "N",
                    "keyword": "",
                    "keyword_suggest": "laxmi nagar, delhi east;",
                    "locality_array[]": "233",
                    "locality_array[]": "233",
                    "locality_array[]": "233",
                    "lstAcn": "HP_R",
                    "lstAcnId": "0",
                    "np_search_type": "NL,NP,R2M",
                    "preference": "S",
                    "property_type": "23",
                    "refine_results": "Y",
                    "res_com": "R",
                    "search_location": "HP",
                    "search_type": "QS",
                    "searchform": "1",
                    "selected_tab": "3",
                    "src": "CLUSTER",
                    "strEntityMap": "[{'type':'locality'},{'1':['laxmi nagar, delhi east','CITY_4, LOCALITY_233, PREFERENCE_S, RESCOM_R']}]",
                    "suggestion": "CITY_4, LOCALITY_233, PREFERENCE_S, RESCOM_R",
                    "texttypedtillsuggestion": "laxmi"}

        yield FormRequest(response.url, callback=self.fourth, formdata=frmdata3)

    def fourth(self, response):
        print "11111111111111111111111111111111111111111111111111"

I am trying to fetch the page after submitting the parameters above, but I get this error:

Traceback (most recent call last): 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/utils/defer.py", line 102, in iter_errback 
    yield next(it) 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output 
    for x in result: 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/spidermiddlewares/referer.py", line 22, in <genexpr> 
    return (_set_referer(r) for r in result or()) 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr> 
    return (r for r in result or() if _filter(r)) 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr> 
    return (r for r in result or() if _filter(r)) 
    File "/home/user/tutorial/tutorial/spiders/acre.py", line 37, in parse 
    yield FormRequest(response.url,callback=self.fourth,formdata=frmdata3) 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/http/request/form.py", line 28, in __init__ 
    querystr = _urlencode(items, self.encoding) 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/http/request/form.py", line 61, in _urlencode 
    for v in (vs if is_listlike(vs) else [vs])] 
    File "/home/user/.local/lib/python2.7/site-packages/scrapy/utils/python.py", line 117, in to_bytes 
    'object, got %s' % type(text).__name__) 
TypeError: to_bytes must receive a unicode, str or bytes object, got int 
+0

Can you provide a link to the final page you are trying to scrape? – MetalloyD

+4

Probably due to "city": 4. Try with "city": "4" –

+0

Thanks, Paul. It works now. – user3809411

Answers

2

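The reason behind this error is that form data must never contain raw null, boolean (True/False), or numeric values. Always pass them as strings. In Python, null is None, but in form data it is written as 'null'. True and False are written as 'true' and 'false'. Numbers must be converted to strings as well, for example "city": "4" instead of "city": 4.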
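
As a rough illustration of the rule above, here is a minimal sketch of a helper that converts every value to a string before the dict is handed to FormRequest. The name stringify_formdata is hypothetical and not part of Scrapy, and the 'null' / 'true' / 'false' spellings simply follow the convention described in this answer; the target site may expect something different. List values are kept as lists, on the assumption (suggested by the is_listlike check in the traceback) that Scrapy encodes one key per list element.

```python
def stringify_formdata(data):
    """Return a copy of `data` in which every value is a string.

    Hypothetical helper, not part of Scrapy.
    """
    text_types = (bytes, type(u""))  # str/unicode on Python 2, bytes/str on Python 3

    def to_text(value):
        if isinstance(value, text_types):
            return value  # already a string, leave untouched
        if value is None:
            return "null"  # convention from the answer above
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)  # ints, floats, and anything else

    result = {}
    for key, value in data.items():
        if isinstance(value, (list, tuple)):
            # Keep repeated fields as a list of strings.
            result[key] = [to_text(item) for item in value]
        else:
            result[key] = to_text(value)
    return result
```

In the spider from the question it could be used as formdata=stringify_formdata(frmdata3) in the FormRequest call. Note also that a Python dict keeps only the last of several identical keys, so the three "locality_array[]" entries in the question collapse into one; passing a single key with a list value may be a way to send the field more than once.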
