Python's urllib2 stalls for 5 seconds after 6 GETs, and for 2 seconds after every POST
I have the following code:
import urllib2

def whatever(url, data=None):
    req = urllib2.Request(url)
    res = urllib2.urlopen(req, data)  # a non-None data turns the request into a POST
    html = res.read()
    res.close()
    return html
I tried using it for GET requests like this:
for i in range(1,20):
    whatever(someurl)
The first 6 requests behave correctly, then it blocks for 5 seconds, and the remaining GETs continue normally:
2012-06-29 15:20:22,487: Clear [127.0.0.1:49967]:
2012-06-29 15:20:22,507: Clear [127.0.0.1:49967]:
2012-06-29 15:20:22,528: Clear [127.0.0.1:49967]:
2012-06-29 15:20:22,552: Clear [127.0.0.1:49967]:
2012-06-29 15:20:22,569: Clear [127.0.0.1:49967]:
2012-06-29 15:20:22,592: Clear [127.0.0.1:49967]:
**2012-06-29 15:20:26,486: Clear [127.0.0.1:49967]:**
2012-06-29 15:20:26,515: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,555: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,586: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,608: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,638: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,655: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,680: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,700: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,717: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,753: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,770: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,789: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,809: Clear [127.0.0.1:49967]:
2012-06-29 15:20:26,828: Clear [127.0.0.1:49967]:
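The log above is written by the server, so it shows when requests arrived, not which client call was slow. One way to see the stall from the client side is to time each call. This is only a minimal sketch: `time_calls` and `slow_calls` are hypothetical helper names, and the 1-second threshold is an arbitrary assumption.

```python
import time

def time_calls(func, count):
    """Call func() `count` times and return the elapsed seconds of each call."""
    durations = []
    for _ in range(count):
        start = time.time()
        func()
        durations.append(time.time() - start)
    return durations

def slow_calls(durations, threshold=1.0):
    """Return the 0-based indexes of calls that took longer than `threshold` seconds."""
    return [i for i, d in enumerate(durations) if d > threshold]
```

With the function from the question, this could be used as `slow_calls(time_calls(lambda: whatever(someurl), 19))` to list which requests in the loop were the slow ones.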
With POST (data={'a':'b'}), every request stalls for 2 seconds. I have tried urllib2 and pycurl; both give the same result.
Does anyone have an idea about this weird behavior?
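That did help, but it was not the root of the problem. The actual cause turned out to be that pycurl is not thread-safe. Switching to urllib2 and using multithreading fixed my problem. – pinkdawn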
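The comment does not show how the threads were set up, so the following is only a sketch of one possible arrangement, not the commenter's actual code. `fetch`, `fetch_all` and the `fetcher` parameter are hypothetical names. The import fallback is an assumption added so the sketch also runs on Python 3, where the module is called `urllib.request`.

```python
import threading

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 name for the same functions

def fetch(url, data=None):
    """Fetch one URL and return the response body."""
    res = urllib2.urlopen(urllib2.Request(url), data)
    try:
        return res.read()
    finally:
        res.close()

def fetch_all(urls, fetcher=fetch):
    """Run fetcher(url) for every URL in its own thread.

    Results are returned in the same order as `urls`. If a call raises,
    the exception object is stored in place of the result.
    """
    results = [None] * len(urls)

    def worker(index, url):
        try:
            results[index] = fetcher(url)
        except Exception as exc:
            results[index] = exc

    threads = [threading.Thread(target=worker, args=(i, u))
               for i, u in enumerate(urls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread writes only to its own slot in `results`, so no lock is needed for collecting the output. One thread per URL is fine for a handful of requests; for many URLs a bounded worker pool would be the more usual choice.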