2014-07-13 89 views

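Error when scraping a web page in Python

I want to scrape this page: http://photo.net/nikon-camera-forum/00aoms. I am using the Requests package in Python. The page loads fine when I enter the URL in a browser, but the output of requests.get(...).text is the error below, and I don't know what the problem is: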

"photo.net Temporarily Unavailable 
photo.net 
Sun Jul 13 19:26:33 EDT 2014 — photo.net is down temporarily for 
system maintenance. Please visit us again later." 

Show your code... – alfasin

Answer


The site performs a simple check on the User-Agent header, so provide one:

>>> import requests 
>>> response = requests.get('http://photo.net/nikon-camera-forum/00aoms', headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4)'}) 
>>> print(response.text)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> 
<html xmlns:fb="http://www.facebook.com/2008/fbml" xmlns:og="http://opengraphprotocol.org/schema/"> 
<head> 
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 
<script type="text/javascript">var _sf_startpt=(new Date()).getTime()</script> 

<title>D800 wifi options? - Photo.net Nikon Forum</title> 
... 
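
Outside the interactive prompt, the same idea can be sketched as a small script. This is a minimal sketch, not the answerer's code: the names BROWSER_UA, make_session and fetch_html are hypothetical, the 10-second timeout is an arbitrary choice, and it assumes the site only inspects the User-Agent header, as the answer suggests.

```python
import requests

# Assumption: a browser-like User-Agent is enough to pass the site's check.
# This string is the one used in the answer above.
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4)"


def make_session(user_agent=BROWSER_UA):
    """Return a requests.Session that sends `user_agent` on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_html(session, url, timeout=10):
    """GET `url` and return the decoded body; raise on HTTP 4xx/5xx."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Calling fetch_html(make_session(), 'http://photo.net/nikon-camera-forum/00aoms') would then send the header on that request and on any later request made through the same session.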

FYI, here is what was returned without passing the header:

>>> response = requests.get('http://photo.net/nikon-camera-forum/00aoms') 
>>> print(response.text)
<html><head><title>photo.net Temporarily Unavailable</title></head> 
<center><h2>photo.net </h2> 
<p><i>Sun Jul 13 19:46:33 EDT 2014</i>&nbsp;&mdash; photo.net is down temporarily for 
system maintenance. Please visit us again later. 
</center> 
</body> 
</html> 
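
Because the placeholder page comes back as ordinary HTML, a scraper may want to recognise it before parsing. The sketch below checks the page title for the wording seen in the output above. The helper names are hypothetical, and it assumes the site keeps the "Temporarily Unavailable" title; the transcript does not show which HTTP status code accompanies the page, so only the body is inspected.

```python
import re

# Marker copied from the <title> of the placeholder page shown above.
# Assumption: the site keeps using this wording; adjust if it changes.
PLACEHOLDER_TITLE = "Temporarily Unavailable"

_TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = _TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def is_placeholder_page(html):
    """Return True when the page title carries the maintenance wording."""
    title = extract_title(html)
    return title is not None and PLACEHOLDER_TITLE.lower() in title.lower()
```

A caller could pass response.text to is_placeholder_page and retry with a User-Agent header, or stop, when it returns True.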

Thank you very much. – user3821329