Python requests returns different data

I'm trying to retrieve historical weather data with this code:
import requests

url = 'https://www.wunderground.com/history/airport/KDCA/2017/05/07/DailyHistory.html'
querystring = {'format': '1'}
headers = {'cache-control': 'no-cache',
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8"}
response = requests.get(url, headers=headers, params=querystring)
print(response.text)
This is what I get back from the request:
TimeEDT,TemperatureF,Dew PointF,Humidity,Sea Level PressureIn,VisibilityMPH,Wind Direction,Wind SpeedMPH,Gust SpeedMPH,PrecipitationIn,Events,Conditions,WindDirDegrees,DateUTC<br />
12:52 AM,50.0,43.0,77,29.63,10.0,WSW,6.9,-,N/A,,Partly Cloudy,240,2017-05-07 04:52:00<br />
1:52 AM,51.1,42.1,71,29.64,10.0,WSW,10.4,-,N/A,,Scattered Clouds,250,2017-05-07 05:52:00<br />
2:52 AM,50.0,41.0,71,29.65,10.0,WSW,10.4,-,N/A,,Partly Cloudy,240,2017-05-07 06:52:00<br />
However, if I open the same URL in my browser (Safari), I get this:
TimeEDT,TemperatureF,Dew PointF,Humidity,Sea Level PressureIn,VisibilityMPH,Wind Direction,Wind SpeedMPH,Gust SpeedMPH,PrecipitationIn,Events,Conditions,FullMetar,WindDirDegrees,DateUTC
12:52 AM,50.0,43.0,77,29.63,10.0,WSW,6.9,-,N/A,,Partly Cloudy,METAR KDCA 070452Z 24006KT 10SM FEW050 10/06 A2963 RMK AO2 SLP034 T01000061 401830100,240,2017-05-07 04:52:00
1:52 AM,51.1,42.1,71,29.64,10.0,WSW,10.4,-,N/A,,Scattered Clouds,METAR KDCA 070552Z 25009KT 10SM SCT080 11/06 A2964 RMK AO2 SLP037 T01060056 10128 20100 53012,250,2017-05-07 05:52:00
2:52 AM,50.0,41.0,71,29.65,10.0,WSW,10.4,-,N/A,,Partly Cloudy,METAR KDCA 070652Z 24009KT 10SM FEW050 10/05 A2965 RMK AO2 SLP040 T01000050,240,2017-05-07 06:52:00
Note that the "FullMetar" column is returned in Safari but is missing from the requests output. (Interestingly, Chrome also omits the "FullMetar" column.)
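As a quick sanity check, here is a small sketch that compares the two header rows copied from the outputs above. The helper names `columns` and `missing_columns` are just illustrative, and it assumes the only noise in the requests output is the trailing `<br />` tag:

```python
# Header rows copied verbatim from the two responses shown above.
requests_header = (
    "TimeEDT,TemperatureF,Dew PointF,Humidity,Sea Level PressureIn,"
    "VisibilityMPH,Wind Direction,Wind SpeedMPH,Gust SpeedMPH,"
    "PrecipitationIn,Events,Conditions,WindDirDegrees,DateUTC<br />"
)
safari_header = (
    "TimeEDT,TemperatureF,Dew PointF,Humidity,Sea Level PressureIn,"
    "VisibilityMPH,Wind Direction,Wind SpeedMPH,Gust SpeedMPH,"
    "PrecipitationIn,Events,Conditions,FullMetar,WindDirDegrees,DateUTC"
)


def columns(header_line):
    """Split a CSV header row into column names, dropping a trailing <br /> tag."""
    cleaned = header_line.strip()
    if cleaned.endswith("<br />"):
        cleaned = cleaned[: -len("<br />")]
    return cleaned.split(",")


def missing_columns(reference_header, other_header):
    """Return the columns of reference_header that other_header lacks, in order."""
    other = set(columns(other_header))
    return [name for name in columns(reference_header) if name not in other]


print(missing_columns(safari_header, requests_header))  # ['FullMetar']
```

So the two responses differ by exactly one column, which suggests the server is choosing between two variants of the same report.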
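I want to retrieve the data with Python, including the "FullMetar" column.
(This is a very simple page with no authentication, CSS, JavaScript, etc., which usually seem to be the root of the problem in the other SO questions I've read.)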
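This looks like an issue with how the page handles the user agent or headers. It has nothing to do with Python or requests. – Alvaro
At the bottom of the page (in my browser) there is the following link: [Show full METARS](http://www.wunderground.com/cgi-bin/findweather/getForecast?setpref=SHOWMETAR&value=1). You could set up a session and fetch that URI first, then get the actual data in a second step. It looks like you have already visited the first URL in your browser (and probably stored the corresponding cookie). See the [requests documentation](http://docs.python-requests.org/en/master/user/advanced/). –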
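A minimal sketch of that two-step idea, assuming the preference link stores a cookie that the history page then honors (I have not verified this against the site). The names `fetch_history_with_metar` and `main` are made up for illustration; the URLs, query parameter, and User-Agent are the ones from the question and the link above:

```python
PREF_URL = ("http://www.wunderground.com/cgi-bin/findweather/getForecast"
            "?setpref=SHOWMETAR&value=1")
DATA_URL = ("https://www.wunderground.com/history/airport/KDCA/2017/05/07/"
            "DailyHistory.html")
HEADERS = {
    "User-Agent": ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
                   "AppleWebKit/602.4.8 (KHTML, like Gecko) "
                   "Version/10.0.3 Safari/602.4.8"),
}


def fetch_history_with_metar(session, data_url=DATA_URL, pref_url=PREF_URL):
    """Fetch the CSV history through a session that first visits the METAR
    preference link, so any cookie it sets is sent with the second request."""
    session.headers.update(HEADERS)

    # Step 1: hit the "Show full METARS" link; the session keeps any cookies.
    session.get(pref_url).raise_for_status()

    # Step 2: request the data itself with the same session.
    response = session.get(data_url, params={"format": "1"})
    response.raise_for_status()
    return response.text


def main():
    import requests  # third-party; only needed for the live request

    with requests.Session() as session:
        print(fetch_history_with_metar(session))
```

Calling `main()` runs it against the live site. If the FullMetar column shows up in the output, the difference between Safari and requests was the stored preference cookie.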
I was thinking it might be cookie-related, but I wasn't sure where to start looking. I think I've found the problem, so I'll post an answer below. –