Pytrends results do not match manually downloaded Google Trends data

I am using pytrends to automatically download Google Trends data as CSV. The code I use is below. In this case, I am downloading monthly Google Trends data from 2008 to the present.

from urllib.parse import unquote

from pytrends.request import TrendReq

google_username = "[email protected]"
google_password = "xxxxx"

# Decode the URL-encoded topic ID ('%2Fm%2F07gyp7' -> '/m/07gyp7').
search_term = unquote('%2Fm%2F07gyp7')

# Log in to Google Trends with a custom user agent.
google_trend = TrendReq(google_username, google_password, custom_useragent='Pytrends')

# Restrict the query to the News property.
google_trend_payload = {'gprop': 'news', 'q': search_term}

# Call trend() on the TrendReq instance to get the series as a DataFrame.
trendresult = google_trend.trend(google_trend_payload, return_type='dataframe')
print(trendresult)

Here are the first five months of the pytrends results compared with the CSV downloaded manually from the Google website:

Date   Pytrends data   Manual csv data 
2008-01  21.0     28.0 
2008-02  16.0     19.0 
2008-03  16.0     21.0 
2008-04  15.0     18.0 
2008-05  22.0     31.0

Does anyone know what causes this? Thanks.


2016-09-23 python novice

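I had the same problem, so I had to download the data manually in my project. I have since worked out the reason: it is Google's sampling method. Google returns a different trend series each day. Imagine that Google has 100,000 servers and each query samples only 10,000 of them. So, to get a consistent series, you can download it 30 (or even 50) times and take the average. For series whose values are not too small (perhaps above 30), the standard deviation is about 5%, which is acceptable.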
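The averaging step could be sketched roughly as follows. This is only a minimal illustration, assuming each download has already been turned into a dict that maps a date label to a value. The helper name `average_runs` and the numbers are made up for the example and are not part of pytrends.

```python
from statistics import mean, stdev


def average_runs(runs):
    """Average repeated downloads of the same series, date by date.

    `runs` is a list of dicts, each mapping a date label to the value
    returned by one download (hypothetical input format).
    Returns {date: (mean, relative_std)}.
    """
    summary = {}
    for date in runs[0]:
        values = [run[date] for run in runs]
        avg = mean(values)
        # Sample standard deviation relative to the mean; needs >= 2 runs.
        rel_std = stdev(values) / avg if len(values) > 1 and avg else 0.0
        summary[date] = (avg, rel_std)
    return summary


# Made-up numbers for illustration only, not real Google Trends output.
runs = [
    {"2008-01": 40.0, "2008-02": 50.0},
    {"2008-01": 42.0, "2008-02": 48.0},
    {"2008-01": 38.0, "2008-02": 52.0},
]

for date, (avg, rel_std) in average_runs(runs).items():
    print(f"{date}: mean={avg:.1f}, relative std={rel_std:.1%}")
```

In practice `runs` would hold the 30 to 50 downloads mentioned above, and the relative standard deviation shows whether a given month is stable enough to trust.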

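The difference between the manual download and the gtrend download may come from the different methods they use to extract the data. The gtrend download uses URLs of the form https://www.google.com/trends/fetchContent .... I now know how to handle the manual download, but I know there is also another way to extract the data, using URLs such as https://www.google.com/trends/trendsReport ... The latter returns everything as weekly series (very rich).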

At the moment, there also seems to be a quota limit problem.


2016-12-30 11:25:01 DManh

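I have found that the most effective way to do this is with the Selenium testing framework. I have not finished the work yet, but the basic idea can be found at http://www.yseam.com/blog/TR.html. Because Google has changed the page layout since then, some of the code in that link needs to be updated as well. – DManh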
