2015-09-09 65 views

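I want to search Twitter for 50 keywords. So far I have tried two ways of searching Twitter. The first method prints only the last tweet data, regardless of the keywords. How can I search for a large number of tweets?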

This is the first method I used:

for (i in c("#GMCR","#NFLX","#PCLN","#SWN","#MA","#EW","#WDC", "#ROST", "#RHT", "#ESRX", "#URBN", "#CRM", "#THC", "#BLK", "#AMZN", "#AAPL", "#CERN", "#FFIV", "#DTV", "#AZO", "#ISRG", "#SJM", "#EOG", "#OXY", "#CF", "#GIS", "#FLS", "#WMT", "#NTAP", "#HSP", "#CSX", "#ACT", "#MOS", "#TJX", "#CL", "#MCD", "#COG", "#RRC", "#FLIR", "#CTSH", "#MYL", "#LEG", "#APH", "#VAR", "#HAS", "#FSLR", "#APA", "#ABC", "#UNP", "#EL")) 
     { 
      tweet6<-searchTwitter(i,lang='en',since='2015-09-02', until='2015-09-03') 
     } 
tweet6 

This is the second method: I simply put all 50 keywords into the searchTwitter() function, but it gives a 403 error:

tweet6<-searchTwitter('#GMCR||#NFLX||#PCLN||#SWN||#MA||#EW||#WDC||#ROST||#RHT||#ESRX||#URBN||#CRM||#THC||#BLK||#AMZN||#AAPL||#CERN||#FFIV||#DTV||#AZO||#ISRG||#SJM||#EOG||#OXY||#CF||#GIS||#FLS||#WMT||#NTAP||#HSP||#CSX||#ACT||#MOS||#TJX||#CL||#MCD||#COG||#RRC||#FLIR||#CTSH||#MYL||#LEG||#APH||#VAR||#HAS||#FSLR||#APA||#ABC||#UNP||#EL', 
n=500, lang='en', since='2015-09-02', until='2015-09-03') 

This returns:

Error in twInterfaceObj$doAPICall(cmd, params, "GET", ...) : 
    client error: (403) Forbidden 

Aren't stock hashtags called 'cashtags', so that '#GMCR' would be '$GMCR'? @DougwenKuei – Rime

Answer


You should read the Twitter API documentation. According to the best practices for search:

Limit your searches to 10 keywords and operators.

Queries can be limited due to complexity. If this happens the Search API will respond with the error: {"error":"Sorry, your query is too complex. Please reduce complexity and try again."}

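So you can search for at most 10 at a time. You should not use pipes in the search; the correct way to search for multiple keywords is to separate them with a space. Note that Twitter treats space-separated terms as AND, so use the OR operator between terms if a tweet only needs to match one of them: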

searchTwitter('#GMCR #NFLX #PCLN ...') 
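
As a rough sketch of working within that limit, the 50 hashtags could be split into groups of at most 10 and each group joined into one query string. The helper name build_queries and its size and sep parameters are made up for illustration and are not part of twitteR. The separator defaults to a single space as above; passing sep = " OR " would match any hashtag in a group, and since operators may count toward the limit, a smaller group size may be needed in that case.

```r
# Hypothetical helper (not part of twitteR): split keywords into groups
# of at most `size` and join each group into one query string.
build_queries <- function(keywords, size = 10, sep = " ") {
  # Group index for each keyword: 1, 1, ..., 2, 2, ...
  groups <- split(keywords, ceiling(seq_along(keywords) / size))
  # unname() drops the "1", "2", ... names that split() adds
  unname(vapply(groups, paste, character(1), collapse = sep))
}

tags <- c("#GMCR", "#NFLX", "#PCLN", "#SWN", "#MA", "#EW")
queries <- build_queries(tags, size = 3)
print(queries)
```

Each element of queries could then be passed to searchTwitter() in turn.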

I think your loop is the best approach for this; just make sure you don't hit the rate limit:

The GET search/tweets is part of the Twitter REST API 1.1 and is rate limited similarly to other v1.1 methods. See REST API Rate Limiting in v1.1 for information on that model. At this time, users represented by access tokens can make 180 requests/queries per 15 minutes. Using application-only auth, an application can make 450 queries/requests per 15 minutes on its own behalf without a user context.
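
The loop in the question assigns every result to the same variable, so only the last keyword's tweets survive. A minimal sketch of a loop that keeps each result and optionally pauses between requests might look like the following. The names search_all, search_fn and pause are hypothetical; the search function is passed in as an argument so the sketch can be run offline with a stub instead of a real API call.

```r
# Hypothetical helper: run one search per keyword and keep every result.
# `search_fn` is whatever performs the search (for example a wrapper
# around twitteR's searchTwitter()); `pause` is the number of seconds
# to wait between requests.
search_all <- function(keywords, search_fn, pause = 0) {
  results <- vector("list", length(keywords))
  names(results) <- keywords
  for (i in seq_along(keywords)) {
    # Assign by position so each keyword gets its own slot instead of
    # overwriting a single variable on every iteration
    results[i] <- list(search_fn(keywords[i]))
    if (pause > 0 && i < length(keywords)) Sys.sleep(pause)
  }
  results
}

# Stub standing in for the real API call, so the sketch runs offline
fake_search <- function(q) paste("tweets for", q)
out <- search_all(c("#AAPL", "#AMZN"), fake_search)
print(names(out))
```

With an authenticated twitteR session, search_fn could be something like function(q) searchTwitter(q, lang='en', since='2015-09-02', until='2015-09-03'). At 180 requests per 15 minutes, one request every 5 seconds stays within the user-auth limit, so pause = 5 would be a conservative choice, assuming each call results in a single API request.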