Python: Too many requests


I wrote a Python program that parses the subreddits pages and builds a list of them. The problem is that whenever I run it, the reddit server responds with the error: 429, 'too many requests'.

How can I reduce the number of requests so that I don't get rate limited?

from bs4 import BeautifulSoup as bs 
from time import sleep 
import requests as req 

html = req.get('http://www.reddit.com/') 
print html 
soup = bs(html.text) 

# http://www.reddit.com/subreddits/ 
link_to_sub_reddits = soup.find('a',id='sr-more-link')['href'] 

print link_to_sub_reddits 

L=[] 

for navigate_the_pages in xrange(1): 

     res = req.get(link_to_sub_reddits) 

     soup = bs(res.text) 
     # soup created 
     print soup.text 

     div = soup.body.find('div', class_=lambda(class_):class_ and class_=='content') 
     div = div.find('div', id= lambda(id):id and id=='siteTable') 

     cnt=0 

     for iterator in div: 

      div_thing = div.contents[cnt] 

      if not div_thing=='' and div_thing.name=='div' and 'thing' in div_thing['class']: 

       div_entry = div_thing.find('a',class_=lambda(class_):class_ and 'entry' in class_) 
       # div with class='entry......' 

       link = div_entry.find('a')['href'] 
       # link of the subreddit 
       name_of_sub = link.split('/')[-2] 
       # http://www.reddit.com/subreddits/ 
       # ['http:', '', 'www.reddit.com', 'subreddits', ''] 

       description = div_entry.find('strong').text 
       # something about the community 

       p_tagline = div_entry.find('p',class_='tagline') 
       subscribers = p_tagline.find('span',class_='number').text 

       L.append((name_of_sub, link, description, subscribers)) 

      elif not div_thing=='' and div_thing.name=='div' and 'nav-buttons' in div_thing['class']: 
       # case when we find 'nav' button 

       link_to_sub_reddits = div_thing.find('a')['href'] 
       break 

      cnt = cnt + 1 
      sleep(10) 

     sleep(10)

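Edit: To everyone downvoting, I don't know what serious mistake I made by posting this question (please give feedback). If it helps, I am a 3-day-old 'Pythoner', so I am basically still learning Python. The question may be too obvious to you, but it isn't to me. It could help other noobs who, like me, are trying to learn Python. But thanks to the downvotes it will get lost somewhere.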


2015-06-09 paramvir

Simple: make fewer requests. – That1Guy

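Guys, why all the downvotes??? I am new to Python. I also know perfectly well that I should "make fewer requests". What I want to know is how to control the number of requests made to the server. – paramvir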

https://docs.python.org/2/library/time.html#time.sleep – That1Guy

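One possible reason is that reddit may be checking the User-Agent header. Since you did not add a User-Agent header, reddit flags the request as coming from a bot, and that is why you get the error. Try adding a User-Agent to the request.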
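A minimal sketch of that suggestion, assuming the `requests` library from the question. The User-Agent string and the `fetch` helper are hypothetical names chosen for illustration; `get` is passed in so the helper works with `requests.get` or any callable that accepts the same `headers` keyword.

```python
# Hypothetical value: pick something that identifies your own script.
USER_AGENT = "subreddit-lister/0.1 (by u/your_username)"


def fetch(url, get, user_agent=USER_AGENT):
    """Call get(url, headers=...) with an explicit User-Agent header.

    `get` is any callable with the same signature as requests.get.
    """
    headers = {"User-Agent": user_agent}
    return get(url, headers=headers)
```

With the question's `import requests as req`, the call would look like `fetch('http://www.reddit.com/subreddits/', req.get)`.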


2015-06-11 03:48:39 randomguy

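This is normal rate limiting on reddit's side. Your only options are to make fewer requests, or to make the requests from multiple servers with different IPs (in which case your approach scales with the number of servers).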

From Wikipedia's description of HTTP error code 429:

429 Too Many Requests (RFC 6585):

The user has sent too many requests in a given amount of time. Intended for use with rate limiting schemes.
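One way to react to a 429 in code is to wait and try again. The sketch below is an assumption about how that could look, not something reddit prescribes: `get_with_backoff`, `max_attempts` and `default_wait` are made-up names, and `get` stands for `requests.get` or any callable returning an object with `status_code` and `headers`. RFC 6585 allows a 429 response to carry a `Retry-After` header, so the sketch uses it when present and otherwise waits a little longer after each failed attempt.

```python
import time


def get_with_backoff(url, get, max_attempts=5, default_wait=10.0, sleep=time.sleep):
    """Retry while the server answers 429, pausing between attempts."""
    response = None
    for attempt in range(max_attempts):
        response = get(url)
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts: hand the last 429 back to the caller
        # Retry-After is optional; this sketch only handles the seconds form.
        retry_after = response.headers.get("Retry-After")
        try:
            wait = float(retry_after)
        except (TypeError, ValueError):
            wait = default_wait * (attempt + 1)
        sleep(wait)
    return response
```

The caller still has to check the returned status code, because the last response may itself be a 429.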


2015-06-09 15:33:23

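First, try to find out how often you are allowed to send requests, and compare that with the maximum rate at which you are actually sending them.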

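Once you have found where you are making requests too often, add something simple between each request, such as time.sleep(interval), to make sure you wait long enough between them.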

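If you want to be clever, you could write something that tracks how long it has been since your last request, or counts how many requests you have made in a recent period. You can then use that information to decide how long to sleep.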
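The "time since the last request" idea could be sketched roughly like this. `Throttle` is a hypothetical helper, not part of any library; the `clock` and `sleep` parameters exist only so the class can be exercised without really waiting.

```python
import time


class Throttle:
    """Ensure at least `interval` seconds pass between consecutive wait() calls."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval not yet elapsed.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Create one instance, for example `throttle = Throttle(2.0)`, and call `throttle.wait()` immediately before every `req.get(...)`. Time spent parsing a page then counts toward the interval, unlike a fixed `sleep(10)`.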

Edit: Actually, looking at the rules page: https://github.com/reddit/reddit/wiki/API#rules

Monitor the following response headers to ensure that you're not exceeding the limits: 
    X-Ratelimit-Used: Approximate number of requests used in this period 
    X-Ratelimit-Remaining: Approximate number of requests left to use 
    X-Ratelimit-Reset: Approximate number of seconds to end of period 
Clients connecting via OAuth2 may make up to 60 requests per minute.

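It seems that they tell you in the response how many requests you can still make, and how long you have to wait until you get more. When you have no requests left to use, sleep until the end of the minute.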
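A rough sketch of reading those headers, assuming they arrive as numeric strings. `seconds_to_sleep` is a made-up helper name; it takes any mapping, so with the question's code it could be called as `seconds_to_sleep(res.headers)`. Note that header lookup in `requests` is case-insensitive, while a plain dict needs the exact spelling used below.

```python
def seconds_to_sleep(headers):
    """Return how long to pause, based on the X-Ratelimit-* response headers."""
    try:
        remaining = float(headers["X-Ratelimit-Remaining"])
        reset = float(headers["X-Ratelimit-Reset"])
    except (KeyError, TypeError, ValueError):
        return 0.0  # headers missing or unparsable: this sketch does not wait
    if remaining < 1:
        # No requests left in this period: wait until it resets.
        return max(reset, 0.0)
    return 0.0
```

The result can be passed straight to `time.sleep()` after each response.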


2015-06-09 15:40:41 camz
