Python: too many requests

I've written a Python program that parses the subreddits page and builds a list of them. The problem is that whenever I run it, the reddit server keeps responding with error 429, 'too many requests'.

How can I bring down the number of requests so that I don't get rate-limited?
from bs4 import BeautifulSoup as bs
from time import sleep
import requests as req

html = req.get('http://www.reddit.com/')
print html
soup = bs(html.text)

# http://www.reddit.com/subreddits/
link_to_sub_reddits = soup.find('a', id='sr-more-link')['href']
print link_to_sub_reddits

L = []
for navigate_the_pages in xrange(1):
    res = req.get(link_to_sub_reddits)
    soup = bs(res.text)
    # soup created
    print soup.text

    div = soup.body.find('div', class_=lambda class_: class_ and class_ == 'content')
    div = div.find('div', id=lambda id: id and id == 'siteTable')

    cnt = 0
    for iterator in div:
        div_thing = div.contents[cnt]
        if not div_thing == '' and div_thing.name == 'div' and 'thing' in div_thing['class']:
            div_entry = div_thing.find('a', class_=lambda class_: class_ and 'entry' in class_)
            # div with class='entry......'
            link = div_entry.find('a')['href']
            # link of the subreddit
            name_of_sub = link.split('/')[-2]
            # http://www.reddit.com/subreddits/
            # ['http:', '', 'www.reddit.com', 'subreddits', '']
            description = div_entry.find('strong').text
            # something about the community
            p_tagline = div_entry.find('p', class_='tagline')
            subscribers = p_tagline.find('span', class_='number').text
            L.append((name_of_sub, link, description, subscribers))
        elif not div_thing == '' and div_thing.name == 'div' and 'nav-buttons' in div_thing['class']:
            # case when we hit the 'nav' buttons: follow the link to the next page
            link_to_sub_reddits = div_thing.find('a')['href']
            break
        cnt = cnt + 1
        sleep(10)
    sleep(10)
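For reference, here is a minimal sketch of one way to space the requests out, assuming the 429 simply means the requests arrive too quickly and that reddit is stricter with the default python-requests User-Agent. The MIN_DELAY value and the User-Agent string below are illustrative assumptions, not anything taken from the question:

import time
import requests as req

MIN_DELAY = 2.0  # assumed minimum gap between requests, in seconds
HEADERS = {'User-Agent': 'subreddit-lister/0.1 (learning project)'}  # assumed descriptive User-Agent

_last_request_time = [0.0]  # mutable holder so the helper works without a global statement

def polite_get(url):
    # Wait until at least MIN_DELAY seconds have passed since the previous call.
    wait = MIN_DELAY - (time.time() - _last_request_time[0])
    if wait > 0:
        time.sleep(wait)
    _last_request_time[0] = time.time()
    return req.get(url, headers=HEADERS)

Every req.get(...) in the script above could then be replaced with polite_get(...), and the hard-coded sleep(10) calls dropped.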
EDIT: To everyone downvoting, I don't know what serious mistake I made by posting this question (please give feedback). If it helps, I'm a three-day-old 'Pythoner', so I'm basically just learning Python. Maybe the question is too obvious for you, but it isn't for me, and it might help other noobs trying to learn Python like me. But thanks to the downvotes it will just get buried somewhere.
Simple: make fewer requests. – That1Guy
Guys, why all the downvotes??? I'm new to Python, and I'm well aware I should "make fewer requests". What I want to know is how to control the number of requests made to the server. – paramvir
https://docs.python.org/2/library/time.html#time.sleep – That1Guy
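Following up on the time.sleep link above, a rough sketch of retrying with a pause when a 429 does come back. The Retry-After header is only used if the server sends it (interpreted here as a number of seconds); the initial 5-second delay and the doubling are assumptions:

import time
import requests as req

def get_with_backoff(url, max_tries=5):
    delay = 5  # assumed initial pause, in seconds
    for attempt in range(max_tries):
        resp = req.get(url)
        if resp.status_code != 429:
            return resp
        # Honour Retry-After if the server sends it, otherwise back off exponentially.
        retry_after = resp.headers.get('Retry-After')
        time.sleep(float(retry_after) if retry_after else delay)
        delay *= 2
    return resp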