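I get an indentation error when I try to run the code below. I am trying to recursively print the URLs of a set of HTML pages. I need help identifying the indentation error in this Python code.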
import urllib2
from BeautifulSoup import *
from urlparse import urljoin
# Create a list of words to ignore
ignorewords=set(['the','of','to','and','a','in','is','it'])
def crawl(self,pages,depth=2):
for i in range(depth):
newpages=set()
for page in pages:
try:
c=urllib2.urlopen(page)
except:
print "Could not open %s" % page
continue
soup=BeautifulSoup(c.read())
self.addtoindex(page,soup)
links=soup('a')
for link in links:
if ('href' in dict(link.attrs)):
url=urljoin(page,link['href'])
if url.find("'")!=-1: continue
url=url.split('#')[0] # remove location portion
if url[0:4]=='http' and not self.isindexed(url):
newpages.add(url)
linkText=self.gettextonly(link)
self.addlinkref(page,url,linkText)
self.dbcommit()
pages=newpages
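Where did you copy this code from? –
Your code is not indented correctly (it looks like a copy/paste to me). You should read http://docs.python.org/release/2.5.1/ref/indentation.html to learn about proper indentation in Python. – Amyth
Try indenting the code, bro. It will save us all a lot of headaches. – kotAPI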
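For reference, here is a rough sketch of what a consistently indented version of the crawl loop might look like. It is not the asker's original code: it assumes Python 3 and the standard library only (`html.parser` instead of BeautifulSoup), drops the `self.addtoindex`, `self.isindexed`, `self.addlinkref` and `self.dbcommit` calls because their definitions are not shown, and introduces the hypothetical names `LinkParser`, `fetch_html` and the `fetch` parameter purely for illustration. The thing to look at is the nesting: each block under `def`, `for`, `try`, `except` and `if` sits exactly four spaces deeper than the line that opens it, with no tabs mixed in.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def fetch_html(url):
    """Default fetcher (assumption): download a page and decode it as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(pages, depth=2, fetch=fetch_html):
    """Follow links breadth-first for `depth` rounds and return every URL seen."""
    seen = set(pages)
    for _ in range(depth):
        newpages = set()
        for page in pages:
            try:
                html = fetch(page)
            except Exception:
                print("Could not open %s" % page)
                continue
            parser = LinkParser()
            parser.feed(html)
            for href in parser.hrefs:
                url = urljoin(page, href)
                if "'" in url:
                    continue
                url = url.split("#")[0]  # remove location portion
                if url.startswith("http") and url not in seen:
                    seen.add(url)
                    newpages.add(url)
        pages = newpages
    return seen
```

Something like `for url in sorted(crawl(["http://example.com/"])): print(url)` would then print the collected URLs. Passing your own `fetch` callable lets you try the loop without touching the network.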