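I'm currently trying to scrape the ATP (tennis association) website and have hit a problem I can't solve. Does Beautiful Soup fail to process more than 2700 lines of source code?
When I try to scrape anything located after line 2700 of the page source, I get an error.
Is there a way to solve this?
Here is my code (it works perfectly for earlier lines):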
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
from urllib2 import urlopen
import sys

BASE_URL = "http://www.atpworldtour.com/Share/Event-Draws.aspx?e=540&y=2012"

def make_soup(url):
    html = urlopen(url).read()
    return BeautifulSoup(html, "lxml")

def get_player_name_third_round_winner(section_url):
    soup = make_soup(section_url)
    colonne4 = soup.find("td", "col_4")
    playerWrap = colonne4.findAll("div", "playerWrap")
    for name in playerWrap:
        print name.find("a").string

def get_player_score_third_round_winner(section_url):
    soup = make_soup(section_url)
    colonne4 = soup.find("td", "col_4")
    scores = colonne4.findAll("div", "scores")
    for score in scores:
        print score.find("a").string

get_player_name_third_round_winner(BASE_URL)
get_player_score_third_round_winner(BASE_URL)
Here is the error that is displayed:
Traceback (most recent call last):
  File "/Users/Me/Desktop/ATP/atp_col4", line 27, in <module>
    get_player_name_third_round_winner(BASE_URL)
  File "/Users/Me/Desktop/ATP/atp_col4", line 16, in get_player_name_third_round_winner
    playerWrap = colonne4.findAll("div", "playerWrap")
AttributeError: 'NoneType' object has no attribute 'findAll'
[Finished in 1.6s with exit code 1]
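Your code works fine. http://asciinema.org/a/7539 – falsetru
I just tried it again and I still get the same error. I don't understand. – mandok
I tried your code several times: sometimes it works, and sometimes it raises the exception you posted. So the `BASE_URL` page seems to return slightly different results on each request. –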
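If the page really does vary between requests, then `soup.find(...)` returning `None` is the direct cause of the `AttributeError`, not a line limit in Beautiful Soup. Here is a rough sketch of one way to guard against that: check for `None` and retry the fetch a few times. The names `extract_names`, `names_with_retry`, and the `fetch` callable are made up for illustration and are not part of the original script. It also uses the built-in `html.parser` instead of `lxml`, and Python 3 syntax, purely to keep the sketch self-contained.

```python
from bs4 import BeautifulSoup


def extract_names(html):
    """Return player names from the td.col_4 cell, or None if the cell is absent."""
    soup = BeautifulSoup(html, "html.parser")
    column = soup.find("td", class_="col_4")
    if column is None:
        # This is the situation that raised AttributeError in the question:
        # find() returns None when no matching tag exists in this response.
        return None
    names = []
    for wrap in column.find_all("div", class_="playerWrap"):
        link = wrap.find("a")
        if link is not None:
            names.append(link.get_text(strip=True))
    return names


def names_with_retry(fetch, attempts=3):
    """Call fetch() up to `attempts` times until the expected cell shows up.

    `fetch` is a hypothetical zero-argument callable that returns the page HTML.
    """
    for _ in range(attempts):
        names = extract_names(fetch())
        if names is not None:
            return names
    raise RuntimeError("td.col_4 not found after %d attempts" % attempts)
```

With the script from the question, `fetch` could be something like `lambda: urlopen(BASE_URL).read()`. Retrying only hides the inconsistency, so it is probably also worth saving one of the failing responses to a file to see what the server actually sent.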