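Scraping Wikipedia with Python
I want to retrieve 3 columns (NFL team, player name, college team) from the following Wikipedia page. I'm new to Python and have been trying to use BeautifulSoup for this. I only need the rows for QBs, but I can't even get all the rows regardless of position. This is what I have so far; it outputs nothing, and I'm not entirely sure why. I believe it is due to one of the tags, but I don't know what to change. Any help would be greatly appreciated.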
import urllib2
from bs4 import BeautifulSoup  # assumed: the post does not show its imports

wiki = "http://en.wikipedia.org/wiki/2008_NFL_draft"
header = {'User-Agent': 'Mozilla/5.0'}  # Needed to prevent 403 error on Wikipedia
req = urllib2.Request(wiki, headers=header)
page = urllib2.urlopen(req)
soup = BeautifulSoup(page)
rnd = ""
pick = ""
NFL = ""
player = ""
pos = ""
college = ""
conf = ""
notes = ""
table = soup.find("table", {"class": "wikitable sortable"})
#print table
#output = open('output.csv','w')
for row in table.findAll("tr"):
    cells = row.findAll("href")
    print "---"
    print cells.text
    print "---"
    # For each "tr", assign each "td" to a variable.
    #if len(cells) > 1:
        #NFL = cells[1].find(text=True)
        #player = cells[2].find(text=True)
        #pos = cells[3].find(text=True)
        #college = cells[4].find(text=True)
        #write_to_file = player + " " + NFL + " " + college + " " + pos
        #print write_to_file
        #output.write(write_to_file)
#output.close()
I know a lot of it is commented out; that's because I was trying to find where the fault is.
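A likely cause, for what it's worth: `findAll` searches by tag name, and `href` is an attribute of `<a>` tags rather than a tag itself, so `row.findAll("href")` comes back empty. The result is also a list of tags, which has no usable `.text` of its own; the text has to be read cell by cell. Below is a minimal sketch of one way to collect the QB rows, assuming BeautifulSoup 4 and assuming the table's header row contains cells labelled "NFL team", "Player", "Pos." and "College". The function name `extract_players`, the `WANTED` tuple and `SAMPLE_HTML` are made up for illustration, and the sample table is hand-written rather than copied from the live page, whose layout may differ.

```python
from bs4 import BeautifulSoup

# Header labels (lowercased, trailing "." removed) that the sketch looks for.
# Assumption: the draft table uses these column names.
WANTED = ("nfl team", "player", "pos", "college")


def extract_players(html, position="QB"):
    """Return (NFL team, player, college) tuples for rows at the given position."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches individual CSS classes, so class="wikitable sortable" is found too.
    table = soup.find("table", class_="wikitable")
    if table is None:
        return []

    index = None  # maps a wanted column name to its position in the row
    players = []
    for row in table.find_all("tr"):
        # Search by tag name ("th"/"td"), then read each cell's text.
        texts = [cell.get_text(" ", strip=True)
                 for cell in row.find_all(["th", "td"])]
        if index is None:
            # Keep skipping rows until one that looks like the header is found.
            names = [t.lower().rstrip(".") for t in texts]
            if all(w in names for w in WANTED):
                index = {w: names.index(w) for w in WANTED}
            continue
        if len(texts) <= max(index.values()):
            continue  # short row (for example a spanning note); skip it
        if texts[index["pos"]] == position:
            players.append((texts[index["nfl team"]],
                            texts[index["player"]],
                            texts[index["college"]]))
    return players


# Hand-written sample in the assumed layout, only to exercise the function.
SAMPLE_HTML = """
<table class="wikitable sortable">
<tr><th>Rnd.</th><th>Pick No.</th><th>NFL team</th><th>Player</th><th>Pos.</th><th>College</th></tr>
<tr><td>1</td><td>1</td><td><a href="/wiki/Miami_Dolphins">Miami Dolphins</a></td><td><a href="/wiki/Jake_Long">Jake Long</a></td><td>OT</td><td>Michigan</td></tr>
<tr><td>1</td><td>3</td><td>Atlanta Falcons</td><td>Matt Ryan</td><td>QB</td><td>Boston College</td></tr>
</table>
"""

for team, player, college in extract_players(SAMPLE_HTML):
    print(team + ", " + player + ", " + college)
```

With the request code from the question, the same function could be called as `extract_players(page.read())`. Looking columns up by header name instead of fixed positions such as `cells[1]` is a design choice here: it avoids depending on how many leading cells each row has, at the cost of depending on the header labels.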