2016-04-20 69 views
0

Why do I only get stats for the last player in PLAYER_NAME? Loop in a Python script only keeps the last result

I want to get the stats for all of the players in PLAYER_NAME.

import csv 
import requests 
from bs4 import BeautifulSoup 
import urllib 

PLAYER_NAME = ["andy-murray/mc10", "rafael-nadal/n409"] 
URL_PATTERN = 'http://www.atpworldtour.com/en/players/{}/player-stats?year=0&surfaceType=clay' 
for item in zip (PLAYER_NAME): 
    url = URL_PATTERN.format(item) 

    response = requests.get(url) 
    html = response.content 
    soup = BeautifulSoup(html) 
    table = soup.find('div', attrs={'class': 'mega-table-wrapper'}) 

    list_of_rows = []
    for row in table.findAll('tr'):
        list_of_cells = []
        for cell in row.findAll('td'):
            text = (cell.text.encode("utf-8").strip())
            list_of_cells.append(text)
        list_of_rows.append(list_of_cells)


outfile = open("./tennis.csv", "wb") 
writer = csv.writer(outfile) 
writer.writerow(["Name", "Stat"]) 
writer.writerows(list_of_rows) 
+5

You are re-creating 'list_of_rows' on every iteration over 'PLAYER_NAME'. – CTKlein

+0

How can I fix this? – Depekker

+0

Move the list_of_rows definition outside the loop – georgealton

Answers

2

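As mentioned in the comments, you are re-creating list_of_rows on every iteration, so only the last player's rows survive. To fix this, move its definition outside the for loop so that every player's rows end up in the same list: append each row as before, or extend the list with all of a player's rows at once.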
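To make the difference concrete without touching the network, here is a minimal sketch. The FAKE_TABLES dictionary and the two collect_* functions are hypothetical stand-ins for the scraped tables, not part of the original script:

```python
# Hypothetical stand-in for the scraped tables: player -> rows of [name, stat].
FAKE_TABLES = {
    "player-a": [["Aces", "10"], ["Double Faults", "2"]],
    "player-b": [["Aces", "7"], ["Double Faults", "5"]],
}


def collect_buggy(players):
    """Re-creates the list on every iteration, so only the last player survives."""
    for player in players:
        list_of_rows = []  # reset each time through the loop
        for row in FAKE_TABLES[player]:
            list_of_rows.append(row)
    return list_of_rows


def collect_fixed(players):
    """Creates the list once, so every player's rows accumulate."""
    list_of_rows = []  # created once, before the loop
    for player in players:
        for row in FAKE_TABLES[player]:
            list_of_rows.append(row)  # one entry per table row
    return list_of_rows


players = ["player-a", "player-b"]
print(len(collect_buggy(players)))  # 2
print(len(collect_fixed(players)))  # 4
```

The only difference between the two functions is where list_of_rows is created.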

On a side note, your code has a few other issues:

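  • zip is redundant here: it wraps each name in a one-element tuple, which produces an incorrectly formatted URL. You just want to iterate over PLAYER_NAME directly. While you're at it, consider renaming it to PLAYER_NAMES (since it's a list of names).
  • The empty braces {} in the URL pattern work on Python 2.7 and 3.1+; only Python 2.6 needs an explicit positional index for format, in this case {0}. Writing {0} is harmless and works on all of these versions.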
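Both points can be checked in isolation. The sketch below uses a shortened, hypothetical PATTERN instead of the full URL so the output stays readable:

```python
PLAYER_NAMES = ["andy-murray/mc10", "rafael-nadal/n409"]
PATTERN = "players/{}/player-stats"  # shortened, hypothetical pattern

# zip() over a single list wraps every element in a 1-tuple.
zipped = list(zip(PLAYER_NAMES))
print(zipped[0])       # ('andy-murray/mc10',)

# Formatting the tuple inserts its repr, not the bare name.
bad_url = PATTERN.format(zipped[0])
print(bad_url)         # players/('andy-murray/mc10',)/player-stats

# Iterating over the list directly yields plain strings.
good_urls = [PATTERN.format(name) for name in PLAYER_NAMES]
print(good_urls[0])    # players/andy-murray/mc10/player-stats

# "{}" and "{0}" give the same result on Python 2.7 and 3.1+.
same = "{}".format("x") == "{0}".format("x")
print(same)            # True
```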


PLAYER_NAMES = ["andy-murray/mc10", "rafael-nadal/n409"] 
URL_PATTERN = 'http://www.atpworldtour.com/en/players/{0}/player-stats?year=0&surfaceType=clay' 
list_of_rows = [] 
for item in PLAYER_NAMES: 
    url = URL_PATTERN.format(item) 

    response = requests.get(url) 
    html = response.content 
    soup = BeautifulSoup(html) 
    table = soup.find('div', attrs={'class': 'mega-table-wrapper'}) 

    # for row in table.findAll('tr'):
    #     list_of_cells = []
    #     for cell in row.findAll('td'):
    #         text = (cell.text.encode("utf-8").strip())
    #         list_of_cells.append(text)
    #     list_of_rows.append(list_of_cells)  # One list of cells per table row

    # Incidentally, the for loop above could also be written as:
    list_of_rows += [
        [cell.text.encode("utf-8").strip() for cell in row.findAll('td')]
        for row in table.findAll('tr')
    ]
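The snippet above stops before the file is written. As a rough sketch of that last step on Python 3 (an assumption; the question's "wb" mode suggests Python 2), the rows can be written once, after the loop has finished. The write_stats helper and sample_rows are hypothetical, and the cells are assumed to be text, so on Python 3 the .encode("utf-8") call would be dropped:

```python
import csv
import io


def write_stats(rows, outfile):
    """Write a header plus the accumulated [name, stat] rows as CSV."""
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Stat"])
    writer.writerows(rows)


# Hypothetical rows standing in for the scraped list_of_rows (text, not bytes).
sample_rows = [["Aces", "10"], ["Double Faults", "2"]]

# In-memory buffer for demonstration; for a real file on Python 3 use
# open("tennis.csv", "w", newline="") instead of the Python 2 mode "wb".
buffer = io.StringIO()
write_stats(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing after the loop, rather than inside it, keeps the header from being repeated for each player.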
+0

Could you please give some feedback and suggestions? – Depekker

+0

@Depekker see the updated answer – Bahrom
