2012-04-05 26 views
0

Python CSV homework program

I have a homework assignment that involves reading a file using the csv module and functions.

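The basic idea is to calculate rusher ratings for football players over two years. We use the data from a file that was given to us. A sample file would be: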

 
name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year 
A.J.,Feeley,QB,STL,5,3,4,0,0,1.3,3,2,20.3,2011 
Aaron,Brown,RB,DET,1,1,0,0,0,0,0,0,0.9,2011 
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011 
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011 
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011 

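In other words, we have to skip the first line of the file and read the remaining lines, splitting them on commas.

To calculate the rusher rating, we need: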

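Yds is the average yardage gained per attempt. This is [total yards / (4.05 * attempts)]. If this number is greater than 2.375, then 2.375 should be used instead.

perTDs is the percentage of touchdowns per carry. This is [(39.5 * touchdowns) / attempts]. If this number is greater than 2.375, then 2.375 should be used instead.

perFumbles is the percentage of fumbles per carry. This is [2.375 - ((21.5 * fumbles) / attempts)].

The rusher rating is [Yds + perTDs + perFumbles] * (100 / 4.5).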
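As a quick sanity check of these formulas, here is a minimal sketch that applies them to the Adrian Peterson row of the sample file. It uses Python 3 syntax (the code in this thread is Python 2), and the function name `rusher_rating` is only an illustrative choice, not something the assignment prescribes.

```python
def rusher_rating(attempts, yards, touchdowns, fumbles):
    """Return the rusher rating; assumes attempts > 0."""
    # Yards and touchdown components are each capped at 2.375.
    yds = min(yards / (4.05 * attempts), 2.375)
    per_tds = min(39.5 * touchdowns / attempts, 2.375)
    # The fumble component has no cap in the assignment text.
    per_fumbles = 2.375 - 21.5 * fumbles / attempts
    return (yds + per_tds + per_fumbles) * (100 / 4.5)


# Adrian Peterson, 2011: 208 attempts, 970 yards, 12 touchdowns, 1 fumble
peterson = rusher_rating(208, 970, 12, 1)
print(round(peterson, 2))
```

For this row neither cap is reached, so all three components contribute their raw values.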

The code I have so far:

playerinfo = [] 
teaminfo10 = [] 
teaminfo11 = [] 

import csv 

file = raw_input("Enter filename: ") 
read = open(file,"rU") 
read.readline() 
fileread = csv.reader(read) 

#Each line is iterated through, and if rush attempts are greater than 10, the 
#player may be used for further statistics. 
for playerData in fileread:
    if int(playerData[5]) > 10:

        attempts = int(playerData[5])
        totalYards = int(playerData[6])
        touchdowns = int(playerData[7])
        fumbles = int(playerData[10])

        #Rusher rating for each player is found. This rating, coupled with other
        #data about the player is formatted and appended into a list of players.
        rushRating = ratingCalc(attempts,totalYards,touchdowns,fumbles)
        rusherData = rushFunc(playerData,rushRating)
        playerinfo.append(rusherData)

        #Different data about the player is formatted and added to one of two
        #lists of teams, based on year.
        teamData = teamFunc(playerData)
        if playerData[13] == '2010':
            teaminfo10.append(teamData)
        else:
            teaminfo11.append(teamData)

#The list of players is sorted in order of decreasing rusher rating. 
playerinfo.sort(reverse = True) 
#The two team lists of players are sorted by team. 
teaminfo10.sort() 
teaminfo11.sort() 

print "The following statistics are only for the years 2010 and 2011." 
print "Only those rushers who have rushed more than 10 times are included." 
print 
print "The top 50 rushers based on their rusher rating in individual years are:" 

#50 players, in order of decreasing rusher ratings, are printed along with other 
#data. 
rushPrint(playerinfo,50) 

#A similar list of running backs is created, in order of decreasing rusher 
#ratings. 
RBlist = [] 
for player in playerinfo:
    if player[2] == 'RB':
        RBlist.append(player)

print "\nThe top 20 running backs based on their rusher rating in individual \
years are:"
#The top 20 running backs on the RBlist are printed, with other data. 
rushPrint(RBlist,20) 


#The teams with the greatest overall rusher rating (if their attempts are 
#greater than 10) are listed in order of decreasing rusher rating, for both 2010 
#and 2011. 
teamListFunc(teaminfo10,'2010') 

teamListFunc(teaminfo11,'2011') 

#The player(s) with the most yardage is printed. 
yardsList = mostStat(6,fObj,False) 
print "\nThe people who rushed for the most yardage are:" 
for item in yardsList: 
    print "%s rushing for %d yards for %s in %s."\ 
    % (item[1],item[0],item[2],item[3]) 

#The player(s) with the most touchdowns is printed. 
TDlist = mostStat(7,fObj,False) 
print"\nThe people who have scored the most rushing touchdowns are:" 
for item in TDlist: 
    print "%s rushing for %d touchdowns for %s in %s."\ 
    % (item[1],item[0],item[2],item[3]) 

#The player(s) with the most yardage per rushing attempt is printed. 
ypaList = mostStat(6,fObj,True) 
print "\nThe people who have the highest yards per rushing attempt with over 10 \
rushes are:"
for item in ypaList: 
    print "%s with a %.2f yards per attempt rushing average for %s in %s."\ 
    % (item[1],item[0],item[2],item[3]) 

#The player(s) with the most fumbles is printed. 
fmblList = mostStat(10,fObj,False) 
print"\nThere are %d people with the most fumbles. They are:" % (len(fmblList)) 
for item in fmblList: 
    print "%s with %d fumbles for %s in %s." % (item[1],item[0],item[2],item[3]) 


def ratingCalc(atts,totalYrds,TDs,fmbls): 
    """Calculates rusher rating.""" 
    yrds = totalYrds/(4.05 * atts)
    if yrds > 2.375:
        yrds = 2.375

    perTDs = 39.5 * TDs/atts
    if perTDs > 2.375:
        perTDs = 2.375

    perFumbles = 2.375 - (21.5 * fmbls/atts)

    rating = (yrds + perTDs + perFumbles) * (100/4.5) 

    return rating  

def rushFunc(information,rRating): 
    """Formats player info into [rating,name,pos,team,yr,atts]""" 
    rusherInfo = [] 
    rusherInfo.append(rRating) 
    name = information[0] + ' ' + information[1] 
    rusherInfo.append(name) 
    rusherInfo.append(information[2]) 
    rusherInfo.append(information[3]) 
    rusherInfo.append(information[13]) 
    rusherInfo.append(information[5]) 

    return rusherInfo 


def teamFunc(plyrInfo): 
    """Formats player info into [team,atts,yrds,TDs,fmbls] for team sorting""" 
    teamInfo = [] 
    teamInfo.append(plyrInfo[3]) 
    teamInfo.append(plyrInfo[5]) 
    teamInfo.append(plyrInfo[6]) 
    teamInfo.append(plyrInfo[7]) 
    teamInfo.append(plyrInfo[10]) 

    return teamInfo 

def rushPrint(lst,num): 
    """Prints players and their data in order of rusher rating.""" 
    print "Name       Pos Year Attempts Rating Team" 
    #Slicing avoids an IndexError when lst has fewer than num entries.
    for index in lst[:num]:
        print "%-30s %-5s %4s %5s  %3.2f %s"\
            % (index[1],index[2],index[4],index[5],index[0],index[3])

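So, yes, there are still a lot of functions I have to define. But what do you think of the code so far? Is it inefficient? Can you tell me what is wrong with it? It looks like this code is going to be very long (around 300 lines), but the teacher said this should be a relatively short project.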

+1

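I take it your final program should come in under 300 lines? It looks pretty good so far. Also, in your first projects, clean code is better than a lot of short but hard-to-understand code. But soon enough you will pick up things like regular expressions, list comprehensions, and so on. – George 2012-04-05 21:58:44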

+0

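You might consider using [classes](http://docs.python.org/tutorial/classes.html). You could create a class called `Player` that loads the data for a single player, and then have [methods](http://en.wikipedia.org/wiki/Method_%28computer_programming%29) to calculate the various statistics you want. – 2012-04-05 22:08:58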
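A rough sketch of what such a class might look like, assuming the column order of the sample file in the question. The attribute and method names are hypothetical, and the syntax is Python 3 rather than the Python 2 used in the question.

```python
class Player:
    """One row of the stats file (hypothetical class, for illustration)."""

    def __init__(self, row):
        # `row` is a list of strings in the column order of the sample file.
        self.name = row[0] + ' ' + row[1]
        self.pos = row[2]
        self.team = row[3]
        self.attempts = int(row[5])
        self.yards = int(row[6])
        self.touchdowns = int(row[7])
        self.fumbles = int(row[10])
        self.year = row[13].strip()

    def yards_per_attempt(self):
        # Assumes attempts > 0; the question only rates players with > 10.
        return self.yards / self.attempts

    def rusher_rating(self):
        # Same formula as in the question, with both caps at 2.375.
        yds = min(self.yards / (4.05 * self.attempts), 2.375)
        per_tds = min(39.5 * self.touchdowns / self.attempts, 2.375)
        per_fumbles = 2.375 - 21.5 * self.fumbles / self.attempts
        return (yds + per_tds + per_fumbles) * (100 / 4.5)


row = "Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011".split(",")
bradshaw = Player(row)
print(bradshaw.name, bradshaw.team, round(bradshaw.yards_per_attempt(), 2))
```

With this layout the column indexes appear in exactly one place, `__init__`, instead of being scattered across several formatting functions.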

+0

http://codereview.stackexchange.com would be a better place for this – georg 2012-04-05 22:15:30

Answers

3

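Here is a piece of code that should greatly simplify the whole project.

It may take a little while to understand in the context of the task at hand, but overall it will make your life much easier once you are working with the correct data types and "associative arrays" (dictionaries).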

import csv 

reader = csv.DictReader(open('mycsv.txt', 'r')) 
#opens the csv file into a dictionary 

list_of_players = map(dict, reader) 
#puts all the dictionaries (by row) as a separate element in a list. 
#this way, its not a one-time iterator and all your info is easily accessible 

for i in list_of_players:
    #make all the intended integers integers instead of strings
    for stat in ['rush','ryds','rtd','fum','fuml','year']:
        i[stat] = int(i[stat])
    #make all the intended floats floats instead of strings
    for stat in ['fpts','ravg','rtdr']:
        i[stat] = float(i[stat])

for i in list_of_players: 
    print i['name'], i[' '], i['fpts'] 
    #now you can easily access and loop through your players with meaningful names 
    #using 'fpts' rather than predetermined numbers [5] 

This sample code shows how easy it is to work with the players' names and their statistics, i.e. first name, last name, and fpts:

>>> 
A.J. Feeley 20.3 
Aaron Brown 0.9 
Aaron Rodgers 403.4 
Adrian Peterson 188.9 
Ahmad Bradshaw 156.6 

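Some tweaking will of course be needed to get all the requested statistics (maximums, and so on), but keeping your datatypes correct from the start makes the task far less verbose.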
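For example, a "who has the most of stat X" helper becomes short once every row is a dictionary. The sketch below is only one possible way to do it: `load_players` and `most_stat` are hypothetical names (`most_stat` loosely mirrors the `mostStat` call in the question, whose real signature is not shown), it reads the sample rows from an in-memory string instead of a file, and it uses Python 3 syntax.

```python
import csv
import io

SAMPLE = """name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year
A.J.,Feeley,QB,STL,5,3,4,0,0,1.3,3,2,20.3,2011
Aaron,Brown,RB,DET,1,1,0,0,0,0,0,0,0.9,2011
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011
"""


def load_players(stream):
    """Read rows as dictionaries and convert the counting stats to int."""
    players = []
    for row in csv.DictReader(stream):
        for stat in ('rush', 'ryds', 'rtd', 'fum'):
            row[stat] = int(row[stat])
        players.append(row)
    return players


def most_stat(players, stat):
    """Return every player tied for the highest value of `stat`.

    Raises ValueError if `players` is empty.
    """
    best = max(p[stat] for p in players)
    return [p for p in players if p[stat] == best]


players = load_players(io.StringIO(SAMPLE))
for p in most_stat(players, 'ryds'):
    # The sample header names the last-name column ' ' (a single space).
    print(p['name'], p[' '], p['ryds'])
```

Returning a list rather than a single player keeps ties, which is what the "player(s) with the most ..." output in the question needs.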

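Using these structures, the assignment can now be completed in far fewer than 300 lines, and the more you use Python, the more of its conventional idioms you will learn. lambda and sorted() are both examples of features you will come to love!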
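To give a flavour of those two, here is a small sketch that ranks players by yards per attempt and then picks out the running backs. The rows are a hand-copied subset of the sample file, the variable names are made up for illustration, and the syntax is Python 3.

```python
players = [
    {'name': 'A.J. Feeley', 'pos': 'QB', 'rush': 3, 'ryds': 4},
    {'name': 'Aaron Rodgers', 'pos': 'QB', 'rush': 60, 'ryds': 257},
    {'name': 'Adrian Peterson', 'pos': 'RB', 'rush': 208, 'ryds': 970},
    {'name': 'Ahmad Bradshaw', 'pos': 'RB', 'rush': 171, 'ryds': 659},
]

# Only rushers with more than 10 attempts qualify.
qualified = [p for p in players if p['rush'] > 10]

# sorted() returns a new list; the lambda computes the sort key per player.
by_ypa = sorted(qualified, key=lambda p: p['ryds'] / p['rush'], reverse=True)

# The running-back list is a filter over the already sorted list.
top_rbs = [p for p in by_ypa if p['pos'] == 'RB'][:20]

for p in by_ypa:
    print(p['name'], round(p['ryds'] / p['rush'], 2))
```

Sorting with an explicit `key` means the rating or average does not have to be stored as the first element of each record just to make `list.sort()` work, and slicing with `[:20]` is safe even when fewer than 20 players qualify.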

+0

Required reading: [Code Like a Pythonista: Idiomatic Python.](http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html) – 2012-05-19 18:03:25