2011-01-08 29 views

Answers

16

You are looking for NumPy (matrix manipulation and number crunching) and SciPy (optimization). To get started, see https://stackoverflow.com/questions/4375094/numpy-learning-resources

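I worked through the given example as follows:

  • I opened the sample Excel file in OpenOffice
  • I copied the team data (without headers) to a new sheet and saved it as teams.csv
  • I copied the game data (without headers) to a new sheet and saved it as games.csv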

Then, in Python:

import csv 
import numpy 
import scipy.optimize 

def readCsvFile(fname):
    # newline='' is what the csv module expects for files opened in text mode
    with open(fname, 'r', newline='') as inf:
        return list(csv.reader(inf))

# Get team data 
team = readCsvFile('teams.csv') # list of num,name 
numTeams = len(team) 

# Get game data 
game = readCsvFile('games.csv') # list of game,home,away,homescore,awayscore 
numGames = len(game) 

# Now, we have the NFL teams for 2002 and data on all games played. 
# From this, we wish to forecast the score of future games. 
# We are going to assume that each team has an inherent performance-factor, 
# and that there is a bonus for home-field advantage; then the 
# relative final score between a home team and an away team can be 
# calculated as (home advantage) + (home team factor) - (away team factor) 

# First we create a matrix M which will hold the data on 
# who played whom in each game and who had home-field advantage. 
m_rows = numTeams + 1 
m_cols = numGames 
M = numpy.zeros((m_rows, m_cols)) 

# Then we create a vector S which will hold the final 
# relative scores for each game. 
s_cols = numGames 
S = numpy.zeros(s_cols) 

# Loading M and S with game data 
for col,gamedata in enumerate(game): 
    gameNum,home,away,homescore,awayscore = gamedata 
    # In the csv data, teams are numbered starting at 1 
    # So we let home-team advantage be 'team 0' in our matrix 
    M[0, col] = 1.0           # home team advantage
    M[int(home), col] = 1.0
    M[int(away), col] = -1.0
    S[col] = int(homescore) - int(awayscore)


# Now, if our theoretical model is correct, we should be able
# to find a performance-factor vector W such that W.dot(M) == S
#
# In the real world, we will never find a perfect match,
# so what we are looking for instead is the W which results in S'
# such that the sum of squared differences between S and S'
# is minimized.

# Initial guess at team weightings: 
# 2.0 points home-team advantage, and all teams equally strong 
init_W = numpy.array([2.0]+[0.0]*numTeams) 

def errorfn(w,m,s): 
    return w.dot(m) - s 

# leastsq returns (solution, status flag); only the solution is needed here
W, _ = scipy.optimize.leastsq(errorfn, init_W, args=(M,S))

homeAdvantage = W[0]   # 2.2460937500005356
teamStrength = W[1:]   # numpy.array([-151.31111318, -136.36319652, ... ])

# Team strengths have meaning only by linear comparison; 
# we can add or subtract any constant to all of them without 
# changing the meaning. 
# To make them easier to understand, we want to shift them 
# such that the average is 0.0 
teamStrength -= teamStrength.mean() 

for t,s in zip(team,teamStrength):
    print("{0:>10}: {1: .7}".format(t[1],s))

which results in

       Ari: -9.8897569
       Atl:  5.0581597
      Balt: -2.1178819
      Buff: -0.27413194
  Carolina: -3.2720486
      Chic: -5.2654514
      Cinn: -10.503646
      Clev:  1.2338542
      Dall: -8.4779514
       Den:  4.8901042
       Det: -9.1727431
        GB:  3.5800347
      Hous: -9.4390625
      Indy:  1.1689236
      Jack: -0.2015625
        KC:  6.1112847
     Miami:  6.0588542
      Minn: -3.0092014
        NE:  4.0262153
        NO:  2.4251736
       NYG:  0.82725694
       NYJ:  3.1689236
       Oak:  10.635243
      Phil:  8.2987847
      Pitt:  2.6994792
 St. Louis: -3.3352431
 San Diego: -0.72065972
        SF:  0.63524306
   Seattle: -1.2512153
     Tampa:  8.8019097
      Tenn:  1.7640625
      Wash: -4.4529514

which is the same result shown in the spreadsheet.
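Because the model is linear in the weights, the fit can probably also be cross-checked with a direct linear least-squares solve. The following is only a minimal sketch, not part of the original answer: it assumes `numpy.linalg.lstsq` as a swapped-in solver, and the `games` list, `num_teams` and `predict_margin` are hypothetical stand-ins for the real CSV data. It also shows how the fitted weights turn into a forecast margin for a future game.

```python
import numpy as np

# Hypothetical toy data (NOT the NFL data from the spreadsheet):
# (home, away, home_score, away_score), teams numbered from 1
games = [
    (1, 2, 24, 17),
    (2, 3, 20, 21),
    (3, 1, 10, 27),
    (2, 1, 14, 20),
    (1, 3, 31, 13),
    (3, 2, 17, 16),
]
num_teams = 3

# Same layout as M and S in the answer: row 0 is the home advantage,
# rows 1..num_teams are the teams, one column per game
M = np.zeros((num_teams + 1, len(games)))
S = np.zeros(len(games))
for col, (home, away, home_score, away_score) in enumerate(games):
    M[0, col] = 1.0
    M[home, col] = 1.0
    M[away, col] = -1.0
    S[col] = home_score - away_score

# Solve M.T @ w ~= S in the least-squares sense. The system is rank
# deficient (strengths are only defined up to a constant), so lstsq
# returns the minimum-norm solution.
w, _, rank, _ = np.linalg.lstsq(M.T, S, rcond=None)
home_advantage = w[0]
strength = w[1:] - w[1:].mean()   # shift so the average strength is 0.0

def predict_margin(home, away):
    """Forecast (home score - away score) for a future game."""
    return home_advantage + strength[home - 1] - strength[away - 1]

print("home advantage:", home_advantage)
print("forecast margin, team 1 hosting team 2:", predict_margin(1, 2))
```

Shifting the strengths by their mean does not change any forecast, since only the difference between two teams enters `predict_margin`.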

+0

Thanks. This is what I was looking for. – haha 2011-01-09 12:30:32

3

This page lists some solver libraries for Python that you can use:

0

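You might want to consider Pyspread, a spreadsheet application written entirely in Python. Individual cells can hold Python expressions and can access all Python modules.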

0

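PuLP is a linear programming modeler for Python. It can do everything that the Excel Solver can do.

PuLP is free, open-source software written in Python. It is used to describe optimisation problems as mathematical models. PuLP can then call any of numerous external LP solvers (CBC, GLPK, CPLEX, Gurobi, etc.) to solve the model, and then use Python commands to manipulate and display the solution.

There is a detailed introduction about PuLP and a manual on how to model optimization problems with PuLP in Python.

Modeling example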

# Import PuLP modeler functions 
from pulp import LpMinimize, LpProblem, LpStatus, LpVariable

# Create the 'prob' variable to contain the problem data 
prob = LpProblem("Example_Problem", LpMinimize) 

# Declare decision variables 
var_x = LpVariable(name="x", lowBound=0, cat="Continuous") 
var_y = LpVariable(name="y", cat="Integer") 

# The objective function is added to 'prob' first 
prob += var_x + 2 * var_y 

# The constraints are added to 'prob' 
prob += var_x == (-1) * var_y 
prob += var_x <= 15 
prob += var_x >= 0  # PuLP has no strict inequalities, so use >= rather than >

# The problem is solved using PuLP's choice of Solver 
prob.solve() 

# The status of the solution is printed to the screen 
print("Status:", LpStatus[prob.status]) 

# Each of the variables is printed with its resolved optimum value
for v in prob.variables(): 
    print(v.name, "=", v.varValue)
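To choose one of the solvers mentioned above explicitly instead of relying on PuLP's default, a solver object can be passed to `solve()`. The sketch below is an illustration rather than part of the original answer: it assumes the CBC solver that normally ships with PuLP is available, rebuilds the same small model under the shorter names `x` and `y`, and reads the objective value back with `value()`.

```python
from pulp import (LpMinimize, LpProblem, LpStatus, LpVariable,
                  PULP_CBC_CMD, value)

# Same model as in the modeling example, rebuilt so the block is self-contained
prob = LpProblem("Example_Problem", LpMinimize)
x = LpVariable(name="x", lowBound=0, cat="Continuous")
y = LpVariable(name="y", cat="Integer")

prob += x + 2 * y      # objective function
prob += x == -1 * y    # constraints
prob += x <= 15

# Assumption: the bundled CBC binary is installed; msg=False silences its log
prob.solve(PULP_CBC_CMD(msg=False))

status = LpStatus[prob.status]
objective = value(prob.objective)
print("Status:", status)
print("x =", x.varValue, "y =", y.varValue, "objective =", objective)
```

Other solver classes can be substituted in the same place, provided the corresponding solver is installed on the machine.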