
I want to implement this algorithm to find the intercept and slope for simple linear regression on a single variable in Python:

ALGORITHM OF THE LINEAR REGRESSION
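
(The algorithm itself was posted as an image, which is not reproduced here. Reconstructing it from the code below, which is my reading rather than the original image: the algorithm minimizes the residual sum of squares

    RSS(w_0, w_1) = \sum_i (y_i - (w_0 + w_1 x_i))^2

by repeatedly applying the updates, with learning rate \eta:

    w_0 \leftarrow w_0 + 2\eta \sum_i (y_i - (w_0 + w_1 x_i))
    w_1 \leftarrow w_1 + 2\eta \sum_i (y_i - (w_0 + w_1 x_i)) x_i )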

Here is my Python code for updating the intercept and slope, but it does not converge. The RSS grows with each iteration instead of decreasing, and after some iterations it becomes infinite. I cannot find any error in my implementation of the algorithm. How can I solve this problem? I have also attached the CSV file. Here is the code.

import pandas as pd 
import numpy as np 

#Defining gradient_decend 
#This Function takes X value, Y value and vector of w0(intercept),w1(slope) 
#INPUT FEATURES=X(sq.feet of house size) 
#TARGET VALUE=Y (Price of House) 
#W=np.array([w0,w1]).reshape(2,1) 
#W=[w0, 
# w1] 

def gradient_decend(X,Y,W): 
    intercept=W[0][0] 
    slope=W[1][0] 

    #This returns the two gradient components as a list:
    #gd=[sum(y-(intercept+slope*x)), 
    #    sum((y-(intercept+slope*x))*x)] 
    gd=[sum(y-(intercept+slope*x) for x,y in zip(X,Y)), 
     sum(((y-(intercept+slope*x))*x) for x,y in zip(X,Y))] 
    return np.array(gd).reshape(2,1) 

#Defining Residual sum of squares 
def RSS(X,Y,W): 
    return sum((y-(W[0][0]+W[1][0]*x))**2 for x,y in zip(X,Y)) 


#Reading Training Data 
training_data=pd.read_csv("kc_house_train_data.csv") 

#Defining fixed parameters 
#Learning Rate 
n=0.0001 
iteration=1500 
#Intercept 
w0=0 
#Slope 
w1=0 

#Creating 2,1 vector of w0,w1 parameters 
W=np.array([w0,w1]).reshape(2,1) 

#Running gradient descent 
for i in range(iteration): 
    W=W+((2*n)* (gradient_decend(training_data["sqft_living"],training_data["price"],W))) 
    print(RSS(training_data["sqft_living"],training_data["price"],W)) 

Here is the CSV file.


Looks like the first-rate machine learning class from the University of Washington; I took it too, and it was very interesting and enlightening. I suggest you use the forums on Coursera, where you can get good answers from mentors, volunteers and fellow students. https://www.coursera.org/learn/ml-regression/discussions – alvas

Answers


I have solved my problem myself!

Here is the solution.

import numpy as np 
import pandas as pd 
import math 
from sys import stdout 

#function Takes the pandas dataframe, Input features list and the target column name 
def get_numpy_data(data, features, output): 

    #Adding a constant column with value 1 in the dataframe. 
    data['constant'] = 1  
    #Adding the name of the constant column in the feature list. 
    features = ['constant'] + features 
    #Creating Feature matrix(Selecting columns and converting to matrix). 
    features_matrix = data[features].to_numpy()  # .as_matrix() in older pandas, removed in pandas 1.0
    #Target column is converted to the numpy array 
    output_array=np.array(data[output]) 
    return(features_matrix, output_array) 

def predict_outcome(feature_matrix, weights): 
    weights=np.array(weights) 
    predictions = np.dot(feature_matrix, weights) 
    return predictions 

def errors(output, predictions): 
    # Error vector: predictions minus observed outputs
    return predictions - output 

def feature_derivative(errors, feature): 
    # Partial derivative of RSS w.r.t. one weight: 2 * (feature column . error vector)
    return 2 * np.dot(feature, errors) 


def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance): 
    converged = False 
    # Initial weights are converted to a numpy array 
    weights = np.array(initial_weights) 
    while not converged: 
        # compute the predictions based on feature_matrix and weights: 
        predictions = predict_outcome(feature_matrix, weights) 
        # compute the errors as predictions - output: 
        error = errors(output, predictions) 
        gradient_sum_squares = 0  # initialize the gradient magnitude accumulator 
        # update each weight individually: 
        for i in range(len(weights)): 
            # feature_matrix[:, i] is the feature column associated with weights[i] 
            feature = feature_matrix[:, i] 
            # compute the derivative for weights[i]: 
            deriv = feature_derivative(error, feature) 
            # add the squared derivative to the gradient magnitude 
            gradient_sum_squares += deriv ** 2 
            # update the weight based on step size and derivative: 
            weights[i] -= step_size * deriv 

        gradient_magnitude = math.sqrt(gradient_sum_squares) 
        stdout.write("\r%d" % int(gradient_magnitude)) 
        stdout.flush() 
        if gradient_magnitude < tolerance: 
            converged = True 
    return weights 


#Example of Implementation 
#Importing Training and Testing Data 
# train_data=pd.read_csv("kc_house_train_data.csv") 
# test_data=pd.read_csv("kc_house_test_data.csv") 

# simple_features = ['sqft_living', 'sqft_living15'] 
# my_output= 'price' 
# (simple_feature_matrix, output) = get_numpy_data(train_data, simple_features, my_output) 
# initial_weights = np.array([-100000., 1., 1.]) 
# step_size = 7e-12 
# tolerance = 2.5e7 
# simple_weights = regression_gradient_descent(simple_feature_matrix, output,initial_weights, step_size,tolerance) 
# print(simple_weights) 

First, I find that when writing machine-learning code, it is better NOT to use complex list comprehensions, because anything you can iterate over

  • is easier to read when written as a normal loop with indentation, and/or
  • can be done with numpy broadcasting (see the sketch after this section)

And using proper variable names can help you understand the code better. Using X, Y and W as shorthand is fine only if you are good at math. Personally, I don't use them in my code, especially when writing in Python. From import this: explicit is better than implicit.

My rule of thumb is to remember that if I write code I cannot read one week later, it is bad code.
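
As an illustration of the broadcasting point above, here is a minimal sketch (my addition, with made-up example values) of the same prediction-and-error computation written once as an explicit loop and once with numpy:

    import numpy as np 

    feature_matrix = np.array([[1., 1000.], [1., 2000.], [1., 3000.]])  # hypothetical N x D matrix 
    weights = np.array([-100., 0.5])                                    # hypothetical D-vector 
    output = np.array([400., 900., 1400.])                              # hypothetical N-vector 

    # Explicit loop: one dot product per row, readable but slow in pure Python 
    predictions_loop = [] 
    for row in feature_matrix: 
        predictions_loop.append(sum(f * w for f, w in zip(row, weights))) 

    # Vectorized: one matrix-vector product computes all rows at once 
    predictions = np.dot(feature_matrix, weights) 
    errors = predictions - output  # elementwise subtraction via broadcasting, no loop 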


First, let's decide what the input parameters for gradient descent are. You will need:

  • feature_matrix (the X matrix; type: numpy.array, an N*D matrix, where N is the number of rows/data points and D is the number of columns/features)
  • output (the Y vector; type: numpy.array, a vector of size N)
  • initial_weights (type: numpy.array, a vector of size D)

Additionally, to check for convergence you will need:

  • step_size (the magnitude of change to the weights on each iteration; type: float, usually a small number)
  • tolerance (the criterion for breaking the iterations: when the gradient magnitude is smaller than the tolerance, assume your weights have converged; type: float, usually a small number but much bigger than the step size; the self-answer above, for example, uses step_size = 7e-12 with tolerance = 2.5e7 on the unscaled square-footage features)

Now for the code.

def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance): 
    converged = False  # set a boolean to check for convergence 
    weights = np.array(initial_weights)  # make sure it's a numpy array 

    while not converged: 
        # compute the predictions based on feature_matrix and weights. 
        # iterate through the rows and find the single scalar predicted 
        # value for each weight * column. 
        # hint: a dot product can solve this easily 
        predictions = [??? for row in feature_matrix] 
        # compute the errors as predictions - output 
        errors = predictions - output 
        gradient_sum_squares = 0  # initialize the gradient sum of squares 
        # while we haven't reached the tolerance yet, update each feature's weight 
        for i in range(len(weights)):  # loop over each weight 
            # recall that feature_matrix[:, i] is the feature column associated with weights[i] 
            # compute the derivative for weights[i]: 
            # hint: the derivative is 2 * dot product of feature_column and errors 
            derivative = 2 * ???? 
            # add the squared value of the derivative to the gradient magnitude (for assessing convergence) 
            gradient_sum_squares += (derivative * derivative) 
            # subtract the step size times the derivative from the current weight 
            weights[i] -= (step_size * derivative) 

        # compute the square root of the gradient sum of squares to get the gradient magnitude: 
        gradient_magnitude = ??? 
        # then check whether the magnitude is lower than the tolerance 
        if ???: 
            converged = True 
    # once the while loop breaks, return the weights 
    return weights 

I hope this expanded pseudocode helps you understand gradient descent better. I won't fill in the ???, so as not to spoil your homework.


Note that your RSS code is also unreadable and unmaintainable. It's easier to just do:

>>> import numpy as np 
>>> prediction = np.array([1,2,3]) 
>>> output = np.array([1,1,5]) 
>>> residual = output - prediction 
>>> RSS = sum(residual * residual) 
>>> RSS 
5 

Going through the numpy basics will take you a long way in machine learning, letting you handle matrix and vector operations without going nuts over iteration: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.html


You can easily change the tolerance-based code to a fixed number of iterations (a for loop); you only need to change how you control the outer loop. But my preference is to iterate until the tolerance signals convergence (a while loop). – alvas
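
For reference, here is a minimal sketch of that fixed-iteration variant (my addition, not part of the comment; it also vectorizes the gradient as 2 * X^T(Xw - y) instead of looping over the weights):

    import numpy as np 

    def regression_gradient_descent_fixed(feature_matrix, output, initial_weights, 
                                          step_size, n_iterations): 
        weights = np.array(initial_weights, dtype=float) 
        # A for loop over a fixed iteration count replaces the tolerance-based while loop 
        for _ in range(n_iterations): 
            errors = np.dot(feature_matrix, weights) - output 
            gradient = 2 * np.dot(feature_matrix.T, errors)  # full RSS gradient in one step 
            weights -= step_size * gradient 
        return weights 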


It is this simple:

def mean(values): 
    return sum(values)/float(len(values)) 

def variance(values, mean): 
    return sum([(x-mean)**2 for x in values]) 

def covariance(x, mean_x, y, mean_y): 
    covar = 0.0 
    for i in range(len(x)): 
        covar += (x[i] - mean_x) * (y[i] - mean_y) 
    return covar 

def coefficients(dataset): 
    x = [] 
    y = [] 

    for line in dataset: 
        xi, yi = map(float, line.split(',')) 
        x.append(xi) 
        y.append(yi) 

    dataset.close()        

    x_mean, y_mean = mean(x), mean(y) 

    b1 = covariance(x, x_mean, y, y_mean)/variance(x, x_mean) 
    b0 = y_mean-b1*x_mean 

    return [b0, b1] 

dataset = open('trainingdata.txt') 

b0, b1 = coefficients(dataset) 

n = float(input())  # raw_input() in Python 2 

print(b0+b1*n) 
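
As a quick sanity check (my addition, not part of the original answer), these closed-form coefficients, b1 = covariance(x, y)/variance(x) and b0 = mean(y) - b1*mean(x), can be cross-checked against numpy's built-in least-squares fit:

    import numpy as np 

    x = [1.0, 2.0, 3.0, 4.0]  # hypothetical inputs 
    y = [2.0, 4.1, 5.9, 8.0]  # hypothetical targets 

    b1, b0 = np.polyfit(x, y, 1)  # degree-1 fit returns [slope, intercept] 
    print(b0, b1)                 # should match coefficients() above 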

Reference: www.machinelearningmastery.com/implement-simple-linear-regression-scratch-python/