
Linear Regression Using Stochastic Gradient Descent in Python

Updated: Sep 2, 2021

As artificial intelligence becomes more popular, more people are trying to understand neural networks and how they work. In short, neural networks are computer systems designed to learn and improve, loosely inspired by the human brain.


In this blog, I will show you an example of using Linear Regression in Python (code at https://github.com/anaypant/LinearRegression/tree/master).


Consider an example of a convolutional neural network. Each neural network takes a certain number of inputs and returns a certain number of outputs. In this example network, there are 3 columns of nodes in the middle; each column is called a hidden layer. There is no exact answer to how many layers a neural network needs to have. To learn more about neural networks, see this video by 3Blue1Brown:


In machine learning, there are many different types of algorithms used for different tasks. The one I will show you today is known as Linear Regression. In Linear Regression, the inputs are numbers, and so are the outputs.

One common everyday example of Linear Regression is estimating house prices. In that model, there would be multiple inputs (different features of the house) and only one output (the estimated price). Some inputs could be the square footage of the house, the number of bedrooms, the number of bathrooms, the age of the house in days, and so on. The output would then be the estimated value of the house.

First things first, how will we use the weights (synapses) of our model to find an approximated output? We will do this using a simple formula:

y = m * x + b

This formula is simple yet effective. In this equation, the output (y) equals the input (x) times the slope (m), plus a bias or y-intercept (b).

For our Linear Regression model, we will fine-tune two parameters in this equation: the slope, and the bias.
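As a quick illustration (the numbers here are made up just for this example), here is how that formula turns an input into a prediction once a slope and a bias have been chosen:

# Hypothetical slope and bias, purely to illustrate y = m * x + b
m = 2.0   # slope
b = 1.0   # bias (y-intercept)
x = 3.0   # input
y = m * x + b
print(y)  # 7.0

Our model's whole job is to find the m and b that make predictions like this match the real data as closely as possible.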


 

Coding a Linear Regression Model


Now, we will get to see our Linear Regression model in Python!

— — Editor’s Note: — —

You will NOT need any external libraries/packages for this model. However, this is a very simplistic version, so feel free to add your own twist to the code.

Let’s start with the code:



class linearregression():
    def __init__(self, x, y, iteration):
        self.x = x
        self.y = y
        self.epoch = iteration
        self.m = 1
        self.b = 0
        self.learn = 0.01

This is the first part of our code. Here, we are defining a class in Python. A class is like a 'type' in Python: from it, one can create multiple 'objects'.

For more in-depth info on classes, visit this video by CS Dojo:


In this class, we are taking in 3 inputs from the user: the x values, the y values, and the number of iterations.

If we think of this on a coordinate plane, x holds the x coordinates and y holds the corresponding y coordinates. Each input is a list of numbers, and each corresponding pair of elements forms a coordinate (like in geometry!)

The iterations value, in this case, is the number of training loops we want our model to run. I have set it to 1000, but you can change that number.

We are also setting some initial values in the __init__ function. The most important are m and b: these are the tunable parameters (they will change based on the corrections we give them). The learn value is the learning rate, which controls how big each correction step is.
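Just to make this concrete, creating an object from this class might look like the snippet below (the sample data is made up for illustration; the true relationship is y = 2x, so a well-trained model should end up with m near 2 and b near 0):

# Hypothetical sample data following y = 2x
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 6, 8, 10]
model = linearregression(x_values, y_values, 1000)  # 1000 training iterations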

Let’s move on to the next part of our model!


def feedforward(self):
    self.output = []
    for i in range(len(self.x)):
        self.output.append(self.x[i] * self.m + self.b)

In this code, we are creating a feedforward function. Feedforward means to "try out" your model, even if it is not trained yet. Our outputs here are stored in the output list. (Also notice our equation being put to use here!)


Feedforward works out what our model currently thinks the outputs should be. Then, we will see how wrong we are and learn from it.
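To make that concrete, here is roughly what an untrained model would say for some made-up data (remember, m starts at 1 and b starts at 0, so at first the output is just a copy of the input):

model = linearregression([1, 2, 3], [2, 4, 6], 1000)  # hypothetical data
model.feedforward()
print(model.output)  # [1, 2, 3] -- with m = 1 and b = 0, the model just echoes x for now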


 


Onto the next part of our code!


def calculate_error(self):
    self.totalError = 0
    for i in range(len(self.x)):
        self.totalError += (self.y[i] - self.output[i]) ** 2
    self.error = float(self.totalError / float(len(self.x)))
    return self.totalError / float(len(self.x))

In this code, we are calculating the error of our model!

We initialize a totalError variable in this function. This variable will hold the error of our model. How do we compute it? We take the average (mean) of all the squared errors!

In Python terms, we iterate with a "for" loop. On each iteration, we add the squared difference between the true value (y) and our model's predicted value (output). Dividing the total by the number of points then gives us the error of our model.

## NOTE:

## In this function, you may notice that we are squaring the error value on each iteration! Why? We want all of the error values to be positive (negative * negative = positive).

Furthermore, it is not so much the sign of the error that we care about as its magnitude. Squaring also means that larger errors count for much more than smaller ones.
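Here is the same calculation done by hand on a tiny made-up example, so you can see exactly where the number comes from:

# Suppose the true values are y = [2, 4, 6] and our model predicted output = [1, 2, 3].
# Differences:  (2 - 1) = 1,  (4 - 2) = 2,  (6 - 3) = 3
# Squared:       1, 4, 9
# totalError:    1 + 4 + 9 = 14
# Mean error:    14 / 3 ≈ 4.67
squared_errors = [(2 - 1) ** 2, (4 - 2) ** 2, (6 - 3) ** 2]
print(sum(squared_errors) / len(squared_errors))  # 4.666...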


 

Now, we will move on to the most challenging piece of code:

Stochastic Gradient Descent!!


def gradient_descent(self):
    self.b_grad = 0
    self.m_grad = 0
    N = float(len(self.x))
    for i in range(len(self.x)):
        self.b_grad += -(2/N) * (self.y[i] - ((self.m * self.x[i]) + self.b))
        self.m_grad += -(2/N) * self.x[i] * (self.y[i] - ((self.m * self.x[i]) + self.b))
    self.m -= self.learn * self.m_grad
    self.b -= self.learn * self.b_grad

Now, I know this looks tricky (it isn't once you get used to it!)


Let’s take this one step at a time.


Stochastic Gradient Descent is a method that iterates over our data and steadily minimizes the error. How? It computes the gradient (the slope) of the error with respect to each parameter, then nudges each parameter a small step in the direction that makes the error smaller.


Here are the formulas used in gradient descent (they are the partial derivatives of our mean squared error with respect to m and b, where N is the number of data points):

gradient for m: -(2/N) * Σ x_i * (y_i - (m * x_i + b))
gradient for b: -(2/N) * Σ (y_i - (m * x_i + b))

At first, I was lost with this subject, so I referred to Wikipedia and other websites to help me.



First, we initialize two main values: m_grad (the gradient for the slope) and b_grad (the gradient for the bias). These are the correction values computed by our function.


To make it less mind-boggling than it looks in the formulas above, I'll try my best to break it down.


FOR THE BIAS (B): We have our equation y = m * x + b, aka our outputs. We then find the simple error of the true y minus the output, and multiply it by a constant that comes from taking the derivative of the mean squared error (-2/N, where N is the number of elements in our list).


FOR THE SLOPE (M): This one is almost the same as the formula for the bias, except we also multiply each term by the input (x). We iterate over EVERY input for both gradients, adding as we go (this is also why we are NOT reusing the error variable from earlier: it was squared, so it only holds positive values, while the gradients need the sign of each error).
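If the formulas still feel abstract, here is a single update step written out in plain Python on made-up numbers (m = 1, b = 0, learning rate 0.01), just to show which direction the parameters move:

# True relationship is y = 2x, so m should grow toward 2
x = [1, 2, 3]
y = [2, 4, 6]
m, b, learn = 1.0, 0.0, 0.01
N = float(len(x))

m_grad = sum(-(2 / N) * x[i] * (y[i] - (m * x[i] + b)) for i in range(len(x)))
b_grad = sum(-(2 / N) * (y[i] - (m * x[i] + b)) for i in range(len(x)))

m -= learn * m_grad   # m goes up a little (from 1.0 to about 1.093)
b -= learn * b_grad   # b goes up a little too (from 0.0 to 0.04)
print(m, b)

Repeat that step a thousand times (that's the iteration count from earlier) and m and b settle close to the true line.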


And that’s stochastic gradient descent! (Kind of… )


If you are still clueless (and I don’t blame you), visit this video by 3blue1brown.



Next Part of the code!

def backprop(self):
    self.error = self.calculate_error()
    self.gradient_descent()

This is pretty simple code; we are just calling our earlier functions and grouping them into this backpropagation function.


Backpropagation is the generic term for finding the error of your model, and then making the necessary changes to learn from it.


Now… THE FINAL FULL CODE (also available on my GitHub page linked at the beginning):


import matplotlib.pyplot as plt  # not used in this minimal version, but handy if you want to plot the results

class linearregression():

    def __init__(self, x, y, iteration):
        # Initializing some variables
        self.x = x
        self.y = y
        self.epoch = iteration
        # These are the tuned parameters that make our model better
        self.m = 1
        self.b = 0
        # This is how fast we want our model to learn
        self.learn = 0.01

    def feedforward(self):
        self.output = []
        for i in range(len(self.x)):
            self.output.append(self.x[i] * self.m + self.b)

    def calculate_error(self):
        self.totalError = 0
        for i in range(len(self.x)):
            # The reason we square is so all error values are positive.
            self.totalError += (self.y[i] - self.output[i]) ** 2
        self.error = float(self.totalError / float(len(self.x)))
        return self.totalError / float(len(self.x))

    def gradient_descent(self):
        self.b_grad = 0
        self.m_grad = 0
        N = float(len(self.x))
        for i in range(len(self.x)):
            self.b_grad += -(2/N) * (self.y[i] - ((self.m * self.x[i]) + self.b))
            self.m_grad += -(2/N) * self.x[i] * (self.y[i] - ((self.m * self.x[i]) + self.b))
        self.m -= self.learn * self.m_grad
        self.b -= self.learn * self.b_grad

    def backprop(self):
        self.error = self.calculate_error()
        self.gradient_descent()

    def predict(self):
        while True:  # This loop can be removed if you don't want to predict values forever
            self.user = input("\nInput an x value. ")
            self.user = float(self.user)
            self.ret = self.user * self.m + self.b
            print("Expected y value is: " + str(self.ret))

    def train(self):
        for i in range(self.epoch):
            self.feedforward()
            self.backprop()
        self.predict()

Long code, right? Well, not really.


Anyways, to sum it up, this linear regression model takes in a list of inputs, learns the slope and bias that fit them, and returns sensible outputs for new values! For more info, look at my GitHub.
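If you want to see it run end to end, here is one way you might use the class (the sample data is made up for illustration; the true relationship is y = 2x + 1, so after training m should land near 2 and b near 1):

# Hypothetical data following y = 2x + 1
x_values = [1, 2, 3, 4, 5, 6]
y_values = [3, 5, 7, 9, 11, 13]

model = linearregression(x_values, y_values, 1000)
model.train()  # trains for 1000 iterations, then starts asking for x values to predict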


Thanks for Reading!


