
Derive Gradient Descent for a Univariate Linear Regression Model

I think whoever starts learning Machine Learning checks out the Machine Learning class by Andrew Ng. I was no exception. But the mathematical equations and all the formulae made me realize how little of my +2 maths I remember. I was stuck at each and every step. But then again, it opened up a whole new world of Machine Learning.

In one of the videos, there is a derivation of the Gradient Descent equation for the Linear Regression model. It took me some time and some help to figure it out, and now I must write it down. So, this post is all about deriving the Gradient Descent equation for a univariate Linear Regression model.

A Univariate Linear Regression model is represented by a straight-line equation as below:

ŷ  = θ0 + θ1x
The better the straight line fits the data, the better the predictions. The best fit of this straight-line equation is obtained by determining the optimum values of θ0 and θ1, and these optimum values are derived using the Gradient Descent equation.
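As a quick sanity check before the derivation, here is a minimal sketch of the model ŷ = θ0 + θ1x in Python with NumPy (the function name and variable names are illustrative, not from the post):

```python
import numpy as np

def predict(x, theta0, theta1):
    """Prediction of the univariate model: ŷ = θ0 + θ1·x."""
    return theta0 + theta1 * np.asarray(x, dtype=float)

# Example: with θ0 = 1 and θ1 = 2, the input x = 3 predicts 7.
print(predict(3, 1.0, 2.0))  # → 7.0
```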

So, let the fun begin step by step.
The generalized Gradient Descent equation is as below:
θj = θj - α * ∂/∂θj J(θ0, θ1)
where j = 0, 1 represents the feature index number,
J(θ0, θ1) is the Cost Function, and
α is a constant called the learning rate.
J(θ0, θ1) = 1/2m * ∑(ŷi - yi)²
where m = total number of examples in the Training Dataset,
ŷi = the predicted value = θ0 + θ1xi,
yi = the actual value of y for the iᵗʰ example, and
∑ = the sum for i = 1 to m.
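The cost function above can be sketched directly in code. This is a hedged, minimal implementation assuming NumPy arrays; the function name `cost` is my own:

```python
import numpy as np

def cost(x, y, theta0, theta1):
    """Mean squared error cost: J(θ0, θ1) = 1/(2m) · ∑(ŷi − yi)²."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    y_hat = theta0 + theta1 * x       # predictions ŷi = θ0 + θ1·xi
    return np.sum((y_hat - y) ** 2) / (2 * m)

# A perfect fit gives zero cost: here y = 1 + 2x exactly.
print(cost([1, 2, 3], [3, 5, 7], 1.0, 2.0))  # → 0.0
```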
Now, let's replace J(θ0, θ1) in the above equation:
θj = θj - α * ∂/∂θj (1/2m * ∑(ŷi - yi)²)
The next step is to replace ŷi with its definition:
ŷi = θ0 + θ1xi
θj = θj - α * ∂/∂θj (1/2m * ∑(θ0 + θ1xi - yi)²)
Next comes the derivative part. Since the derivative of a sum is the sum of the derivatives, and the constant 1/2m can be pulled out of the derivative, the above equation becomes:
θj = θj - α/2m * ∑ ∂/∂θj (θ0 + θ1xi - yi)²
Now, we apply the chain rule together with the power rule to the squared term:
θj = θj - α/2m * ∑ 2(θ0 + θ1xi - yi) * ∂/∂θj (θ0 + θ1xi - yi)
    = θj - α/m * ∑ (θ0 + θ1xi - yi) * ∂/∂θj (θ0 + θ1xi - yi)
Now we can calculate the update rules for θ0 and θ1. Let's start with θ0:
θ0 = θ0 - α/m * ∑(θ0 + θ1xi - yi) * ∂/∂θ0 (θ0 + θ1xi - yi)

Since all the terms except θ0 in the above derivative are treated as constants with respect to θ0,
∂/∂θ0 (θ0 + θ1xi - yi) = 1
θ0 = θ0 - α/m * ∑(θ0 + θ1xi - yi)
    = θ0 - α/m * ∑(ŷi - yi)
Now we can calculate θ1:
θ1 = θ1 - α/m * ∑(θ0 + θ1xi - yi) * ∂/∂θ1 (θ0 + θ1xi - yi)
Since θ0 and yi are constants with respect to θ1, the derivative is as below:
∂/∂θ1 (θ0 + θ1xi - yi) = xi
θ1 = θ1 - α/m * ∑(θ0 + θ1xi - yi) * xi
    = θ1 - α/m * ∑(ŷi - yi) * xi
Now, we have derived the update rules for both θ0 and θ1:
θ0 = θ0 - α/m * ∑(ŷi - yi)
θ1 = θ1 - α/m * ∑(ŷi - yi) * xi
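The two derived update rules can be put to work in a short gradient descent loop. This is a minimal sketch under my own assumptions (zero initialization, a fixed learning rate of 0.1, and simultaneous updates of both parameters, as the algorithm requires):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iterations=1000):
    """Repeatedly apply the derived updates:
       θ0 ← θ0 − α/m · ∑(ŷi − yi)
       θ1 ← θ1 − α/m · ∑(ŷi − yi)·xi
    Both parameters are updated simultaneously in each iteration."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        error = theta0 + theta1 * x - y   # (ŷi − yi) for every example
        grad0 = np.sum(error) / m
        grad1 = np.sum(error * x) / m
        theta0 -= alpha * grad0           # simultaneous update
        theta1 -= alpha * grad1
    return theta0, theta1

# Recover the line y = 1 + 2x from noiseless data.
t0, t1 = gradient_descent([0, 1, 2, 3], [1, 3, 5, 7])
print(round(t0, 3), round(t1, 3))
```

With noiseless data from a true line, the loop converges to θ0 ≈ 1 and θ1 ≈ 2, which is a quick way to check the derivation.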

This post first appeared on What The Data Says.
