
Derive Gradient Descent for a Univariate Linear Regression Model

I think whoever starts learning Machine Learning checks out the Machine Learning class by Andrew Ng, and I was no exception. But the mathematical equations and all the formulae made me realize how little I remember of my +2 maths; I was stuck at each and every step. Then again, it opened up a whole new world of Machine Learning.

In one of the videos, there is a derivation of the Gradient Descent equation for the Linear Regression model. It took me some time and some help to figure it out, and now I must write it down. So, this post is all about deriving the Gradient Descent equation for a univariate Linear Regression model.

A univariate Linear Regression model is represented by a straight-line equation as below:

ŷ = θ0 + θ1x
A better fit of the straight line ensures a better prediction of the data. The best fit of this straight-line equation is obtained by determining the optimum values of θ0 and θ1, and these optimum values are derived using the Gradient Descent equation.
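To make the model concrete before the derivation starts, here is a minimal Python sketch of the hypothesis; the function name predict and the sample numbers are illustrative choices of my own, not from the original post.

def predict(theta0, theta1, x):
    # Univariate hypothesis: y-hat = theta0 + theta1 * x
    return theta0 + theta1 * x

# Example: with theta0 = 1.0 and theta1 = 2.0, x = 3.0 predicts 7.0
print(predict(1.0, 2.0, 3.0))  # 7.0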

So, let the fun begin step by step.
Step 1:
The generalized Gradient Descent equation is as below:
θj = θj - α * ∂/∂θj J(θ0, θ1)
where j = 0, 1 represents the feature index number,
J(θ0, θ1) is the Cost Function, and
α is a constant called the learning rate.
Step 2:
J(θ0, θ1) = 1/(2m) * ∑(ŷi - yi)²
where m = the total number of examples in the training dataset,
ŷi = the predicted value for the i-th example = θ0 + θ1xi,
yi = the actual value of y for the i-th example, and
∑ = the sum over i = 1 to m.
Now, let's substitute J(θ0, θ1) into the above equation:
θj = θj - α * ∂/∂θj (1/(2m) * ∑(ŷi - yi)²)
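Before going further, the Cost Function itself is easy to express in code. Below is a plain-Python sketch of J(θ0, θ1); compute_cost and the toy numbers are hypothetical names and values of my own.

def compute_cost(theta0, theta1, xs, ys):
    # J(theta0, theta1) = 1/(2m) * sum of (y-hat_i - y_i)^2
    m = len(xs)
    total = 0.0
    for x_i, y_i in zip(xs, ys):
        y_hat_i = theta0 + theta1 * x_i  # prediction for example i
        total += (y_hat_i - y_i) ** 2
    return total / (2 * m)

# Example: cost of theta0 = 0, theta1 = 0 on two points
print(compute_cost(0.0, 0.0, [1.0, 2.0], [3.0, 5.0]))  # (9 + 25) / 4 = 8.5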
Step 3:
The next step is to replace ŷi with its definition:
ŷi = θ0 + θ1xi
θj = θj - α * ∂/∂θj (1/(2m) * ∑(θ0 + θ1xi - yi)²)
Step 4:
Next comes the derivative part, where we use two basic differentiation rules: the derivative of a sum is the sum of the derivatives, and a constant factor can be moved outside the derivative. So the constant 1/(2m) and the summation move outside ∂/∂θj, and the above equation is modified as below:
θj = θj - α/(2m) * ∑ ∂/∂θj (θ0 + θ1xi - yi)²
Step 5:
Now, we have to apply the Power Rule together with the Chain Rule: the derivative of u² is 2u times the derivative of u.
θj = θj - α/(2m) * ∑ 2(θ0 + θ1xi - yi)^(2-1) * ∂/∂θj (θ0 + θ1xi - yi)
   = θj - α/m * ∑ (θ0 + θ1xi - yi) * ∂/∂θj (θ0 + θ1xi - yi)
Note how the factor of 2 from the Power Rule cancels the 2 in the denominator, leaving α/m.
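A handy way to gain confidence in the Step 5 result is to compare this analytic gradient against a numerical finite-difference approximation of the cost. The sketch below reuses the compute_cost function from the Step 2 sketch and peeks ahead at the per-parameter derivatives (1 for θ0, xi for θ1) that Step 6 will derive; analytic_grad, numeric_grad, and eps are illustrative names and values of my own.

def analytic_grad(theta0, theta1, xs, ys, j):
    # Step 5 gradient: (1/m) * sum of (theta0 + theta1*x_i - y_i) * d_j,
    # where d_j = 1 for j = 0 and d_j = x_i for j = 1 (see Step 6).
    m = len(xs)
    return sum((theta0 + theta1 * x_i - y_i) * (1.0 if j == 0 else x_i)
               for x_i, y_i in zip(xs, ys)) / m

def numeric_grad(theta0, theta1, xs, ys, j, eps=1e-6):
    # Central difference approximation of dJ/dtheta_j.
    d0, d1 = (eps, 0.0) if j == 0 else (0.0, eps)
    return (compute_cost(theta0 + d0, theta1 + d1, xs, ys)
            - compute_cost(theta0 - d0, theta1 - d1, xs, ys)) / (2 * eps)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
for j in (0, 1):
    # The two numbers in each row should agree to several decimal places.
    print(analytic_grad(0.5, 0.5, xs, ys, j), numeric_grad(0.5, 0.5, xs, ys, j))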
Step 6:
Now we can calculate the θ0 and θ1 updates.
Let's start with θ0:
θ0 = θ0 - α/m * ∑(θ0 + θ1xi - yi) * ∂/∂θ0 (θ0 + θ1xi - yi)

Since all the terms except θ0 in the above derivative are treated as constants with respect to θ0,
∂/∂θ0 (θ0 + θ1xi - yi) = 1
Thus,
θ0 = θ0 - α/m * ∑(θ0 + θ1xi - yi)
   = θ0 - α/m * ∑(ŷi - yi)
Now we can calculate θ1:
θ1 = θ1 - α/m * ∑(θ0 + θ1xi - yi) * ∂/∂θ1 (θ0 + θ1xi - yi)
Since θ0 and yi are constants with respect to θ1, the derivative is as below:
∂/∂θ1 (θ0 + θ1xi - yi) = xi
Thus,
θ1 = θ1 - α/m * ∑(θ0 + θ1xi - yi) * xi
   = θ1 - α/m * ∑(ŷi - yi) * xi
Now we have derived the update rules for both θ0 and θ1 as below:
θ0 = θ0 - α/m * ∑(ŷi - yi)
θ1 = θ1 - α/m * ∑(ŷi - yi) * xi
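Putting the two derived update rules to work, here is a minimal gradient descent loop in plain Python. Everything here (the function name, the toy data, the learning rate α = 0.1, and the iteration count) is an illustrative sketch of my own, not code from the course.

def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    # Fit y-hat = theta0 + theta1 * x using the two derived updates.
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Errors (y-hat_i - y_i) under the current parameters.
        errors = [theta0 + theta1 * x_i - y_i for x_i, y_i in zip(xs, ys)]
        # Compute both gradients first so the update is simultaneous.
        grad0 = sum(errors) / m
        grad1 = sum(e * x_i for e, x_i in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Toy data lying exactly on y = 1 + 2x; the loop should recover
# theta0 close to 1 and theta1 close to 2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(gradient_descent(xs, ys))

One detail worth noting: both gradients are computed before either parameter is updated, which matches the simultaneous-update convention stressed in the course.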



