
Calculate Cost Function and Gradient Descent for Univariate Linear Regression in R

I was following Andrew Ng's Machine Learning class and soon got overwhelmed by all the mathematical notation and equations. I decided to take baby steps: understand the concepts first, and then try them out in R.

So, here goes the first exercise on Linear Regression with one variable in R. Here, we will predict profits of a food truck based on the population of the cities.

We will divide this exercise into the steps below:
  1. Load Data
  2. Plot the Data
  3. Create a Cost Function
  4. Create a Gradient Descent Function
  5. Plot the Gradient Descent Results
Let's start with the exercise!

1. Load Data: For this exercise, the training data is provided in a text file with 2 columns and 97 records. x is the population of a city in 10,000s and y is the profit of a food truck in that city in $10,000s.

Let's take a look at the data structure.
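The loading code did not survive in this copy of the post, so here is a minimal sketch of what this step could look like. The file written below is a hypothetical stand-in with a few made-up rows (not the real 97 records), and the comma separator and the names `data_path` and `food_truck` are my assumptions.

```r
# Hypothetical stand-in for the exercise file: five made-up rows
# (population in 10,000s, profit in $10,000s), NOT the real 97 records.
data_path <- tempfile(fileext = ".txt")
writeLines(c("5.0,4.0", "6.0,6.5", "7.5,8.0", "10.0,12.0", "12.5,15.5"),
           data_path)

# Assumption: the file is comma-separated and has no header row.
food_truck <- read.csv(data_path, header = FALSE, col.names = c("x", "y"))

str(food_truck)   # column types and the first few values
head(food_truck)  # first rows of the data
nrow(food_truck)  # number of records (this would be 97 for the real file)
```

With the real file, only `data_path` would change; the rest of the sketch should work as is, provided the separator assumption holds.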

2. Plot the data:

Let's first visualize the data and see whether a straight line through the points looks like a reasonable fit.
Below is the plot:
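The original figure is not reproduced here. A comparable scatter plot can be sketched in base R as below. The data frame holds made-up illustrative values, and the reference line comes from R's built-in `lm()` (ordinary least squares), which is my substitution for a quick visual check and not necessarily how the original plot was drawn.

```r
# Made-up illustrative values (population in 10,000s, profit in $10,000s).
food_truck <- data.frame(x = c(5.0, 6.0, 7.5, 10.0, 12.5),
                         y = c(4.0, 6.5, 8.0, 12.0, 15.5))

# Scatter plot of profit against population.
plot(food_truck$x, food_truck$y,
     pch = 4, col = "red",
     xlab = "Population of city (in 10,000s)",
     ylab = "Profit (in $10,000s)",
     main = "Food truck profit vs. city population")

# Reference line from ordinary least squares (assumption: used only
# as a visual guide; gradient descent is covered later in the post).
fit <- lm(y ~ x, data = food_truck)
abline(fit, col = "blue")
```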

3. Create a Cost Function: As we already know, a Linear Regression Model is represented by a linear equation like the one below:
Y = h(X) = 𝛉1 + 𝛉2X
Y = output variable/target variable (Profit)
X = input variable (Population)
m = number of training examples (in this example, it is 97)

𝛉1 = the intercept
𝛉2 = the slope
We need to find the line that best fits our data set, which means choosing the best values for 𝛉1 and 𝛉2. We can measure how far our predictions are from the actual values with a cost function J(𝛉1,𝛉2): the smaller the cost, the better the fit. In this step, we will create this Cost Function so that we can check the convergence of our Gradient Descent Function. The details on the mathematical representation of a linear regression model are here. The equation for the cost function is as below:

$$J(\theta_1,\theta_2)=\frac{1}{2m}\sum_{i=1}^{m}\left(\hat{y}_i-y_i\right)^2$$


Let's create the cost function J(𝛉1,𝛉2) in R.
First, we will set 𝛉1 = 0 and 𝛉2 = 0 and calculate the cost.
With these starting values, the cost J(0,0) for our data is 32.07273.
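The R code for this step is missing from this copy of the post, so the following is a minimal sketch of a cost function, not necessarily the original one. The name `compute_cost`, the design matrix `X` (a column of 1s for the intercept next to the x values) and the tiny data set are my assumptions. Each prediction is ŷ_i = 𝛉1 + 𝛉2·x_i, and the cost is half the mean of the squared errors.

```r
# Cost J(theta1, theta2) = 1/(2m) * sum((y_hat - y)^2)
# X:     m x 2 design matrix, first column all 1s (intercept), second column x
# y:     numeric vector of m targets
# theta: numeric vector c(theta1, theta2) = c(intercept, slope)
compute_cost <- function(X, y, theta) {
  m <- length(y)
  predictions <- as.vector(X %*% theta)  # y_hat for every training example
  sum((predictions - y)^2) / (2 * m)
}

# Tiny made-up example where y = 2 * x exactly.
x <- c(1, 2, 3)
y <- c(2, 4, 6)
X <- cbind(1, x)

compute_cost(X, y, c(0, 0))  # (4 + 16 + 36) / 6 = 9.333333
compute_cost(X, y, c(0, 2))  # 0, because the line y = 2x fits perfectly
```

The value 32.07273 quoted above comes from the real 97-record data set, so the toy numbers here will not reproduce it.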

4. Create a Gradient Descent Function: Next, we will create a Gradient Descent Function to minimize the value of the cost function J(𝛉1,𝛉2). We will start with some initial values of 𝛉1 and 𝛉2 and keep changing them until we reach the minimum value of J(𝛉1,𝛉2), i.e. the best fit for the line that passes through the data points. The Gradient Descent equation for a linear regression model is as below:

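Repeat until convergence (updating 𝛉1 and 𝛉2 simultaneously):

$$\theta_1 := \theta_1 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i-y_i\right)$$

$$\theta_2 := \theta_2 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i-y_i\right)x_i$$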
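The R implementation is also cut off in this copy, so here is a rough sketch of batch gradient descent under a few assumptions of mine: the function name `gradient_descent`, a fixed number of iterations in place of a real convergence check, the learning rate `alpha = 0.05`, and a tiny made-up data set. Both parameters are updated at once through a single vectorised step, which gives the simultaneous update, and the cost is recorded after every iteration so that convergence can be plotted.

```r
# Batch gradient descent for univariate linear regression.
# X: m x 2 design matrix (column of 1s, then x); y: targets;
# theta: starting c(theta1, theta2); alpha: learning rate.
gradient_descent <- function(X, y, theta, alpha, num_iters) {
  m <- length(y)
  cost_history <- numeric(num_iters)
  for (i in seq_len(num_iters)) {
    errors <- as.vector(X %*% theta) - y          # y_hat - y
    gradient <- as.vector(t(X) %*% errors) / m    # both partial derivatives
    theta <- theta - alpha * gradient             # simultaneous update
    cost_history[i] <- sum((as.vector(X %*% theta) - y)^2) / (2 * m)
  }
  list(theta = theta, cost_history = cost_history)
}

# Tiny made-up example where y = 1 + 2 * x exactly.
x <- c(1, 2, 3, 4)
y <- c(3, 5, 7, 9)
X <- cbind(1, x)

result <- gradient_descent(X, y, theta = c(0, 0), alpha = 0.05,
                           num_iters = 5000)
result$theta  # should end up very close to c(1, 2)

# The cost should fall quickly and then flatten out as theta converges.
plot(result$cost_history, type = "l",
     xlab = "Iteration", ylab = "Cost J")
```

If the cost curve rises instead of falling, the learning rate is probably too large for the data; trying a smaller `alpha` is the usual first fix.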


This post first appeared on What The Data Says; please read the original post: here
