I was trying to follow the Machine Learning class of Andrew Ng and was soon got overwhelmed with all the mathematical representations and the equations. I decided to take small baby steps and understand the concepts first and then try out the same in R.


So, here goes the first exercise on Linear Regression with one variable in R. Here, we will predict profits of a food truck based on the population of the cities.

We will divide this exercise into the following steps:

- Load Data
- Plot the Data
- Create a Cost Function
- Create a Gradient Descent Function
- Plot the Gradient Descent Results

**For this exercise, the test data is provided in a text file with 2 columns and 97 records. x refers to the population size in 10,000s and y refers to the profit in $10,000s.**

**1. Load Data:**

Let's take a look at the data structure.
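A minimal sketch of loading the data in R; the file name `ex1data1.txt` and the column names `population`/`profit` are assumptions based on the course exercise (the file itself is two comma-separated columns with no header).

```r
# Load the training data; file name and column names are assumptions
# based on the course exercise (comma-separated, no header row).
data <- read.csv("ex1data1.txt", header = FALSE,
                 col.names = c("population", "profit"))

# Inspect the structure: we expect 97 observations of 2 numeric variables.
str(data)
head(data)
```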

**2. Plot the Data:**

Let's first visualize the data and try to fit a line through it.
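The scatter plot can be produced with base R graphics along these lines; it assumes the data frame from the previous step is named `data` with columns `population` and `profit` (those names are my assumption, not fixed by the exercise).

```r
# Assumes the data frame loaded in step 1 (names are assumptions).
data <- read.csv("ex1data1.txt", header = FALSE,
                 col.names = c("population", "profit"))

# Scatter plot of profit against population.
plot(data$population, data$profit,
     col = "red", pch = 4,
     xlab = "Population of City in 10,000s",
     ylab = "Profit in $10,000s",
     main = "Food Truck Profit vs. City Population")
```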

Below is the plot:

**3. Create a Cost Function:**

As we already know, a linear regression model is represented by a linear equation like the one below:

Y = h(𝛉) = 𝛉1 + 𝛉2X

Y = output/target variable (Profit)

X = input variable (Population)

m = number of training examples (97 in this example)

𝛉1 = the intercept

𝛉2 = the slope

We need to find the line that best fits our current data set. To get the best fit, we need to choose the best values for 𝛉1 and 𝛉2. We can measure the accuracy of our prediction by using a cost function J(𝛉1,𝛉2). At this step, we will create this cost function so that we can check the convergence of our gradient descent function. The details on the mathematical representation of a linear regression model are here. The equation for the cost function is as below:

J(𝛉1,𝛉2) = (1/2m) ∑ᵢ₌₁ᵐ (ŷ⁽ⁱ⁾ − y⁽ⁱ⁾)²

Let's create the cost function J(𝛉1,𝛉2) in R.
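A sketch of the cost function in R, using a design matrix X whose first column is all 1s so that 𝛉1 (the intercept) and 𝛉2 (the slope) can be handled as one vector; the function name `computeCost` is my choice, not fixed by the exercise.

```r
# Cost function J(theta) for linear regression.
# X: m x 2 design matrix (first column all 1s for the intercept),
# y: vector of m targets, theta: c(theta1, theta2).
computeCost <- function(X, y, theta) {
  m <- length(y)
  predictions <- X %*% theta          # hypothesis: h(theta) = theta1 + theta2 * x
  sum((predictions - y)^2) / (2 * m)  # J = (1/2m) * sum of squared errors
}

# Usage with the loaded data (column names are assumptions):
# X <- cbind(1, data$population)
# computeCost(X, data$profit, c(0, 0))
```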

First, we will set the value of 𝛉1 = 0 and 𝛉2 = 0 and will calculate the cost.

Thus, the value of the cost J(0,0) is **32.07273**.

**4. Create a Gradient Descent Function:**

Next, we will create a gradient descent function to minimize the value of the cost function J(𝛉1,𝛉2). We will start with some values of 𝛉1 and 𝛉2 and keep changing them until we reach the minimum of J(𝛉1,𝛉2), i.e. the best fit for the line that passes through the data points. The gradient descent equations for a linear regression model are as below:

Repeat until convergence: {

𝛉1 := 𝛉1 − α(1/m) ∑ᵢ₌₁ᵐ (h𝛉(x⁽ⁱ⁾) − y⁽ⁱ⁾)

𝛉2 := 𝛉2 − α(1/m) ∑ᵢ₌₁ᵐ (h𝛉(x⁽ⁱ⁾) − y⁽ⁱ⁾)x⁽ⁱ⁾

} (updating 𝛉1 and 𝛉2 simultaneously, where α is the learning rate)
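The update rule above can be sketched in R as follows; the function name, and the choices of α and the iteration count in the usage comment, are my assumptions (the course exercise uses α = 0.01 and 1500 iterations). Writing the gradient as t(X) %*% error updates both parameters simultaneously in one step.

```r
# Gradient descent for linear regression.
# X: m x 2 design matrix (first column all 1s), y: m targets,
# theta: starting c(theta1, theta2), alpha: learning rate.
gradientDescent <- function(X, y, theta, alpha, num_iters) {
  m <- length(y)
  J_history <- numeric(num_iters)   # track the cost to check convergence
  for (i in seq_len(num_iters)) {
    error <- X %*% theta - y                        # h(theta) - y for all examples
    theta <- theta - (alpha / m) * (t(X) %*% error) # simultaneous update
    J_history[i] <- sum((X %*% theta - y)^2) / (2 * m)
  }
  list(theta = theta, J_history = J_history)
}

# Usage (assumed names, as before):
# X <- cbind(1, data$population)
# result <- gradientDescent(X, data$profit, c(0, 0), 0.01, 1500)
```

Plotting `J_history` afterwards is a quick way to confirm the cost is decreasing, i.e. that gradient descent is converging.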