November 9th 2020

Today’s blog explores another vital statistical concept, linear regression. Let’s begin. Linear regression is commonly used in statistics for predictive modeling. It models the relationship between two variables, an independent (explanatory) variable X and a dependent (explained) variable Y, by fitting a linear equation (Y = b₀ + b₁X + U_i) to observed data.

Assumptions of linear regression

U_i is a random real variable, where U_i is the difference between the observed dependent variable Y and the predicted value of Y.
The mean of U_i in any particular period is zero.
The variance of U_i is constant in each period, i.e. for all values of X, U_i will show the same dispersion around its mean.
The variable U_i has a normal distribution, i.e. the values of U_i (for each X_i) have a bell-shaped, symmetrical distribution about their zero mean.
The random terms of different observations are independent, i.e. the covariance of any U_i with any other U_j is equal to zero.
U_i is independent of the explanatory variable X.
The X_i are a set of fixed values in the hypothesised process of repeated sampling which underlies the linear regression model.
If there is more than one explanatory variable, the explanatory variables are not perfectly linearly correlated with each other.
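
To make these assumptions a little more concrete, here is a minimal sketch in Python with NumPy. It simulates data that satisfies the assumptions by construction and then looks at the residuals of a fitted line. The true line (b₀ = 1, b₁ = 2), the sample size and the seed are all made-up values chosen for illustration. Note that a zero residual mean and zero correlation between residuals and X come out automatically from any least-squares fit with an intercept, so those two are sanity checks rather than real tests. The variance ratio and the lag-1 correlation are only rough, informal indicators of constant variance and independence.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, for repeatability only
x = np.linspace(0.0, 10.0, 200)         # fixed X values
u = rng.normal(0.0, 1.0, size=x.size)   # errors: zero mean, constant variance, normal
y = 1.0 + 2.0 * x + u                   # hypothetical true line: b0 = 1, b1 = 2

# Least-squares straight line; polyfit returns the highest power first.
b1_hat, b0_hat = np.polyfit(x, y, 1)
residuals = y - (b0_hat + b1_hat * x)

# Sanity checks that hold by construction for a least-squares fit.
resid_mean = residuals.mean()
resid_x_corr = np.corrcoef(residuals, x)[0, 1]

# Rough check of constant variance: compare the spread of the residuals
# in the upper half of the X range with the spread in the lower half.
half = x.size // 2
variance_ratio = residuals[half:].var(ddof=1) / residuals[:half].var(ddof=1)

# Rough check of independence: correlation between neighbouring residuals.
lag1_corr = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]

print("mean of residuals:", resid_mean)
print("corr(residuals, X):", resid_x_corr)
print("variance ratio (upper half / lower half):", variance_ratio)
print("lag-1 correlation of residuals:", lag1_corr)
```

On real data a variance ratio far from 1, or a lag-1 correlation far from 0, would suggest that the constant-variance or independence assumption deserves a closer look.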

The linear regression equation can be written as:

Y = b₀ + b₁X + U_i

Where,

Y is the dependent variable

X is the independent variable.

b₀ is the intercept (where the line crosses the vertical y-axis)

b₁ is the slope

U_i is the error term (the difference between the observed Y and the predicted Y), also called the residual or white noise.
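
As a small illustration of the equation above, the sketch below estimates b₀ and b₁ with the standard least-squares formulas for a single explanatory variable. The function name fit_simple_ols and the data points are hypothetical, chosen only for this example. The data lie exactly on the line Y = 2 + 3X, so the error term is zero for every observation.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (b0, b1) that minimise the sum of squared residuals of y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Illustrative (made-up) data lying exactly on Y = 2 + 3X.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

b0, b1 = fit_simple_ols(x, y)
residuals = y - (b0 + b1 * x)   # observed Y minus predicted Y

print("intercept b0:", b0)
print("slope b1:", b1)
print("residuals:", residuals)
```

With noisy data the same function applies unchanged; the residuals are then no longer zero, and they are what the assumptions about U_i describe.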

The Ordinary Least Squares (OLS) estimators used in simple linear regression have the following properties:-

Unbiased estimator:- E(b̂) = b, i.e. an estimator is unbiased if its bias is 0: E(b̂) – b = 0
Minimum Variance:- An estimate is best when it has the smallest variance as compared to any other estimate obtained from other econometric methods.
Efficient estimator:- When it has both the previous properties, i.e. it is unbiased and has minimum variance.
Linear estimator:- An estimator is linear if it is a linear function of the sample observations.
Best, Linear, Unbiased estimator (BLUE):- An estimator is BLUE if it is linear, unbiased and has the smallest variance among all linear unbiased estimators.
Minimum mean squared error (MSE) estimator:- It is a combination of the unbiasedness and minimum variance properties. An estimator is a minimum MSE estimator if it has the smallest mean squared error.
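
The unbiasedness and MSE properties can be illustrated with a small repeated-sampling experiment. The sketch below keeps X fixed, draws many samples from a hypothetical true model (b₀ = 1, b₁ = 3, error standard deviation 1, all made-up values), and estimates the slope in each sample. Under the assumptions listed earlier, the average of the estimates should sit close to the true slope, and the mean squared error splits into variance plus squared bias. This is a simulation under stated assumptions, not a proof.

```python
import numpy as np

rng = np.random.default_rng(42)            # fixed seed, for repeatability only
true_b0, true_b1, sigma = 1.0, 3.0, 1.0    # hypothetical "true" parameters
x = np.linspace(0.0, 10.0, 50)             # X held fixed across repeated samples
n_samples = 2000

x_dev = x - x.mean()
slopes = np.empty(n_samples)
for i in range(n_samples):
    # New random errors each time, same X values.
    y = true_b0 + true_b1 * x + rng.normal(0.0, sigma, size=x.size)
    # OLS slope estimate for this sample.
    slopes[i] = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

bias = slopes.mean() - true_b1              # estimate of E(b1_hat) - b1
variance = slopes.var()                     # spread of the estimates (ddof=0)
mse = np.mean((slopes - true_b1) ** 2)      # mean squared error around the truth

# Textbook variance of the OLS slope when X is fixed: sigma^2 / sum((x - x_mean)^2)
theoretical_variance = sigma ** 2 / np.sum(x_dev ** 2)

print("bias of slope estimates:", bias)
print("variance of slope estimates:", variance)
print("theoretical variance:", theoretical_variance)
print("MSE:", mse, " variance + bias^2:", variance + bias ** 2)
```

The last line shows why the MSE property combines the first two: an estimator with zero bias has an MSE equal to its variance, so among unbiased estimators the one with minimum variance also has minimum MSE.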

That wraps up the discussion on linear regression. Hopefully it cleared away any confusion you might have had and helped you get a grasp of the concept. A video discussion on the same topic is attached below this blog; check it out for further reference.

Continue to track the DexLab Analytics blog to find informative posts related to Python for data science training.

The post Linear Regression Part I: A Comprehensive Guide to Linear Regression appeared first on DexLab Analytics | Big Data Hadoop SAS R Analytics Predictive Modeling & Excel VBA.
