Linear Regression Formula


Linear regression is one of the most basic and commonly used forms of predictive analysis. One variable is treated as the explanatory (independent) variable, and the other is treated as the dependent variable. For example, a modeler might use linear regression to relate the weights of individuals to their heights.


Simple Linear Regression

  • One dependent variable (interval or ratio).

  • One independent variable (interval, ratio, or dichotomous).

Multiple Linear Regression

  • One dependent variable (interval or ratio).

  • Two or more independent variables (interval, ratio, or dichotomous).

Logistic Regression

  • One dependent variable (binary).

  • One or more independent variable(s) (interval, ratio, or dichotomous).

Ordinal Regression

  • One dependent variable (ordinal).

  • One or more independent variable(s) (nominal or dichotomous).

Multinomial Regression

  • One dependent variable (nominal).

  • One or more independent variable(s) (interval, ratio, or dichotomous).

Discriminant Analysis

  • One dependent variable (nominal).

  • One or more independent variable(s) (interval or ratio).

What is Linear Regression?

Linear regression is a linear method for modeling the relationship between one or more independent (explanatory) variables and a dependent variable. It is widely used because it makes the dependency between two variables easy to analyze, and the linearity of the fitted relationship makes the model easy to interpret. Statisticians, computer scientists, and others who tackle quantitative problems have long used linear regression models. For example, a statistician might use a linear regression model to relate the weights of individuals to their heights.


The Formula of Linear Regression

The linear regression equation is given by:

y = a + bx


a and b can be computed by the following formulas:

\[b = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^{2} - (\sum x)^{2}}\]

\[a = \frac{\sum y - b(\sum x)}{n}\]

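Where,

x and y are the variables for which the regression line is fitted, and each sum runs over all data points.

  • b = Slope of the line.

  • a = Y-intercept of the line.

  • x = Values of the first data set (the independent variable).

  • y = Values of the second data set (the dependent variable).

  • n = Number of (x, y) data pairs.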

Note: The first step in finding a linear regression equation is to determine if there is a relationship between the two variables. This is often a judgment call for the researcher. You’ll also need a list of your data in an x-y format (i.e. two columns of data - independent and dependent variables).
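The two formulas for a and b translate directly into a few lines of code. The following is a minimal Python sketch rather than a definitive implementation; the function name `fit_line` and the sample numbers are illustrative assumptions and do not come from the text above.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        # Every x is the same, so no unique slope exists
        raise ValueError("all x values are identical; slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # prints 1.0 2.0
```

The guard on the denominator reflects that the slope formula divides by n Σx² − (Σx)², which is zero when all x values coincide.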


Simple Linear Regression Formula Plotting

Table 1. Example data.

X       Y
1.00    1.00
2.00    2.00
3.00    1.30
4.00    3.75
5.00    2.25


(image will be uploaded soon)


Linear regression consists of finding the best-fitting straight line through the given points. This best-fitting line is known as the regression line. The black diagonal line in the figure (Figure 2) is the regression line; it gives the predicted score on Y for each possible value of X. The vertical lines from the points to the regression line represent the errors of prediction. The red point lies very near the regression line, so its error of prediction is small. By contrast, the yellow point lies much higher than the regression line, so its error of prediction is large.


(image will be uploaded soon)


In the figure, the black line consists of the predictions, the points are the actual data, and the vertical lines between the points and the black line represent the errors of prediction.
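To make the errors of prediction concrete, the sketch below fits a line to the Table 1 data using the summation formulas for a and b, then lists each point's error (observed Y minus predicted Y). It is an illustrative Python sketch; the variable names are assumptions, and the colours used in the figure are not modelled.

```python
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]
n = len(xs)

# Slope b and intercept a from the summation formulas for y = a + bx
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Predicted Y on the regression line and the error for each point
predicted = [a + b * x for x in xs]
errors = [y - p for y, p in zip(ys, predicted)]

print(f"regression line: y = {a:.3f} + {b:.3f}x")
for x, y, p, e in zip(xs, ys, predicted, errors):
    print(f"X={x:.2f}  Y={y:.2f}  predicted={p:.3f}  error={e:+.3f}")
```

For this data the fitted line is approximately y = 0.785 + 0.425x. The point at X = 4 has the largest error (about +1.265), and the errors sum to zero up to floating-point rounding.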


Properties of Linear Regression

For the regression line with regression parameters b0 and b1 (the a and b of y = a + bx), the properties are:

  • The line minimizes the sum of squared differences between observed values and predicted values.

  • The regression line passes through the point given by the means of the X and Y values.

  • The regression constant (b0) is equal to the y-intercept of the regression line.

  • The regression coefficient (b1) is the slope of the regression line, which is equal to the average change in the dependent variable (Y) for a unit change in the independent variable (X).
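These properties can be spot-checked numerically. The Python sketch below reuses the Table 1 data and confirms that the fitted line passes through the means and that a few nearby lines all have a larger sum of squared errors. The helper name `sse` and the perturbation sizes are assumptions chosen for illustration; the check is a sanity test, not a proof.

```python
from statistics import mean

xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]
n = len(xs)

# b1 (slope) and b0 (intercept) from the summation formulas
b1 = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / (
    n * sum(x * x for x in xs) - sum(xs) ** 2
)
b0 = (sum(ys) - b1 * sum(xs)) / n


def sse(intercept, slope):
    """Sum of squared differences between observed and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Property: the line passes through (mean of X, mean of Y)
passes_through_means = abs((b0 + b1 * mean(xs)) - mean(ys)) < 1e-9

# Property: nudging the intercept or slope increases the squared error
best = sse(b0, b1)
nearby = [sse(b0 + da, b1 + db) for da in (-0.1, 0.1) for db in (-0.05, 0.05)]
is_minimum = all(best < other for other in nearby)

print(passes_through_means, is_minimum)  # prints True True
```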

What is Linear Regression Used for?

Linear regression is used in areas such as:

  • Studying engine performance from test data in automobiles.

  • Market research studies and the analysis of customer survey results.

  • Observational astronomy. A number of statistical tools and methods are used in astronomical data analysis, and there are entire libraries in languages like Python meant for data analysis in astrophysics.

  • Analyzing the effect of marketing, pricing, and promotions on the sales of a product.

Questions to be Solved

Question 1) Find the linear regression equation for the given set of data.

X    2    3    5    8
Y    3    6    5    12


Solution)

X           Y           X^2          XY
2           3           4            6
3           6           9            18
5           5           25           25
8           12          64           96
Sum = 18    Sum = 26    Sum = 102    Sum = 145


Using the simple linear regression formula with n = 4,

\[b = \frac{n \sum xy - (\sum x)(\sum y)}{n \sum x^{2} - (\sum x)^{2}}\]

\[b = \frac{4 \times 145 - 18 \times 26}{4 \times 102 - 324} = \frac{112}{84} \approx 1.33\]

Now use the formula for a:

\[a = \frac{\sum y - b(\sum x)}{n}\]

\[a = \frac{26 - 1.33 \times 18}{4} = \frac{2.06}{4} = 0.515\]

Putting the values of a and b in the equation y = a + bx,

Answer: y = 0.515 + 1.33x.

Note: the value 0.515 comes from using the rounded slope b = 1.33. Carrying the exact slope b = 4/3 through the calculation gives a = 0.5.
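The hand calculation rounds b to 1.33 before computing a, which slightly shifts the intercept. As a cross-check, the Python sketch below repeats Question 1 with exact fractions from the standard library; the variable names are illustrative assumptions.

```python
from fractions import Fraction

xs = [2, 3, 5, 8]
ys = [3, 6, 5, 12]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)              # 18, 26
sum_x2 = sum(x * x for x in xs)              # 102
sum_xy = sum(x * y for x, y in zip(xs, ys))  # 145

# Exact slope and intercept, with no intermediate rounding
b = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(b, a)  # prints 4/3 1/2
```

With exact arithmetic the line is y = 1/2 + (4/3)x, so the 0.515 intercept obtained by hand reflects the rounding of b rather than a different fit.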

FAQ (Frequently Asked Questions)

Question 1) What is a Linear Regression with an Example?

Answer) Linear regression quantifies the relationship between one or more predictor variable(s) and one outcome variable. ... For example, it can be used to quantify the relative impacts of age, gender, and diet (the predictor variables) on height (the outcome variable).

Question 2) How do you Calculate Linear Regression?

Answer) The linear regression equation has the form Y = a + bX, where Y is the dependent variable (the variable plotted on the Y-axis), X is the independent variable (the variable plotted on the X-axis), b is the slope of the line, and a is the y-intercept.

Question 3) How do you Calculate the Y-Intercept?

Answer) Using the "slope-intercept" form of the line's equation (y = mx + b), you solve for b (which is the y-intercept you're looking for). You need to substitute the known slope for the variable m, and substitute the known point's coordinates for x and y, respectively, in the slope-intercept equation. That will help you find b.
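Rearranging y = mx + b gives b = y − mx, which is simple enough to sketch in Python. Note that in this slope-intercept notation b is the intercept and m is the slope, whereas in y = a + bx the intercept is a. The function name `y_intercept` and the numbers are hypothetical.

```python
def y_intercept(m, x, y):
    """Solve y = m*x + b for b, given slope m and one point (x, y) on the line."""
    return y - m * x


# Hypothetical example: a line of slope 2 through the point (3, 7)
print(y_intercept(2, 3, 7))  # prints 1
```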

Question 4) What is a Regression Model Example?

Answer) One example is a simple linear regression plot for the amount of rainfall. Regression analysis is also used in statistics to find trends in data. For example, you might guess that there is a connection between how much you eat and how much you weigh; regression analysis can help you quantify that connection.