Karl Pearson’s Coefficient of Correlation

Before delving into the details of Karl Pearson's Coefficient of Correlation, it helps to brush up on the fundamental concepts of correlation and its coefficient in general.

A correlation coefficient is a measure of the relationship between two quantitative variables, X and Y. It serves as a statistical tool for analysing and measuring the degree of the linear relationship between the variables.


For example, a change in the monthly income (X) of a person is usually accompanied by a change in their monthly expenditure (Y). Correlation measures how closely a change in one variable is associated with a change in the other.

Fun Fact: Correlation was developed in 1885 by Francis Galton! 

Depending on the direction of the relationship between variables, correlation can be of three types, namely –

  1. Positive Correlation (0 to +1) – In this case, the direction of change between X and Y is the same. For instance, an increase in the duration of workout leads to an increase in the number of calories one burns.

  2. Negative Correlation (0 to -1) – Here, X and Y change in opposite directions. For example, when the price of a commodity increases, its demand decreases.

  3. Zero Correlation (0) – There is no linear relationship between the variables in this case. For instance, an increase in height has no impact on one’s intelligence.

Now that we have refreshed our memory of these basics, let’s move on to Karl Pearson’s Coefficient of Correlation.


What is Karl Pearson’s Coefficient of Correlation?

This method is also known as the Product Moment Correlation Coefficient and was developed by Karl Pearson. It is one of the three most widely used methods of measuring correlation, alongside the Scatter Diagram and Spearman’s Rank Correlation.


The Karl Pearson correlation coefficient method is quantitative and gives a numerical value for the strength of the linear relationship between X and Y. This correlation coefficient is represented as ‘r’.


The Karl Pearson Coefficient of Correlation formula is expressed as -


\[r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n \Sigma x^{2} - (\Sigma x)^{2}][n \Sigma y^{2} - (\Sigma y)^{2}]}}\]


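In this formula,

n is the number of paired observations.

Σx and Σy are the sums of the X and Y values.

Σxy is the sum of the products of each pair of X and Y values.

Σx² and Σy² are the sums of the squared X and Y values.

The means of the X and Y variables, which the methods that follow rely on, are written \[\bar{X}\] and \[\bar{Y}\].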
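The direct formula above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `pearson_direct` and the sample data are assumptions made for this example and are not part of the original text.

```python
from math import sqrt

def pearson_direct(xs, ys):
    """Pearson's r computed from raw sums (the direct formula)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    # A constant series makes the denominator zero, so r is undefined there.
    return numerator / denominator

# Illustrative data (hypothetical): y = 2x, so the relationship is perfectly linear.
print(pearson_direct([1, 2, 3, 4], [2, 4, 6, 8]))  # prints 1.0
```

Because every Y value here is exactly twice the matching X value, the result is +1, a perfect positive correlation.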


What Methods are Used to Calculate Karl Pearson’s Coefficient of Correlation?

The Karl Pearson coefficient can be obtained using four methods: the direct method, whose formula is given above, and the three deviation-based methods mentioned below.

  1. Actual Mean Method, which is expressed as -

\[r = \frac{\Sigma xy}{\sqrt{\Sigma x^{2} . \Sigma y^{2}}}\]


In this Karl Pearson formula, x and y denote deviations from the respective means rather than the raw values:


\[x = X - \bar{X}\]


\[y = Y - \bar{Y}\]
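As a rough sketch, the Actual Mean Method can be written in Python as follows. The function name `pearson_actual_mean` and the data are hypothetical, chosen so that both means are whole numbers and the arithmetic is easy to follow by hand.

```python
from math import sqrt

def pearson_actual_mean(xs, ys):
    """Pearson's r from deviations taken about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]   # x = X - mean of X
    dev_y = [y - mean_y for y in ys]   # y = Y - mean of Y
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)

# Hypothetical data: mean of X is 6, mean of Y is 3.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 2, 5, 4]
# Sum of xy = 16, sum of x^2 = 40, sum of y^2 = 10, so r = 16 / sqrt(400).
print(pearson_actual_mean(xs, ys))  # prints 0.8
```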

  2. Assumed Mean Method, which is expressed as -

\[r = \frac{N \Sigma dx.dy - (\Sigma dx)(\Sigma dy)}{\sqrt{N \Sigma dx^{2} - (\Sigma dx)^{2}} \sqrt{N \Sigma dy^{2} - (\Sigma dy)^{2}}}\]


In this Karl Pearson Correlation formula,

dx = X-series’ deviation from its assumed mean = (X - A)

dy = Y-series’ deviation from its assumed mean = (Y - A)

A is the assumed mean chosen for the respective series; the X-series and the Y-series generally use different values.

Σdx.dy is the sum of the products of dx and dy.

Σdx² is the sum of the squares of dx.

Σdy² is the sum of the squares of dy.

Σdx is the sum of the X-series' deviations.

Σdy is the sum of the Y-series' deviations.

N is the number of paired observations.
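The Assumed Mean Method might be sketched like this. The names `pearson_assumed_mean`, `a_x` and `a_y` (the assumed means of the two series) are hypothetical, as is the data. The point of the example is that any choice of assumed means gives the same r.

```python
from math import sqrt

def pearson_assumed_mean(xs, ys, a_x, a_y):
    """Pearson's r from deviations about assumed means a_x and a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = sqrt(n * sum_dx2 - sum_dx**2) * sqrt(n * sum_dy2 - sum_dy**2)
    return numerator / denominator

# Hypothetical data with two different choices of assumed means.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 2, 5, 4]
print(round(pearson_assumed_mean(xs, ys, a_x=4, a_y=2), 4))  # prints 0.8
print(round(pearson_assumed_mean(xs, ys, a_x=8, a_y=5), 4))  # prints 0.8
```

Rounding is used only because floating-point square roots can leave a tiny error in the last decimal place.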

  3. Step Deviation Method, which is expressed as -

\[r = \frac{\Sigma dx' dy' - \frac{\Sigma dx' \times \Sigma dy'}{n}}{\sqrt{\Sigma dx'^{2} - \frac{(\Sigma dx')^{2}}{n}} \times \sqrt{\Sigma dy'^{2} - \frac{(\Sigma dy')^{2}}{n}}}\]


In this particular Karl Pearson method,

\[dx' =\frac{dx}{C_{1}}\]

\[dy' =\frac{dy}{C_{2}}\]

C1 = common factor of the X-series deviations

C2 = common factor of the Y-series deviations

dx is the X-series’ deviation from its assumed mean, i.e. (X - A)

dy is the Y-series’ deviation from its assumed mean, i.e. (Y - A)

Σdx'dy' is the sum of the products of dx' and dy'.

Σdx'² is the sum of the squares of dx'.

Σdy'² is the sum of the squares of dy'.

Σdx' is the sum of the step deviations of the X-series.

Σdy' is the sum of the step deviations of the Y-series.

n is the number of paired observations.
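A possible Python sketch of the Step Deviation Method is shown below. The function name, the parameter names (`a_x`, `a_y` for the assumed means, `c1`, `c2` for the common factors) and the data are all assumptions made for illustration.

```python
from math import sqrt

def pearson_step_deviation(xs, ys, a_x, a_y, c1, c2):
    """Pearson's r from step deviations dx' = (X - a_x)/c1, dy' = (Y - a_y)/c2."""
    n = len(xs)
    dxs = [(x - a_x) / c1 for x in xs]
    dys = [(y - a_y) / c2 for y in ys]
    sum_dx, sum_dy = sum(dxs), sum(dys)
    sum_dxdy = sum(p * q for p, q in zip(dxs, dys))
    sum_dx2 = sum(p * p for p in dxs)
    sum_dy2 = sum(q * q for q in dys)
    numerator = sum_dxdy - (sum_dx * sum_dy) / n
    denominator = sqrt(sum_dx2 - sum_dx**2 / n) * sqrt(sum_dy2 - sum_dy**2 / n)
    return numerator / denominator

# Hypothetical data: X values share the factor 10, Y values the factor 100.
xs = [10, 20, 30, 40, 50]
ys = [100, 300, 200, 500, 400]
r = pearson_step_deviation(xs, ys, a_x=30, a_y=300, c1=10, c2=100)
print(round(r, 4))  # prints 0.8
```

Dividing by the common factors shrinks the numbers that have to be squared and summed, which is why this method is convenient for hand calculation; the value of r itself is not affected.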

Now that we have discussed all the major methods, shall we try solving a few Karl Pearson Coefficient of Correlation questions?


Task 1: Refer to the table below and find ‘r’ from the data provided. Use the Actual Mean Method to solve it.


| Price of Mango (Rs.) | 15 | 25 | 35 | 40 | 50 | 65 | 75 |
|---|---|---|---|---|---|---|---|
| Supply of Mango (units) | 2 | 5 | 6 | 8 | 9 | 10 | 14 |
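After working Task 1 by hand, one way to check the answer is a short script such as the one below. It applies the Actual Mean Method to the table's data; the variable names are hypothetical.

```python
from math import sqrt

price = [15, 25, 35, 40, 50, 65, 75]   # Price of Mango (Rs.)
supply = [2, 5, 6, 8, 9, 10, 14]       # Supply of Mango (units)

n = len(price)
mean_price = sum(price) / n            # 305 / 7
mean_supply = sum(supply) / n          # 54 / 7

# Deviations from the actual means.
x = [p - mean_price for p in price]
y = [s - mean_supply for s in supply]

sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

r_task1 = sum_xy / sqrt(sum_x2 * sum_y2)
print(round(r_task1, 3))  # prints 0.975
```

A value this close to +1 indicates a strong positive linear relationship between price and supply in this data.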


Task 2: Using the table below, find ‘r’ with the Direct Method formula for Karl Pearson’s Coefficient of Correlation.


| Age of husband | 21 | 24 | 27 | 29 | 31 | 35 | 38 |
|---|---|---|---|---|---|---|---|
| Age of wife | 19 | 21 | 25 | 26 | 29 | 32 | 34 |
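A hand-worked answer to Task 2 can be checked in the same spirit. The sketch below applies the Direct Method to the table's data; the variable names are hypothetical, and the intermediate sums are noted in comments so they can be compared with a manual calculation.

```python
from math import sqrt

husband = [21, 24, 27, 29, 31, 35, 38]  # Age of husband
wife = [19, 21, 25, 26, 29, 32, 34]     # Age of wife

n = len(husband)                                   # 7
sum_x, sum_y = sum(husband), sum(wife)             # 205 and 186
sum_xy = sum(x * y for x, y in zip(husband, wife)) # 5643
sum_x2 = sum(x * x for x in husband)               # 6217
sum_y2 = sum(y * y for y in wife)                  # 5124

numerator = n * sum_xy - sum_x * sum_y             # 1371
denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))

r_task2 = numerator / denominator
print(round(r_task2, 3))  # prints 0.995
```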


Pro Tip: Try to solve one or two Karl Pearson coefficient of correlation problems using all the methods to figure out which is the easiest and shortest method of the lot. However, make sure to be thorough with all the formulas of Karl Pearson coefficient of correlation, so that you can attempt them in your exams with greater confidence.


Once you have solved the Karl Pearson Coefficient of Correlation sums, you will be able to understand the degree of relationship between discussed variables and relate it with reality better.


Overview of the Properties of the Coefficient of Correlation

Now that we have a fair idea of Pearson’s coefficient of correlation and are familiar with its question format, let’s learn about its properties as well.


In case you are wondering, “Why should I check out the properties of the coefficient of correlation?”, note that a clear idea of them comes in handy both during exam preparation and while solving Karl Pearson Coefficient of Correlation sums. They make it easier to interpret a computed value of r and to spot an impossible result, such as a value outside the range -1 to +1.


That being said, let’s glance through these significant properties in brief –

  1. The Correlation Coefficient (r) does not have any unit.

  2. r with a positive value signifies that X and Y move in the same direction.

  3. r with a negative value indicates an inverse relation between X and Y.

  4. X and Y are said to be uncorrelated, i.e. to have no linear relationship, if the value of r is 0.

  5. r with a value close to +1 or -1 signifies a strong linear relationship between two variables.

  6. r with a value close to 0 signifies a weak linear relationship between two variables.

  7. Correlation between two variables is said to be perfect if the value of r is either +1 or -1.
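Some of these properties can be checked numerically. The sketch below uses a hypothetical helper `pearson_r` and made-up income and spending figures to show that r is unaffected by a change of units, changes sign when one variable is reversed, and equals 1 for an exactly linear increasing relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical figures, in thousands of rupees.
income = [20, 25, 30, 35, 40]
spending = [15, 19, 21, 27, 28]

base = pearson_r(income, spending)
# Property 1: expressing income in rupees instead of thousands leaves r unchanged.
rescaled = pearson_r([1000 * x for x in income], spending)
# Properties 2 and 3: reversing one variable flips the sign of r.
flipped = pearson_r(income, [-y for y in spending])
# Property 7: an exact increasing linear relationship gives r = +1.
perfect = pearson_r(income, [3 * x + 5 for x in income])

print(round(base, 4), round(rescaled, 4), round(flipped, 4), round(perfect, 4))
```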

FAQ (Frequently Asked Questions)

1. How to find Karl Pearson's Coefficient Correlation?

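You can find the coefficient of correlation with, for example, the direct formula:

\[r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{[n \Sigma x^{2} - (\Sigma x)^{2}][n \Sigma y^{2} - (\Sigma y)^{2}]}}\]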

2. How to Calculate Karl Pearson's Coefficient of Correlation?

You can use Karl Pearson's shortcut method or the direct method to calculate r. 

3. What is the alternative method of calculating the Coefficient of Correlation?

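You can use the covariance formula to compute the value of r. The formula of the covariance method is expressed as –

\[r = \frac{\mathrm{Cov}(X, Y)}{\sigma_{X} \sigma_{Y}}\]

where Cov(X, Y) is the covariance of X and Y, and σX and σY are their standard deviations.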

4. Give Examples of Positive and Negative Correlation.

For negative correlation (0 to -1) – As the cost of flight tickets increases, the demand for them decreases. For positive correlation (0 to +1) – An increase in temperature increases the demand for soft drinks and ice cream.

5. Give an Example for Zero Correlation.

An increase in the cost of mangoes and an increase in the demand for shirts are not related, and hence serve as a Karl Pearson coefficient of correlation example for zero correlation.
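The covariance method mentioned in question 3, r = Cov(X, Y) / (σX σY), can be sketched as follows. The function name and data are hypothetical, and the sketch assumes the population forms of covariance and standard deviation (dividing by n). Using the sample forms (dividing by n - 1) throughout would give the same r, because the divisors cancel.

```python
from statistics import mean, pstdev

def pearson_from_covariance(xs, ys):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    # Population covariance: average product of deviations from the means.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # pstdev is the population standard deviation, matching the covariance above.
    return cov_xy / (pstdev(xs) * pstdev(ys))

# Hypothetical data: Cov = 16/5 = 3.2, sigma_x = sqrt(8), sigma_y = sqrt(2).
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 2, 5, 4]
print(round(pearson_from_covariance(xs, ys), 4))  # prints 0.8
```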