Covariance Formula

What is Covariance in Statistics?

In mathematics and statistics, covariance is a measure of the relationship between two random variables: it evaluates how much, and in which direction, the variables change together. The concept is closely related to variance, but where variance tells you how a single variable varies, covariance tells you how two variables vary together.

Covariance can therefore be seen as an extension of variance to two variables. Note in particular that the covariance of a variable with itself equals the variance of that variable.

Covariance can take any positive or negative value. The sign is interpreted as follows:

  • Positive Covariance: It indicates that two variables will tend to move in the same direction.

  • Negative Covariance: It indicates that two variables will tend to move in opposite directions.

Covariance Formula in Statistics

Definition: Suppose X and Y are random variables with means µX and µY. Given n paired observations (xi, yi), the sample covariance of X and Y is defined as Cov(x,y) = \[\frac{\sum_{i=1}^{n}(x_{i} - \bar{x})(y_{i} - \bar{y})}{n-1}\], where

xi = the values of the X-variable

yi = the values of the Y-variable

x̄ = Mean or the average of the X variable

ȳ = Mean or the average of the Y variable

n = Number of data points

Cov(x,y) = Covariance of variables x and y

In this covariance formula, the covariance of the two variables x and y equals the sum of the products of each value's deviation from the mean of its variable, divided by one less than the total number of data points. The x and y with a bar on top represent the means of each variable.
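The description above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `sample_covariance` and the small data set are made up for this example.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations (divides by n - 1)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    x_bar = sum(xs) / n  # mean of x
    y_bar = sum(ys) / n  # mean of y
    # Sum of the products of the paired deviations from the means
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1)


# Hypothetical data, chosen only for illustration: y rises with x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
cov_xy = sample_covariance(xs, ys)
print(cov_xy)  # positive, since the two variables move together
```

Because both variables increase together in this data, the result is positive (the sum of products is 10, divided by n - 1 = 3).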

Properties of Covariance 

  1. Cov[X, c] = 0 for any constant c.

  2. Cov[aX, Y] = a · Cov[X, Y] and Cov[X, aY] = a · Cov[X, Y] for any constant a.

  3. Cov[X, Y] = Cov[Y, X]

  4. Cov[X, X] = Var[X]

  5. Bilinearity (a.k.a. distributive property): Cov[X + Y, Z] = Cov[X, Z] + Cov[Y, Z] and Cov[X, Y + Z] = Cov[X, Y] + Cov[X, Z]
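These properties also hold for the sample covariance of a data set, so they can be checked numerically. The sketch below uses hypothetical data, and the helper names `cov` and `var` are assumptions made for this example; a numerical check on one data set illustrates the properties but does not prove them.

```python
import math


def cov(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def var(xs):
    """Sample variance, computed independently of cov()."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


# Hypothetical data, chosen only for illustration
x = [2.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 7.0]
z = [5.0, 1.0, 4.0, 2.0]
a = 3.0   # scaling constant
c = 10.0  # a constant "variable"

x_plus_y = [xi + yi for xi, yi in zip(x, y)]
tol = 1e-9

checks = {
    "1. constant": math.isclose(cov(x, [c] * len(x)), 0.0, abs_tol=tol),
    "2. scaling": math.isclose(cov([a * v for v in x], y), a * cov(x, y), abs_tol=tol),
    "3. symmetry": math.isclose(cov(x, y), cov(y, x), abs_tol=tol),
    "4. variance": math.isclose(cov(x, x), var(x), abs_tol=tol),
    "5. bilinearity": math.isclose(cov(x_plus_y, z), cov(x, z) + cov(y, z), abs_tol=tol),
}

for name, ok in checks.items():
    print(name, ok)
```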


What is the Formula for Covariance? (Population and Sample Covariance Formula)

Population  Formula for Covariance

Cov(x,y) = \[\frac{\sum(x_{i} - \bar{x})(y_{i} - \bar{y})}{N}\] 

Sample Covariance Formula 

Cov(x,y) = \[\frac{\sum(x_{i} - \bar{x})(y_{i} - \bar{y})}{N-1}\] 


Notations in the Formula for Covariance

  • xi= data value of x

  • yi = data value of y

  • x̄ = mean of x

  • ȳ = mean of y

  • N = number of data values.
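The only difference between the two formulas is the divisor. A minimal sketch, assuming a hypothetical helper `covariance` with a `sample` flag (both names are made up for this example), applied to the data set from the worked question in this article:

```python
def covariance(xs, ys, sample=True):
    """Covariance of paired data.

    sample=True  divides by N - 1 (sample covariance)
    sample=False divides by N     (population covariance)
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


x = [98, 87, 90, 85, 95, 75]
y = [15, 12, 10, 10, 16, 7]

pop_cov = covariance(x, y, sample=False)
samp_cov = covariance(x, y, sample=True)
print(round(pop_cov, 2), round(samp_cov, 2))
```

For the same data, the sample covariance is larger in magnitude than the population covariance by a factor of N/(N - 1), and the gap shrinks as N grows.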

Key Takeaways (Covariance in Finance)

  • Covariance is a statistical tool that can be used to determine the relationship between the movements of any two asset prices.

  • When two stocks tend to move together, they have a positive covariance; when they tend to move in opposite directions, the covariance is negative.

  • Covariance is a significant tool in modern portfolio theory, where it is used to decide which securities to put in a portfolio.

  • Risk and volatility can also be reduced in a portfolio by pairing assets that have a negative covariance.
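The last point can be illustrated with the standard identity for the variance of a two-asset portfolio with weights w and 1 - w: Var = w²·Var(A) + (1 - w)²·Var(B) + 2·w·(1 - w)·Cov(A, B). The sketch below is only an illustration: the return series and the 50/50 weighting are hypothetical numbers invented for this example, not market data.

```python
def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical periodic returns of two assets that tend to move oppositely
returns_a = [0.02, -0.01, 0.03, -0.02]
returns_b = [-0.005, 0.02, -0.01, 0.015]
w = 0.5  # assumed weight of asset A; asset B gets 1 - w

var_a = sample_covariance(returns_a, returns_a)  # Cov[X, X] = Var[X]
var_b = sample_covariance(returns_b, returns_b)
cov_ab = sample_covariance(returns_a, returns_b)

# Two-asset portfolio variance from the identity above
portfolio_var = w**2 * var_a + (1 - w) ** 2 * var_b + 2 * w * (1 - w) * cov_ab

# Cross-check: variance of the blended return series computed directly
blended = [w * ra + (1 - w) * rb for ra, rb in zip(returns_a, returns_b)]
direct_var = sample_covariance(blended, blended)

print(cov_ab < 0, portfolio_var < min(var_a, var_b))
```

With these numbers the covariance term is negative, so the portfolio variance comes out lower than the variance of either asset on its own.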

Questions to be Solved

Q1: Compute the value of the covariance, i.e. Cov(x,y), for the given data set.

x    98   87   90   85   95   75

y    15   12   10   10   16    7


Solution: First, let's find the mean of each variable.

x̄ = (98+87+90+85+95+75)/6 = 530/6 = 88.33

ȳ = (15+12+10+10+16+7)/6 = 70/6 = 11.67

Now, we subtract the respective mean from each value and then multiply the paired differences together.

(x - x̄)                    (y - ȳ)                    Product of Both

98 - 88.33 = 9.67          15 - 11.67 = 3.33          32.20

87 - 88.33 = -1.33         12 - 11.67 = 0.33          -0.44

90 - 88.33 = 1.67          10 - 11.67 = -1.67         -2.79

85 - 88.33 = -3.33         10 - 11.67 = -1.67         5.56

95 - 88.33 = 6.67          16 - 11.67 = 4.33          28.88

75 - 88.33 = -13.33        7 - 11.67 = -4.67          62.25


The next step is to add all the products together, which is 125.66.

Now, divide the above value by (n - 1), i.e. by (6 - 1) = 5.

Therefore, Cov(x,y) = 125.66/5 = 25.132
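The steps of this solution can be reproduced with a short Python sketch. The variable names are chosen for this example. Because the code keeps the means unrounded, the last digits can differ slightly from the hand computation, which rounds each intermediate value to two decimals.

```python
x = [98, 87, 90, 85, 95, 75]
y = [15, 12, 10, 10, 16, 7]
n = len(x)

# Step 1: mean of each variable
x_bar = sum(x) / n
y_bar = sum(y) / n

# Step 2: product of the paired deviations from the means
products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]

# Step 3: add the products and divide by n - 1
total = sum(products)
cov_xy = total / (n - 1)

print(round(x_bar, 2), round(y_bar, 2))  # 88.33 11.67
print(round(total, 2))                   # 125.67
print(round(cov_xy, 2))                  # 25.13
```

The unrounded sum of products is 125.67 rather than 125.66, so the covariance is about 25.13, in agreement with the hand-computed 25.132 up to rounding.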

FAQ (Frequently Asked Questions)

Question 1) What is Covariance Used for?

Answer) Covariance is a measure of how changes in one variable are associated with changes in a second variable. Covariance specifically measures the degree to which two variables are linearly associated. However, the concept of covariance is also often used informally as a general measure of how monotonically related two variables are.

Question 2) What is the difference between Covariance and Correlation?

Answer) In simple words, both terms measure the relationship and the dependency between two variables. Covariance indicates the direction of the linear relationship between two variables, say x and y, whereas correlation measures both the strength and the direction of that linear relationship.

Question 3) What does a Negative Covariance Mean?

Answer) Negative covariance indicates that the two variables tend to move in opposite directions: when one is above its mean, the other tends to be below its mean.

Question 4) Can the Covariance be Greater than 1?

Answer) Yes. Covariance values are not standardized, so the covariance can range from negative infinity to positive infinity. The value obtained for a perfect linear relationship therefore depends on the scale of the data, which makes it difficult to judge the strength of the relationship from the covariance alone.

Question 5) Why is Correlation Preferred Over Covariance?

Answer) Correlation is generally preferred over covariance as a measure of the relationship between two variables because it is unaffected by changes in location and scale, so it can be used to compare the relationship across different pairs of variables.
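The effect of location and scale can be seen in a small sketch. It assumes the usual Pearson correlation, i.e. the covariance divided by the product of the two standard deviations. The helper names are made up for this example, and the rescaling (multiply by 100, add 7) is an arbitrary choice. The data is the set from the worked question above.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


x = [98, 87, 90, 85, 95, 75]
y = [15, 12, 10, 10, 16, 7]

# Arbitrary change of scale (x100) and location (+7) applied to x
x_rescaled = [100 * v + 7 for v in x]

cov_before = sample_covariance(x, y)
cov_after = sample_covariance(x_rescaled, y)
corr_before = correlation(x, y)
corr_after = correlation(x_rescaled, y)

print(cov_after / cov_before)   # covariance scales with the data (about 100)
print(corr_before, corr_after)  # correlation is unchanged
```

The covariance grows by the scaling factor while the correlation stays the same, and the correlation always lies between -1 and 1, which is why it is the easier quantity to compare across data sets.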