
For a bivariate data set, $n = 25$, \[\sum x = 125\], \[\sum {{x^2}} = 650\], \[\sum y = 100\], \[\sum {{y^2}} = 460\], \[\sum {xy} = 508\]. It was observed that two pairs of values of $\left( {x,y} \right)$ were copied as $\left( {6,14} \right)$ and $\left( {8,6} \right)$ instead of $\left( {8,12} \right)$ and $\left( {6,8} \right)$. The correct correlation coefficient is
A) $0.667$
B) $0.87$
C) $ - 0.25$
D) $0.356$

Answer
Hint:
Here, we will first find the correct summations by subtracting the contributions of the wrongly copied values and adding those of the correct ones. Then, by substituting the corrected summations in the formula of the Pearson correlation coefficient, we will find the required answer.

Formula Used:
$r = \dfrac{{n\sum {xy} - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left[ {n\sum {{x^2}} - {{\left( {\sum x } \right)}^2}} \right]\left[ {n\sum {{y^2}} - {{\left( {\sum y } \right)}^2}} \right]} }}$, where $r$ is the correlation coefficient.

Complete step by step solution:
According to the question, the total number of observations is $n = 25$.
We are also given the summations \[\sum x = 125\], \[\sum {{x^2}} = 650\], \[\sum y = 100\], \[\sum {{y^2}} = 460\] and \[\sum {xy} = 508\].
It was observed that two pairs of values of $\left( {x,y} \right)$ were copied as $\left( {6,14} \right)$ and $\left( {8,6} \right)$ instead of $\left( {8,12} \right)$ and $\left( {6,8} \right)$.
Hence, for each summation, we subtract the contribution of the wrongly copied pairs and add the contribution of the correct pairs:
Corrected \[\sum x = 125 - 6 - 8 + 8 + 6 = 125\]
Corrected \[\sum {{x^2}} = 650 - {\left( 6 \right)^2} - {\left( 8 \right)^2} + {\left( 8 \right)^2} + {\left( 6 \right)^2} = 650\]
Corrected \[\sum y = 100 - 14 - 6 + 12 + 8 = 100\]
Corrected \[\sum {{y^2}} = 460 - {\left( {14} \right)^2} - {\left( 6 \right)^2} + {\left( {12} \right)^2} + {\left( 8 \right)^2} = 460 - 196 - 36 + 144 + 64\]
$ \Rightarrow $ Corrected \[\sum {{y^2}} = 436\]
Corrected \[\sum {xy} = 508 - \left( 6 \right)\left( {14} \right) - \left( 8 \right)\left( 6 \right) + \left( 8 \right)\left( {12} \right) + \left( 6 \right)\left( 8 \right)\]
$ \Rightarrow $ Corrected \[\sum {xy} = 508 - 84 - 48 + 96 + 48 = 520\]
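The corrections above can be checked with a short Python sketch. The helper name `correct_sums` and the dictionary keys are illustrative choices, not part of the original solution; the numbers are the ones given in the question.

```python
def correct_sums(sums, wrong_pairs, right_pairs):
    """Return a copy of `sums` with the wrong pairs removed and the right pairs added.

    `sums` is a dict with keys "x", "y", "x2", "y2", "xy" (hypothetical layout).
    """
    fixed = dict(sums)
    # Subtract the contribution of each wrong pair, then add each correct pair.
    for sign, pairs in ((-1, wrong_pairs), (+1, right_pairs)):
        for x, y in pairs:
            fixed["x"] += sign * x
            fixed["y"] += sign * y
            fixed["x2"] += sign * x * x
            fixed["y2"] += sign * y * y
            fixed["xy"] += sign * x * y
    return fixed


given = {"x": 125, "y": 100, "x2": 650, "y2": 460, "xy": 508}
corrected = correct_sums(given, [(6, 14), (8, 6)], [(8, 12), (6, 8)])
print(corrected)
# {'x': 125, 'y': 100, 'x2': 650, 'y2': 436, 'xy': 520}
```

Only $\sum {{y^2}}$ and $\sum {xy}$ change, because the wrongly copied $x$ values are the same two numbers as the correct ones, just attached to different pairs.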
Now, according to the Pearson Product Moment Correlation formula, we know that,
$r = \dfrac{{n\sum {xy} - \left( {\sum x } \right)\left( {\sum y } \right)}}{{\sqrt {\left[ {n\sum {{x^2}} - {{\left( {\sum x } \right)}^2}} \right]\left[ {n\sum {{y^2}} - {{\left( {\sum y } \right)}^2}} \right]} }}$
Where:
$n = $ the number of pairs of scores
$\sum {xy} = $ the sum of the products of paired scores
$\sum x = $ the sum of the $x$ scores
$\sum y = $ the sum of the $y$ scores
$\sum {{x^2}} = $ the sum of the squared $x$ scores
$\sum {{y^2}} = $ the sum of the squared $y$ scores
Here, substituting the corrected values in this formula, we get,
$r = \dfrac{{25\left( {520} \right) - \left( {125} \right)\left( {100} \right)}}{{\sqrt {\left[ {25\left( {650} \right) - {{\left( {125} \right)}^2}} \right]\left[ {25\left( {436} \right) - {{\left( {100} \right)}^2}} \right]} }}$
$ \Rightarrow r = \dfrac{{13000 - 12500}}{{\sqrt {\left[ {16250 - 15625} \right]\left[ {10900 - 10000} \right]} }}$
Solving further, we get
$ \Rightarrow r = \dfrac{{500}}{{\sqrt {\left[ {625} \right]\left[ {900} \right]} }}$
Now, taking the square root of the terms present in the denominator,
$ \Rightarrow r = \dfrac{{500}}{{25 \times 30}}$
$ \Rightarrow r = \dfrac{{50}}{{25 \times 3}} = \dfrac{2}{3}$
Now, converting this fraction into a decimal, we get,
$ \Rightarrow r = \dfrac{2}{3} \approx 0.667$
Therefore, the correct correlation coefficient is approximately $0.667$.

Hence, option A is the correct answer.
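As a cross-check, the substitution into the Pearson formula can be sketched in Python. The function name `pearson_r` and its parameter names are assumptions made for illustration; the inputs are the corrected summations from the solution.

```python
import math


def pearson_r(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson correlation coefficient computed from summary sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Corrected summations: sum_y2 = 436 and sum_xy = 520.
r = pearson_r(n=25, sum_x=125, sum_y=100, sum_x2=650, sum_y2=436, sum_xy=520)
print(round(r, 3))  # 0.667
```

Calling the same function with the uncorrected values ($\sum {{y^2}} = 460$, $\sum {xy} = 508$) gives a noticeably smaller coefficient, which shows how much two miscopied pairs can distort the result.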

Note:
The Pearson correlation coefficient $r$ measures the strength and direction of the linear relationship between two variables. The stronger the linear association between the two variables, the closer $r$ lies to 1 or $ - 1$. A value of exactly 1 or $ - 1$ signifies that all the data points lie on a straight line, with positive or negative slope respectively. The closer $r$ lies to 0, the weaker the linear relationship between the variables.
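The behaviour at the extremes can be illustrated with a small Python sketch that applies the same formula to raw data. The function `pearson_r_from_data` and the sample points are made up for illustration and do not come from the question.

```python
import math


def pearson_r_from_data(xs, ys):
    """Pearson correlation coefficient computed from paired observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


xs = [1, 2, 3, 4, 5]
# Points on a rising line y = 2x + 1 and on a falling line y = -2x + 1.
r_up = pearson_r_from_data(xs, [2 * x + 1 for x in xs])
r_down = pearson_r_from_data(xs, [-2 * x + 1 for x in xs])
print(r_up, r_down)
```

Here `r_up` comes out as 1 and `r_down` as $ - 1$ (up to floating-point rounding), since every point lies exactly on a straight line in each case.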