# Standard Deviation

## What is Standard Deviation?

Standard deviation measures how far the values in a data set are dispersed from their mean. It is expressed in the same units as the data, is never negative, and is denoted by σ (sigma). Because it takes every observation into account, it is generally preferred over other measures of dispersion.

The standard deviation is the square root of the variance, which is obtained from each data point's deviation from the arithmetic mean. Data points that lie far from the mean indicate a higher deviation within the data set, so the more spread out the data, the higher the standard deviation. The formula for the standard deviation of a sample is:

$s = \sqrt{\frac{\sum (x_{i}-\overline{x})^{2}}{n-1}}$

where:

$x_{i}$ = value of the i-th point in the data set

$\overline{x}$ = mean value of the data set

$n$ = number of data points in the data set
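As a minimal sketch, the formula can be written out step by step in Python. The function name `sample_std` and the data values are illustrative assumptions, not part of the original notes.

```python
import math

def sample_std(values):
    """Sample standard deviation: the sum of squared deviations is divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean = 5
s = sample_std(data)
print(round(s, 3))
```

The standard library function `statistics.stdev` computes the same quantity and is the more practical choice outside of a teaching example.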

### Properties of Standard Deviation

• Standard deviation measures only the dispersion, or spread, of a data set around its mean value.

• Standard deviation is never negative.

• It determines the dispersion or variation that exists from the average value.

• Standard deviation is very sensitive to outliers. A single outlier can distort the picture of dispersion.

• For data sets with approximately the same mean value, the greater the dispersion or spread, the greater the standard deviation.

• Standard deviation is zero when all the values in a data set are the same.

• When analyzing normally distributed data, the standard deviation is used together with the mean to calculate data intervals.

If $\overline{x}$ = mean, $S$ = standard deviation, and $x$ = a value in the data set, then

around 68% of the data lies in the interval $\overline{x} - S < x < \overline{x} + S$,

around 95% of the data lies in the interval $\overline{x} - 2S < x < \overline{x} + 2S$,

around 99.7% of the data lies in the interval $\overline{x} - 3S < x < \overline{x} + 3S$.
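These percentages can be checked against the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `coverage` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage(k):.1%}")
```

The printed values are approximately 68.3%, 95.4%, and 99.7%, which is why the third interval is usually quoted as 99.7% rather than 99%.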

### Standard Deviation Calculation

Before calculating the standard deviation, it is essential to distinguish the three types of data series. These are:

#### 1. Individual Series

A single column lists the observations.

| Score |
|-------|
| 28 |
| 34 |
| 48 |
| 69 |
| 73 |
| 78 |
| 84 |
| 89 |
| 93 |

#### 2. Discrete Series

Two columns represent the data. One column shows the observations, while the other shows the frequency corresponding to each observation.

| Marks | Number of Students |
|-------|--------------------|
| 30 | 5 |
| 40 | 6 |
| 50 | 4 |
| 60 | 9 |
| 70 | 10 |
| 80 | 8 |

#### 3. Frequency Distribution

It has two columns, one for the observations and one for the corresponding frequencies. Here the observations are further grouped into intervals or classes.

| Age | Number of People |
|-------|------------------|
| 20-30 | 34 |
| 30-40 | 45 |
| 40-50 | 30 |
| 50-60 | 20 |
| 60-70 | 15 |
| 70-80 | 10 |

### Sigma for Individual Series

The standard deviation of an individual series can be calculated by three methods:

#### A Direct Method to Calculate Standard Deviation

First, calculate the arithmetic mean using the formula $\overline{X} = \frac{\sum X}{N}$. Then calculate the deviation of each observation from the mean using $D = X - \overline{X}$.

Next, the deviations D are squared and summed, and the sum is divided by the total number of observations N. The square root of this value is the standard deviation. Note that dividing by N gives the population standard deviation, whereas the formula with n - 1 given earlier applies to a sample.

The formula is:

$\sigma = \sqrt{\frac{\sum D^{2}}{N}}$

where:

$D$ = deviation of an item from the mean, calculated as $D = X - \overline{X}$

$N$ = number of observations
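The direct method can be traced in a few lines of Python. This sketch uses the Score values from the individual series example; the variable names are illustrative.

```python
import math

scores = [28, 34, 48, 69, 73, 78, 84, 89, 93]  # Score values from the individual series example

n = len(scores)                                   # N, number of observations
mean = sum(scores) / n                            # arithmetic mean, sum(X) / N
deviations = [x - mean for x in scores]           # D = X - mean
sum_sq_dev = sum(d ** 2 for d in deviations)      # sum of D squared
sigma = math.sqrt(sum_sq_dev / n)                 # square root of sum(D^2) / N

print(round(mean, 2), round(sigma, 2))
```

For real work, `statistics.pstdev(scores)` from the standard library returns the same population standard deviation.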

#### Short-cut Method

In this method, an arbitrary value A is assumed as the mean, and the deviations are measured from it: $D = X - A$. The assumed value is usually chosen near the middle of the range of values. The short-cut formula is:

$\sigma = \sqrt{\frac{\sum D^{2}}{N} - \left(\frac{\sum D}{N}\right)^{2}}$
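The sketch below illustrates that the short-cut formula gives the same result whichever assumed mean is chosen, and that it agrees with the direct method. The function names and the two assumed means (73 and 60) are illustrative choices.

```python
import math

def short_cut_std(values, assumed_mean):
    """Short-cut method: deviations D are taken from an assumed mean A."""
    n = len(values)
    deviations = [x - assumed_mean for x in values]      # D = X - A
    mean_of_squares = sum(d ** 2 for d in deviations) / n  # sum(D^2) / N
    mean_deviation = sum(deviations) / n                   # sum(D) / N
    return math.sqrt(mean_of_squares - mean_deviation ** 2)

def direct_std(values):
    """Direct method, for comparison: deviations are taken from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

scores = [28, 34, 48, 69, 73, 78, 84, 89, 93]
sigma_a = short_cut_std(scores, assumed_mean=73)
sigma_b = short_cut_std(scores, assumed_mean=60)
direct = direct_std(scores)
print(round(sigma_a, 4), round(sigma_b, 4), round(direct, 4))
```

The correction term $(\sum D / N)^{2}$ is what compensates for A not being the true mean; it is zero when A equals the mean.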

#### Step-Deviation Method

It is a simplified form of the short-cut method. Here, we select a common factor C among the deviations. Dividing all the deviations by C makes them smaller, which simplifies the calculation. The formula is:

$\sigma = \sqrt{\frac{\sum D'^{2}}{N} - \left(\frac{\sum D'}{N}\right)^{2}} \times C$

where:

$D'$ = step deviation of an observation from the assumed mean A, calculated as $D' = \frac{X - A}{C}$

$C$ = common factor chosen
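A small sketch of the step-deviation method follows. The observations, the assumed mean A = 50, and the common factor C = 10 are assumptions picked so that every step deviation is a small whole number.

```python
import math

def step_deviation_std(values, assumed_mean, common_factor):
    """Step-deviation method: D' = (X - A) / C, with the result scaled back by C."""
    n = len(values)
    steps = [(x - assumed_mean) / common_factor for x in values]  # D'
    mean_of_squares = sum(d ** 2 for d in steps) / n               # sum(D'^2) / N
    mean_step = sum(steps) / n                                     # sum(D') / N
    return math.sqrt(mean_of_squares - mean_step ** 2) * common_factor

observations = [30, 40, 50, 60, 70, 80]  # illustrative individual series
sigma = step_deviation_std(observations, assumed_mean=50, common_factor=10)
print(round(sigma, 3))
```

Here the step deviations are -2, -1, 0, 1, 2, 3, which are far easier to square by hand than the raw deviations.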

### Sigma For Discrete Series

There are two ways to calculate standard deviation in a discrete series:

#### Direct Method

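A discrete series adds a frequency column, so each squared deviation is weighted by its frequency f. The direct method formula is:

$\sigma = \sqrt{\frac{\sum fD^{2}}{N}}$

where $D = X - \overline{X}$ is the deviation from the mean and $N = \sum f$ is the total frequency.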

#### Short-Cut Method

$\sigma = \sqrt{\frac{\sum fD^{2}}{N} - \left(\frac{\sum fD}{N}\right)^{2}}$

where $D = X - A$ is the deviation from an assumed mean A.
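Both discrete-series methods can be sketched with the Marks data from the discrete series example. The assumed mean A = 60 is an arbitrary choice for illustration.

```python
import math

marks = [30, 40, 50, 60, 70, 80]   # observations X
students = [5, 6, 4, 9, 10, 8]     # frequencies f

N = sum(students)                  # total frequency

# Direct method: deviations from the actual (frequency-weighted) mean.
mean = sum(f * x for x, f in zip(marks, students)) / N
direct = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(marks, students)) / N)

# Short-cut method: deviations from an assumed mean A.
A = 60
sum_fd = sum(f * (x - A) for x, f in zip(marks, students))
sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(marks, students))
short_cut = math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2)

print(N, round(direct, 3), round(short_cut, 3))
```

Both methods return the same value; the short-cut method simply avoids working with a fractional mean.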

### Sigma For Frequency Distribution

The standard deviation of a frequency distribution series can be calculated by the following methods:

#### Direct Method

The direct method for a frequency distribution is very similar to the one used for a discrete series. The only difference is the value used for each observation. Here, the mid-value of each class is found by adding the upper and lower limits of the class and dividing the sum by 2. This mid-value is then used as the observation in the calculation. The formula is:

$\sigma = \sqrt{\frac{\sum fD^{2}}{N}}$

In this calculation, D is the deviation of an item from the mean value, calculated as

$D = X_{i} - \overline{X}$

$f$ = frequency corresponding to each observation

$N$ = sum of the frequencies

#### Step-Deviation Method

The step-deviation method is a short-cut way to determine the standard deviation. The formula is:

$\sigma = \sqrt{\frac{\sum fD'^{2}}{N} - \left(\frac{\sum fD'}{N}\right)^{2}} \times C$

In this calculation, D' is the step deviation of an observation from the assumed value A, calculated as $D' = \frac{X_{i} - A}{C}$

$N$ = sum of the frequencies

$C$ = common factor chosen
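As a sketch, both frequency-distribution methods can be applied to the Age data from the frequency distribution example. The assumed mean A = 45 and common factor C = 10 (the class width) are illustrative choices.

```python
import math

classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]  # age classes
people = [34, 45, 30, 20, 15, 10]                                       # frequencies f

# Mid-value of each class: (lower limit + upper limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
N = sum(people)

# Direct method, using the mid-values as the observations.
mean = sum(f * x for x, f in zip(midpoints, people)) / N
direct = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(midpoints, people)) / N)

# Step-deviation method: D' = (X - A) / C.
A, C = 45, 10
steps = [(x - A) / C for x in midpoints]
sum_fd = sum(f * d for d, f in zip(steps, people))
sum_fd2 = sum(f * d ** 2 for d, f in zip(steps, people))
step_dev = math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2) * C

print(N, round(direct, 3), round(step_dev, 3))
```

Choosing C equal to the class width turns every step deviation into a whole number (-2, -1, 0, 1, 2, 3 here), which is the main convenience of the method.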

### Did You Know?

Without the standard deviation, two sets of data cannot be compared effectively. Suppose two data sets have the same average. Does that imply that the data sets are exactly the same? No. For example, the data sets 199, 200, 201 and 0, 200, 400 both have an average of 200, yet their standard deviations differ. The first set has a small standard deviation (s = 1) compared with the second (s = 200).
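The comparison can be reproduced with Python's standard library. Note that `statistics.stdev` uses the sample formula (division by n - 1), which is the form that yields 1 and 200 for these two sets.

```python
import statistics

set_a = [199, 200, 201]
set_b = [0, 200, 400]

mean_a, mean_b = statistics.mean(set_a), statistics.mean(set_b)
s_a, s_b = statistics.stdev(set_a), statistics.stdev(set_b)  # sample standard deviations

print(mean_a, s_a)
print(mean_b, s_b)
```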

1. What does SD or standard deviation indicate?

The standard deviation (SD) indicates the average amount of variability in a data set. It tells us, on average, how far each value lies from the mean. It is widely preferred over other measures of dispersion and can never be negative. The symbol σ (sigma) denotes standard deviation.

In normal distributions, a higher standard deviation implies that the values lie further away from the mean. Similarly, a lower standard deviation means the values are clustered close to the arithmetic mean.

2. What is the difference between the variance and the standard deviation?

The difference between the standard deviation and the variance is as follows:

Variance is the average of the squared deviations from the mean, whereas standard deviation is the square root of the variance. Both measurements indicate variability in a distribution, but their units differ:

• The standard deviation (SD) is expressed in the same units as the original values (for example, meters, grams, or minutes).

• The variance is expressed in squared units, such as square meters.

Although the units of variance are harder to interpret at first, variance is important in statistical tests.
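The relationship between the two measures can be illustrated briefly. The heights below are hypothetical values in meters, so the variance comes out in square meters and the standard deviation in meters.

```python
import math
import statistics

heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]            # hypothetical heights in meters

variance_m2 = statistics.pvariance(heights_m)    # population variance, in square meters
sd_m = statistics.pstdev(heights_m)              # population standard deviation, in meters

print(round(variance_m2, 4), round(sd_m, 4))
print(math.isclose(sd_m, math.sqrt(variance_m2)))
```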