# What is Dispersion?

The word dispersion means the distribution of things over a wide area. In statistics, dispersion is the extent to which numerical data are spread out or squeezed around an average value. In short, it describes how the data are distributed. A data set can be widely scattered or tightly clustered. For example, 0, 30, 60, 90, 120, … is widely scattered, whereas 1, 2, 2, 3, 3, 4, 4, … is tightly clustered.

Variance, standard deviation, and interquartile range are measures of dispersion. Dispersion is also called variability, scatter, or spread.

## Measures of Statistical Dispersion

As we know, dispersion is a way of describing how scattered a set of data is. A measure of dispersion is a non-negative real number that is zero if all the data are the same and increases as the data become more diverse. Measures of dispersion indicate how homogeneous or heterogeneous the data are. They also describe how much the values vary from one another.

## Nature of Measures of Dispersion

• A good measure of dispersion should be rigidly defined and based on all the observations.

• It should be easy to understand and calculate.

• It should not be unduly affected by fluctuations in the observations.

## Types of Measures of Dispersion

The choice of measure depends on the observations and on the measure of central tendency used. Below are the types of measures of dispersion:

1) Range

2) Quartile deviation

3) Mean Deviation

4) Standard Deviation

## Range

The difference between the maximum and minimum values of a series is called the range. The range gives a rough idea of how scattered the data are, but other measures of variability are needed to describe how the data spread around a measure of central tendency. Suppose two batsmen have the following minimum and maximum runs scored in a series.

Batsman A - 0 to 117

Batsman B - 40 to 60

Thus, the range of Batsman A = 117-0 = 117 whereas the range of Batsman B = 60-40 = 20.

The range of Batsman A is larger than that of Batsman B, so the scores of Batsman A are more dispersed than those of Batsman B.
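The batsman comparison can be reproduced with a minimal Python sketch. The helper name `data_range` is illustrative, and because the text gives only the lowest and highest score for each batsman, each list holds just those two values.

```python
def data_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


# Only the extreme scores are given, so each list holds just those two values.
batsman_a = [0, 117]
batsman_b = [40, 60]

print(data_range(batsman_a))  # 117
print(data_range(batsman_b))  # 20
```

Since the range looks only at the two extremes, any scores added between them would leave both results unchanged.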

## Quartile Deviation

The word quartile is derived from the word quarter, which means one-fourth. Quartiles divide a set of data into four equal parts. Every set of data has a smallest number, a largest number, and a median. The middle number between the smallest number and the median is called the first quartile (Q1). The median of the data set is called the second quartile (Q2). The middle number between the median and the largest number is called the third quartile (Q3).

Quartile deviation, also called the semi-interquartile range, is half the difference between the third and first quartiles:

Q = ½ × (Q3 – Q1)
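As a rough sketch, the quartile deviation can be computed in Python by taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data, which follows the description above. Quartile conventions differ between textbooks and libraries, so other tools may give slightly different values. The function name `quartile_deviation` is an illustrative choice.

```python
from statistics import median


def quartile_deviation(values):
    """Half the difference between Q3 and Q1.

    Q1 and Q3 are the medians of the lower and upper halves of the
    sorted data; the overall median is left out of both halves when
    the number of values is odd. Assumes at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    q1 = median(lower)
    q3 = median(upper)
    return (q3 - q1) / 2


# Tightly clustered data: Q1 = 2, Q3 = 4
print(quartile_deviation([1, 2, 2, 3, 3, 4, 4]))  # 1.0
```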

## Mean Deviation

Mean deviation is the mean of the absolute values of the differences between the values in a data set and a measure of central tendency, usually the mean or the median. It is used to understand how the data are dispersed around that measure. The mean deviation can be taken about the mean, the median, or the mode. Although the mean deviation about the mode can be easily calculated, the mean deviation about the mean and about the median are the most commonly used. Mean deviation is an improvement over the range because it uses every observation rather than only the two extremes.

The formula for mean deviation is:

Mean deviation = (Sum of absolute values of deviations) / (Number of observations)

There are three types of series for which mean deviation can be found:

1. Individual Data series - when all the data are given on an individual basis.

2. Discrete Data series - when individual data is accompanied by its frequency.

3. Continuous Data series - when the data given are not on an individual basis but a range of data along with their frequencies.

Examples:

## Individual Data series

| Items | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 |
|---|---|---|---|---|---|---|---|---|

## Discrete Data series

| Items | 20 | 40 | 60 | 80 | 100 | 120 | 140 | 160 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 9 | 6 | 1 | 4 | 0 | 12 | 7 | 5 |

## Continuous Data series

| Items | 0-5 | 5-10 | 10-20 | 20-30 | 30-40 |
|---|---|---|---|---|---|
| Frequency | 5 | 5 | 1 | 8 | 3 |
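The mean deviation about the mean for the three example series can be sketched in Python as follows. The function name `mean_deviation` is illustrative. For the continuous series, each class is represented by its midpoint, which is the usual textbook assumption for grouped data.

```python
def mean_deviation(values, freqs=None):
    """Mean absolute deviation about the arithmetic mean.

    freqs defaults to 1 for every value, which gives an individual series.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n


# Individual data series
individual = [22, 24, 26, 28, 30, 32, 34, 36]
print(mean_deviation(individual))  # 4.0

# Discrete data series: each item comes with its frequency
discrete_items = [20, 40, 60, 80, 100, 120, 140, 160]
discrete_freqs = [9, 6, 1, 4, 0, 12, 7, 5]
print(mean_deviation(discrete_items, discrete_freqs))

# Continuous data series: class midpoints stand in for the classes
class_limits = [(0, 5), (5, 10), (10, 20), (20, 30), (30, 40)]
class_freqs = [5, 5, 1, 8, 3]
midpoints = [(low + high) / 2 for low, high in class_limits]
print(mean_deviation(midpoints, class_freqs))
```

The same function serves all three series because an individual series is simply a discrete series in which every frequency is 1.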

## Standard Deviation

The standard deviation is the most widely preferred measure of dispersion. Deviations are taken from the arithmetic mean and squared, so the standard deviation is never negative. Like mean deviation, standard deviation can be calculated for all three types of series: individual, discrete, and continuous data series. Standard deviation is denoted by sigma (σ).

### Methods of Calculating Standard Deviation

There are three methods of calculating the standard deviation:

1. Direct method

This method involves the following steps. First, the arithmetic mean is calculated. Then the deviation of each observation from this mean is calculated. Next, these deviations are squared, and their sum is divided by the number of observations. Finally, the square root of this result gives the standard deviation.

2. Short-cut method

In this method, a value near the middle of the range of values is assumed as the mean, and deviations are taken from it. Choosing an extreme value would make the deviations large and the calculations long.

3. Step-deviation method

The step-deviation method is an extension, or simplification, of the short-cut method. A common factor of the deviations is chosen so that dividing each deviation by it reduces the deviations to small numbers. This makes the calculation simpler, so the method is often preferred over the other two.
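The three methods can be compared with a short Python sketch. The short-cut and step-deviation formulas used here are the standard textbook ones and are not spelled out in the text above: with assumed mean A and d = x − A, σ = √(Σd²/n − (Σd/n)²), and with u = d/h, σ = h × √(Σu²/n − (Σu/n)²). The function names, the assumed mean of 28, and the step of 2 are illustrative choices.

```python
import math


def sd_direct(values):
    """Direct method: root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_shortcut(values, assumed_mean):
    """Short-cut method: deviations are taken from an assumed mean."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(v ** 2 for v in d) / n - (sum(d) / n) ** 2)


def sd_step(values, assumed_mean, step):
    """Step-deviation method: deviations are also divided by a common factor."""
    n = len(values)
    u = [(x - assumed_mean) / step for x in values]
    return step * math.sqrt(sum(v ** 2 for v in u) / n - (sum(u) / n) ** 2)


data = [22, 24, 26, 28, 30, 32, 34, 36]
print(sd_direct(data))
print(sd_shortcut(data, assumed_mean=28))
print(sd_step(data, assumed_mean=28, step=2))
```

All three calls should print the same value up to floating-point rounding, because the assumed mean and the common factor only simplify the hand arithmetic and do not change the result.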

## Example 1: Using the step-deviation method, calculate the mean marks of the following distribution.

| Class Interval | 50-55 | 55-60 | 60-65 | 65-70 | 70-75 | 75-80 | 80-85 | 85-90 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 5 | 20 | 10 | 10 | 9 | 6 | 12 | 8 |

Solution:

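| Class Interval | Midpoint (x) | Frequency (f) | d = x - A | u = d/h (h = 5) | fu |
|---|---|---|---|---|---|
| 50-55 | 52.5 | 5 | -15 | -3 | -15 |
| 55-60 | 57.5 | 20 | -10 | -2 | -40 |
| 60-65 | 62.5 | 10 | -5 | -1 | -10 |
| 65-70 | 67.5 = A | 10 | 0 | 0 | 0 |
| 70-75 | 72.5 | 9 | 5 | 1 | 9 |
| 75-80 | 77.5 | 6 | 10 | 2 | 12 |
| 80-85 | 82.5 | 12 | 15 | 3 | 36 |
| 85-90 | 87.5 | 8 | 20 | 4 | 32 |
| Total | | 80 | | | 24 |

Mean = A + $\frac{\sum fu}{\sum f} \times h$ = 67.5 + $\frac{24}{80} \times 5$ = 69.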
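The arithmetic in Example 1 can be checked with a small Python sketch that recomputes the mean from the class intervals and frequencies given in the question. The function name `step_deviation_mean` is illustrative. The assumed mean A = 67.5 and class width h = 5 are taken from the solution table.

```python
def step_deviation_mean(midpoints, freqs, assumed_mean, width):
    """Mean by the step-deviation method: A + (sum of fu / sum of f) * h."""
    u = [(x - assumed_mean) / width for x in midpoints]
    total_f = sum(freqs)
    total_fu = sum(f * v for f, v in zip(freqs, u))
    return assumed_mean + (total_fu / total_f) * width


classes = [(50, 55), (55, 60), (60, 65), (65, 70),
           (70, 75), (75, 80), (80, 85), (85, 90)]
freqs = [5, 20, 10, 10, 9, 6, 12, 8]
midpoints = [(low + high) / 2 for low, high in classes]

mean_marks = step_deviation_mean(midpoints, freqs, assumed_mean=67.5, width=5)
print(mean_marks)

# Cross-check against the ordinary weighted mean of the midpoints
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)
print(direct_mean)
```

The two printed values should agree up to floating-point rounding, since the step-deviation method is only a rearrangement of the weighted mean.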