Frequency Distribution


Frequency Series VS Frequency Distribution

Before turning to frequency distributions, it helps to define frequency. Frequency measures how often something happens: the frequency of an observation is the number of times it occurs in the observed data. Frequency tables can show both qualitative and quantitative variables. Qualitative variables, also known as categorical variables, represent non-measurable categories such as eye colour or brand, while quantitative variables are numeric.

In a frequency distribution, class intervals represent ranges of values in the data under consideration. The intervals are framed between the minimum and maximum values of the data, with thresholds marking where each interval begins and ends. The main difference between a frequency distribution series and a frequency distribution table is that in a frequency distribution series the x-variable is most often discrete and numeric, whereas a frequency distribution table is used for continuous values.

The different types of frequency distributions are ungrouped frequency distributions, grouped frequency distributions, cumulative frequency distributions, and relative frequency distributions.


Grouped Frequency Distribution

Sometimes, to make it easier to derive insights from the observations, we group them into class intervals. The steps are:

  • Find the maximum and minimum values of the data set; their difference is the range

  • Divide this range by the number of groups you intend to have in your analysis to get the class width

  • Segregate the data into these sub-groups on the basis of the class width

  • Count the frequency of data within each group
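
As a rough illustration, the steps above can be sketched in Python. The function name `grouped_frequency`, the sample marks, and the choice of equal-width intervals that include their lower bound are assumptions made for this example; other conventions for interval boundaries are equally valid.

```python
def grouped_frequency(data, num_groups):
    """Count how many values fall into each of num_groups equal-width intervals."""
    lowest, highest = min(data), max(data)
    if lowest == highest:
        raise ValueError("all values are identical, so there is no range to divide")
    # Class width = range / number of groups.
    width = (highest - lowest) / num_groups
    counts = [0] * num_groups
    for value in data:
        # Index of the interval holding this value; the maximum goes in the last interval.
        index = min(int((value - lowest) / width), num_groups - 1)
        counts[index] += 1
    # Pair each interval's (lower, upper) bounds with its frequency.
    return [
        ((lowest + i * width, lowest + (i + 1) * width), counts[i])
        for i in range(num_groups)
    ]


# Hypothetical marks scored by 10 students.
marks = [12, 45, 33, 27, 50, 18, 39, 41, 22, 10]
table = grouped_frequency(marks, 4)
for (low, high), frequency in table:
    print(f"{low:g} to {high:g}: {frequency}")
```

Each interval here includes its lower bound and excludes its upper bound, except the last one, which also includes the maximum value.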



Ungrouped Frequency Distribution

An ungrouped frequency distribution is similar to a grouped frequency distribution, except that class intervals are not created: each distinct value is listed on its own, ordered from minimum to maximum.

  • List the unique values as the first column.

  • Count the repeated instances of each unique value and record the count in the second column
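
A minimal sketch of these two steps, using `collections.Counter` from the Python standard library. The function name `ungrouped_frequency` and the shoe-size data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def ungrouped_frequency(data):
    """Return (value, frequency) pairs ordered from the minimum value to the maximum."""
    counts = Counter(data)  # tallies the repeated instances of each unique value
    return sorted(counts.items())


# Hypothetical shoe sizes recorded for eight customers.
sizes = [7, 9, 8, 7, 10, 8, 7, 9]
size_table = ungrouped_frequency(sizes)
for value, frequency in size_table:
    print(value, frequency)
```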



Cumulative Frequency Distribution

A cumulative frequency distribution shows, for each class interval, a running total: the frequency of that interval added to the frequencies of all the preceding intervals. Another difference is that the class intervals do not denote a range; instead, each one represents a condition such as "greater than a threshold value" or "less than a threshold value".

  • Calculate frequencies for every category

  • Arrange in ascending or descending order according to categories/class intervals based on whether one wants to prepare an increasing/decreasing cumulative frequency distribution

  • Total all the preceding frequencies. E.g., the second category's cumulative frequency is the sum of the first and second categories' individual frequencies, and the third is the sum of the first, second, and third categories' individual frequencies
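
The running total described in these steps can be sketched with `itertools.accumulate`. The class limits and frequencies below are made-up values, and the variable names are assumptions for this example.

```python
from itertools import accumulate

# Hypothetical class intervals (already in ascending order) and their frequencies.
upper_limits = [10, 20, 30, 40]
frequencies = [4, 7, 5, 2]

# Increasing table: each entry sums all frequencies up to and including that class.
less_than_cumulative = list(accumulate(frequencies))

# Decreasing table: accumulate from the last class back to the first.
more_than_cumulative = list(accumulate(reversed(frequencies)))[::-1]

for limit, total in zip(upper_limits, less_than_cumulative):
    print(f"less than {limit}: {total}")
```

The last entry of the increasing table, like the first entry of the decreasing one, always equals the total number of observations, which is a quick way to check the arithmetic.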



Relative Frequency Distribution

A relative frequency distribution, which is used extensively in day-to-day statistical applications, gives the proportion of the total observations associated with each category. It is calculated for each class interval by dividing that interval's frequency by the total number of observations. Relative frequencies can be written as percentages, fractions, or decimals. The cumulative relative frequency of a category is the sum of the relative frequencies of all the preceding categories and of the category itself.
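
As a small worked sketch of both calculations, assuming four hypothetical categories with made-up frequencies:

```python
from itertools import accumulate

# Hypothetical category frequencies.
frequencies = {"A": 5, "B": 10, "C": 4, "D": 1}
total = sum(frequencies.values())

# Relative frequency = category frequency / total number of observations.
relative = {category: count / total for category, count in frequencies.items()}

# Cumulative relative frequency = running total of the relative frequencies.
cumulative_relative = dict(zip(relative, accumulate(relative.values())))

for category in frequencies:
    print(f"{category}: {relative[category]:.0%} "
          f"(cumulative {cumulative_relative[category]:.0%})")
```

The relative frequencies should add up to 1 (or 100%), so the cumulative relative frequency of the last category is 1, apart from floating-point rounding.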



Common Representations of Frequency Distributions

The most common way to visualize a frequency distribution is a bar chart. Pie charts are also used. The major advantage of these representations is that one can get a clear idea of the distribution at a glance. The disadvantage is that outliers can get lost in these representations if we are not careful. In practice, analysts commonly use frequency distributions to identify how data is skewed and where the focus should lie.
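
To give a feel for the bar-chart idea without a plotting library, here is a rough text-only sketch. The function name `text_bar_chart` and the colour survey are hypothetical; a real analysis would more likely use a charting tool.

```python
def text_bar_chart(frequencies):
    """Render a {category: frequency} mapping as one line of '#' marks per category."""
    label_width = max(len(str(category)) for category in frequencies)
    lines = []
    for category, count in frequencies.items():
        # Left-align the label, then draw one '#' per observation.
        lines.append(f"{str(category):<{label_width}} | {'#' * count} {count}")
    return "\n".join(lines)


# Hypothetical survey of favourite colours.
colours = {"red": 5, "blue": 8, "green": 2}
chart = text_bar_chart(colours)
print(chart)
```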


Solved Examples

A survey was conducted in 20 homes in Avadi, Chennai. People were asked how many bikes they owned. The results were: 1, 4, 3, 0, 5, 1, 2, 2, 1, 5, 2, 3, 2, 2, 0, 1, 2, 0, 3, 2.

Present this data in a frequency distribution table. Also, find the largest number of homes owning the same number of bikes.

Solution: List the distinct numbers of bikes a home can own. In this data every home owns 0, 1, 2, 3, 4, or 5 bikes, and these values form the rows. Now count the number of homes that own each number of bikes. This count is called the frequency. Presented as a table:

Number of Bikes    Frequency
0                  3
1                  4
2                  7
3                  3
4                  1
5                  2

It can be seen from the table that 7 homes own 2 bikes, and fewer homes own any other number of bikes. Hence the answer is 7 homes.
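
A tally like this is easy to get wrong by hand, so it can be worth checking mechanically. The sketch below counts the responses exactly as they are listed in the question; the variable names are assumptions for this example.

```python
from collections import Counter

# Responses from the 20 homes, as listed in the question.
bikes = [1, 4, 3, 0, 5, 1, 2, 2, 1, 5, 2, 3, 2, 2, 0, 1, 2, 0, 3, 2]

counts = Counter(bikes)

# Frequency distribution table, ordered by number of bikes.
for number_of_bikes in sorted(counts):
    print(number_of_bikes, counts[number_of_bikes])

# The most common answer and how many homes gave it.
most_common_value, homes = counts.most_common(1)[0]
print(f"{homes} homes own {most_common_value} bikes")
```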


Did You Know?

Toyota used frequency distributions in its famous assembly-line manufacturing and in developing its lean process.

Many noted automobile manufacturers use this method to identify the root cause of machine failure. All possible causes are listed, and the frequency of failures attributed to each cause is plotted. This shows which cause contributes most to machine failure, so that immediate action can be taken to resolve it.

FAQ (Frequently Asked Questions)

1. What is the Frequency Distribution?

Frequency is a count of how many times an observation occurs in the data. When the frequencies of the different categories are presented in a table or chart, the result is called a frequency distribution. It helps us understand how the data is distributed and which category holds the highest frequency.

When presented as a table, a frequency distribution is known as a frequency distribution table and is commonly used in statistics. It can also be plotted as a chart, the most common being bar charts and pie charts. The variables can be either categorical (qualitative) or numeric, and numeric variables can be either discrete or continuous.

2. How to draw a frequency distribution table?

The first step is to choose the number of class intervals the table should have. The rule of thumb is to have between 5 and 20 intervals. A more statistical approach is to determine the number of class intervals with the formula log(number of observations) / log(2), rounding the answer up to the next integer.
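
As a small sketch of that rule, the function below reads the formula as the logarithm of the number of observations divided by log(2), rounded up. That reading is an assumption about the intended expression, and the function name `suggested_class_count` is hypothetical.

```python
import math


def suggested_class_count(num_observations):
    """Number of class intervals: log(n) / log(2), rounded up to the next integer."""
    if num_observations < 1:
        raise ValueError("need at least one observation")
    # math.log2(n) equals log(n) / log(2) and is exact for powers of two.
    return max(1, math.ceil(math.log2(num_observations)))


for n in (20, 100):
    print(n, suggested_class_count(n))
```

The result is only a starting point; the number of intervals can still be adjusted so that the class boundaries fall on convenient values.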


Next, write the intervals (values or ranges of values) in ascending or descending order. When ranges of values are used, the result is a grouped frequency distribution table; otherwise, it is an ungrouped frequency distribution table. Finally, count the frequency of each observation and note it down. This gives the frequency distribution table.