For many statistical calculations, it is assumed that the data points (i.e. the numbers in your list) are clustered around some central value. That is to say, it is believed that there is an "average" of some sort. The "box" in the box-and-whisker plot contains, and thus highlights, the central portion of these data points.
When the distribution of the data is represented in a standardized format using a five-number summary (minimum, Q1 (1st quartile), median, Q3 (3rd quartile), and maximum), the result is known as a box plot. It is also referred to as a box-and-whisker plot when lines stretching out from the box show the variability outside the upper and lower quartiles.
To create a box-and-whisker plot, we follow these steps:
Put the data values in numerical order, if they aren't ordered already.
Find the median of the data. The median splits the data into two halves. To split the data into quarters, we then find the medians of these two halves.
If the data set contains an even number of values, so that the first median is the average of the two middle values, we include those middle values in the sub-median calculations. If the data set contains an odd number of values, so that the first median is an actual data point, we do not include that value in the sub-median calculations. In other words, to find the sub-medians, we look only at the values that have not yet been used.
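The median and sub-median rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a standard library routine: the function name `quartiles` is made up for this example, and other software may compute quartiles with slightly different conventions.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    data = sorted(values)              # put the values in numerical order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values to form two halves")
    half = n // 2
    lower = data[:half]                # values below the middle
    # Odd count: the median is an actual data point, so it is skipped.
    # Even count: the median is an average, so both middle values stay in.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

print(quartiles([7, 1, 3, 5, 9, 11, 13]))   # prints (3, 7, 11)
```

With seven values the median (7) is a data point and is left out of both halves, so the sub-medians come from 1, 3, 5 and 9, 11, 13.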
Box and whisker plots are graphs that display the distribution of data along a number line. We can create a box plot by arranging a data set in order and determining:
the median of the set of data,
the lower and upper quartiles (the medians of the lower and upper halves),
the lower and upper extremes.
We can construct a box and whisker plot and use it to solve a real-world problem. Identifying the median and the two sub-medians of the ordered data set separates the data into four groups of equal size, called quartiles. A short distance between two of these values means that the data in that quartile are clustered together, while a long distance means that the data in that quartile are spread out.
Suppose you are asked to compare the annual snowfall at two ski stations over the past two decades. You would need a way to summarize all the data. A box plot shows the range and distribution of a data set along a number line. To construct one:
1. Arrange the data from lowest to highest.
2. Determine the median, the middle value that divides the data set into two equal groups. If there is no single middle value, take the average of the two middle values as the median.
3. Find the median of the lower half and of the upper half of the data set. These are the lower and upper quartiles.
4. Identify the lower and upper extremes, i.e. the smallest and largest values.
5. Use these five values to draw the box plot: median, upper extreme, lower extreme, lower quartile, and upper quartile.
6. Plot the points of these five values above a number line.
7. Draw vertical lines through the median, lower quartile, and upper quartile.
8. Create a box by joining the tops and bottoms of the vertical lines at the lower and upper quartiles. The median line lies inside the box.
9. Draw the whiskers from each extreme to the box.
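The drawing steps above can be sketched as a small text renderer. This is only an illustration under stated assumptions: the function name `ascii_box_plot`, the `width` parameter, and the characters used for the box and whiskers are all choices made for this example, and the five values are assumed to have been computed already.

```python
def ascii_box_plot(low, q1, med, q3, high, width=50):
    """Draw a one-line text box plot from a five-number summary."""
    if not low <= q1 <= med <= q3 <= high:
        raise ValueError("expected low <= q1 <= med <= q3 <= high")
    span = (high - low) or 1       # avoid dividing by zero if all values match

    def col(x):
        # Map a data value to a character column on the number line.
        return round((x - low) / span * (width - 1))

    line = [" "] * width
    for i in range(col(low), col(high) + 1):
        line[i] = "-"              # whiskers run from extreme to extreme
    for i in range(col(q1), col(q3) + 1):
        line[i] = "="              # the box covers Q1 to Q3
    for x in (low, q1, q3, high):
        line[col(x)] = "|"         # vertical marks at extremes and quartiles
    line[col(med)] = "M"           # the median line inside the box
    return "".join(line)

print(ascii_box_plot(0, 10, 20, 30, 40, width=41))
```

Printing one such line per data set, on the same scale, gives a quick side-by-side comparison of the kind described for the two ski stations.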
1. When can we use a box and whisker plot?
We can use box plots when we have multiple data sets from independent sources that are related to each other in some way. A few examples include:
Test scores between classrooms or schools
Similar features on one part, such as camshaft lobes
Data from before and after a process change
Data from duplicate machines producing the same products
2. What are quartiles?
We have three points: the "median" (the first middle point) and the two "sub-medians" (the middle points of the two halves). These three points split the whole data set into quarters, called "quartiles".
3. What are outliers in a box and whisker plot?
An outlier is an observation that appears to deviate markedly from the other members of the sample in which it occurs. Outliers can be identified as follows:
The interquartile range is the distance between the 1st and 3rd quartiles
Multiply the interquartile range by 1.5
Subtract that value from the 1st quartile to obtain the lower boundary
Add that value to the 3rd quartile to obtain the upper boundary
Values in the data that lie outside these boundaries are termed outliers
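As a rough sketch of this outlier check, the following Python function takes the interquartile range as Q3 minus Q1 and applies the 1.5 multiplier. The function name `find_outliers` and the sample numbers are hypothetical, and the quartiles are assumed to have been computed beforehand.

```python
def find_outliers(values, q1, q3):
    """Return (lower_bound, upper_bound, outliers) using the 1.5 * IQR rule."""
    iqr = q3 - q1                  # interquartile range: Q3 minus Q1
    step = 1.5 * iqr               # multiply the interquartile range by 1.5
    lower_bound = q1 - step        # subtract from the 1st quartile
    upper_bound = q3 + step        # add to the 3rd quartile
    outliers = [v for v in values if v < lower_bound or v > upper_bound]
    return lower_bound, upper_bound, outliers

# Made-up sample whose quartiles are Q1 = 10 and Q3 = 14.
print(find_outliers([1, 10, 11, 12, 13, 14, 40], q1=10, q3=14))
# prints (4.0, 20.0, [1, 40])
```

Here the interquartile range is 4, so the boundaries sit 6 units outside the quartiles, and the values 1 and 40 fall beyond them.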
4. How do we interpret and compare box and whisker plots?
Box and whisker plots are graphical representations of the five-number summary (median, quartile 1, quartile 3, minimum, and maximum).
When comparing two box plots, notice that a longer box or longer whiskers indicate more spread in the data, which makes predictions less certain.
Use the box plots to check whether the data support a given claim.