Confidence Interval

What is a Confidence Interval?

Statistics is a branch of Mathematics that deals with the collection, classification and representation of data. For students wondering “what is a confidence interval?” and “why is it used in statistics?”, this article gives a brief overview of the confidence interval definition, the confidence interval formula and how to calculate a confidence interval. A confidence interval is a range of values, computed from the observed data, that is likely to contain the true value of an unknown population parameter. Every confidence interval is tied to a confidence level, which states how often intervals built in this way capture the parameter. The confidence interval formula used in this article is based on the Standard Normal Distribution, where Z is the z-score.


Confidence Interval Definition:

A confidence level represents the proportion, or long-run frequency, of confidence intervals that contain the true value of the unknown parameter. Put another way, if confidence intervals are computed at a given confidence level from an unlimited number of independent samples, the proportion of those intervals that contain the true value of the parameter equals the confidence level. In general, the confidence level is chosen before the data are examined. In most confidence interval examples, the confidence level chosen is 95%. However, confidence levels of 90% and 99% are also used in a few confidence interval examples.


Confidence Interval Formula:

The computation of the confidence interval is based on the mean and standard deviation of the given dataset, together with the number of observations. The formula to find the confidence interval is:

\[CI = \hat{X} \pm Z \times \frac{\sigma}{\sqrt{n}}\]

In the above equation,

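 \[\hat{X}\] represents the mean of the sample data

Z indicates the confidence coefficient

α is the significance level; the confidence level is 1 − α (for example, α = 0.05 for a 95% confidence level)

σ is the standard deviation

n is the sample size, i.e. the number of observations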

The value after the plus or minus sign in the formula is called the margin of error.
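The formula can be sketched as a small Python function. This is a minimal illustration rather than a reference implementation: the function name `confidence_interval` and the input values in the usage lines are hypothetical, and Z is passed in directly instead of being derived from a confidence level.

```python
import math


def confidence_interval(mean, sigma, n, z):
    """Return (lower, upper, margin) for CI = mean ± z * sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    # The term after the ± sign is the margin of error
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin, margin


# Hypothetical inputs, chosen only to illustrate the call
lower, upper, margin = confidence_interval(mean=50.0, sigma=10.0, n=100, z=1.960)
print(f"{lower:.2f} to {upper:.2f} (margin of error {margin:.2f})")
# prints: 48.04 to 51.96 (margin of error 1.96)
```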


The confidence interval table below gives the value of Z, i.e. the confidence coefficient, for the corresponding confidence level.

Confidence Interval Table:

| Confidence Level | Confidence Coefficient (Value of Z) |
|---|---|
| 80% | 1.282 |
| 85% | 1.440 |
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
| 99.5% | 2.807 |
| 99.9% | 3.291 |
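The table values do not have to be memorised. As a rough sketch, they can be recomputed from the standard normal distribution using `NormalDist` from Python's standard `statistics` module. The helper name `z_for_confidence` is hypothetical, and the confidence level is assumed to be given as a fraction (0.95 rather than 95%).

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided confidence coefficient Z for a confidence level such as 0.95."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    alpha = 1 - level  # share of probability left outside the interval
    # Half of alpha sits in each tail, so Z is the (1 - alpha/2) quantile
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999):
    print(f"{level:.1%}  Z = {z_for_confidence(level):.3f}")
```

Rounded to three decimal places, the printed values agree with the table.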


How to Calculate Confidence Interval?

A series of steps is to be followed to calculate the confidence interval of a given data sample. 


Step 1:

Determine the number of observations in the given sample, denoted as ‘n’. Calculate the sample mean \[\hat{X}\] and the standard deviation σ.


Step 2:

Choose a confidence level, typically 95% or 99%. Identify the value of Z for the chosen confidence level, using the confidence interval table given in the previous subsection.


Step 3:

Substitute the determined values in the confidence interval formula.

\[CI = \hat{X} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\]
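The three steps can be sketched in Python starting from raw measurements. Several things here are assumptions made for illustration: the function name `interval_from_sample`, the sample values, and the use of the sample standard deviation in place of σ. The sample is kept short for readability, so the resulting Z-based interval is only a rough illustration.

```python
import math
import statistics

# Z values copied from the confidence interval table
Z_TABLE = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def interval_from_sample(sample, level=0.95):
    # Step 1: number of observations, mean and standard deviation
    n = len(sample)
    mean = statistics.mean(sample)
    sigma = statistics.stdev(sample)  # assumption: sample std dev stands in for σ
    # Step 2: look up Z for the chosen confidence level
    z = Z_TABLE[level]
    # Step 3: substitute into CI = mean ± Z * σ / sqrt(n)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical measurements, not taken from the article
sample = [84.0, 88.5, 79.0, 91.0, 86.5, 83.0, 90.0, 85.0]
low, high = interval_from_sample(sample, level=0.95)
print(f"95% CI: {low:.2f} to {high:.2f}")
```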


Confidence Interval Examples:

  1. A tree bears hundreds of apples. 46 apples are randomly chosen. The mean and standard deviation of their size are found to be 86 and 6.2 respectively. Find a confidence interval for the mean size of all the apples on the tree.

Solution:

Given data:

Mean \[\hat{X}\] = 86

Standard deviation σ = 6.2 

Number of observations n = 46

Let us assume the confidence level as 95% 

The confidence coefficient from the table is determined as: Z = 1.960

The formula for confidence interval is:

\[CI = \hat{X} \pm Z \times \frac{\sigma}{\sqrt{n}}\]

\[CI = 86 \pm 1.960 \times \frac{6.2}{\sqrt{46}}\]

\[CI = 86 \pm 1.79\]

The margin of error in this problem is 1.79.

The true mean for all the apples on the tree is therefore likely to lie between 86 - 1.79 and 86 + 1.79

i.e. in the range of 84.21 to 87.79
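The arithmetic behind the margin of error of 1.79 can be written out step by step. The intermediate values are rounded to three decimal places, which is a presentation choice made here and not part of the original solution:

```latex
\[
\begin{aligned}
\sqrt{n} &= \sqrt{46} \approx 6.782 \\
\frac{\sigma}{\sqrt{n}} &= \frac{6.2}{6.782} \approx 0.914 \\
Z \times \frac{\sigma}{\sqrt{n}} &= 1.960 \times 0.914 \approx 1.79
\end{aligned}
\]
```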


Fun Facts about Confidence Interval Formula:

  • The confidence interval formula is exact only when the population is normally distributed. However, for large samples from other kinds of population distributions, the central limit theorem makes the interval approximately valid.

  • A confidence level of 95% does not mean that 95% of the sample data lie within the confidence interval.

  • A confidence interval is not a definitive range of plausible values for the unknown parameter of the population, although it is often informally read as one.

  • If a 95% confidence interval is calculated from a particular experiment, it does not follow that there is a 95% probability that the estimate from a repeat of the experiment will fall within that interval.
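What the confidence level does describe is the long-run share of intervals that contain the true parameter. A small simulation can make this concrete. It is only a sketch under stated assumptions: the population is taken to be normal with a known mean of 86 and standard deviation of 6.2 (numbers borrowed from the apple example and treated here as hypothetical population values), and the function name `coverage` is made up for this illustration.

```python
import math
import random


def coverage(true_mean, sigma, n, z, trials, seed=0):
    """Fraction of simulated intervals that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and take its mean
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # Check whether this sample's interval captures the true mean
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials


rate = coverage(true_mean=86.0, sigma=6.2, n=46, z=1.960, trials=2000)
print(f"Intervals containing the true mean: {rate:.1%}")
```

Under these assumptions the printed share should come out close to 95%. Each individual interval either contains the true mean or it does not.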

FAQ (Frequently Asked Questions)

1. What is a Confidence Interval? What are the Factors Affecting Confidence Intervals?

In general, the confidence interval for an unknown parameter depends on the sampling distribution of the corresponding estimator. The factors affecting the confidence interval are:

  • The confidence level

  • The variability of the sample

  • The sample size

When all the other factors are equal, a larger sample gives a more precise estimate of the population parameter and therefore a narrower interval. The higher the confidence level, the broader the confidence interval. The greater the variability of the sample, the broader the interval as well.
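The effect of these factors on the width of the interval can be sketched numerically. The helper `margin_of_error`, the standard deviation of 6.2 and the sample sizes below are hypothetical choices for illustration, and Z is computed from the standard normal distribution with `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, level):
    """Half-width of the interval: Z * sigma / sqrt(n) for a two-sided level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sigma / math.sqrt(n)


sigma = 6.2  # hypothetical standard deviation
for n in (25, 100, 400):
    row = ", ".join(
        f"{level:.0%}: ±{margin_of_error(sigma, n, level):.2f}"
        for level in (0.90, 0.95, 0.99)
    )
    print(f"n = {n:3d}  {row}")
```

Reading down a column shows the margin halving each time the sample size is quadrupled, and reading across a row shows it growing with the confidence level.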

2. What is the Meaning of 95% Confidence Interval?

A 95% confidence interval is a range of values that you can be 95% confident contains the true mean of the population. The confidence interval becomes narrower when it is calculated from a larger sample. This is because the sample mean is a more precise estimate when it is based on a large number of values. By contrast, a mean determined from a small amount of data is less precise.