Empirical Probability

Empirical Probability Advantages and Disadvantages

What is empirical probability? To explain this, let us first take the word probability, which is a measure of how likely an event is to happen. To make it simpler, let us take an example, say a die. A die is a cube with six faces, with the numbers 1 to 6 printed one on each face. Only one face comes up each time the die is rolled. So, out of the six faces, exactly one face with one number printed on it will turn up on each roll. For a fair die this translates to 1 (face) divided by 6 (total number of faces), which is 1/6, or approximately 0.1667. This is the chance, or probability, of any particular number coming up.

Now let us examine the term empirical probability. Going back to the die, suppose we roll it 120 times and want to know how often the number 6 comes up. From above, we know that on a single roll the chance of a 6 is 1/6, so over 120 rolls we expect about 120 x 1/6 = 20 sixes. That figure of 20 is an expected count, not a probability. The empirical probability comes from actually carrying out the rolls and counting: if the 6 turns up, say, 18 times in the 120 rolls, then the empirical probability of a 6 is 18/120 = 0.15.

So, the empirical probability of a particular event may be defined as the estimated chance of that event, obtained from how often it actually occurred in a series of trials. Expressed as a formula, it becomes
EMPIRICAL (Experimental) Probability = Number of times the event occurs (in this case, the number of times 6 turns up) / Total number of trials (in this case, the total number of times the die is cast = 120).

Thus, the empirical probability of the number 6 in 120 throws of the die is the observed number of sixes divided by 120. It equals the theoretical value of 1/6 only if the 6 happens to come up exactly 20 times.

Empirical probability is also known as relative frequency or experimental probability. It is the ratio of the number of outcomes in which a particular event occurs to the number of trials actually made, based not on a theoretical calculation but on actual experimental observation. To put it concisely, the empirical probability is an estimate derived from actual observation.

Suppose an event, say 'A', happens 'm' times out of a total of 'n' trials conducted; then the relative frequency of 'A' is m/n.
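The ratio m/n can be illustrated with a small simulation. This is only a sketch: the function name relative_frequency, the random seed and the use of a simulated fair die in place of real throws are all assumptions made for the example.

```python
import random

def relative_frequency(outcomes, event):
    """Return m/n: the share of outcomes for which event(outcome) is true."""
    n = len(outcomes)
    m = sum(1 for outcome in outcomes if event(outcome))
    return m / n

# Simulate 120 rolls of a fair die (seed chosen arbitrarily, for repeatability).
rng = random.Random(42)
rolls = [rng.randint(1, 6) for _ in range(120)]

# Empirical probability of rolling a 6 = number of sixes / number of rolls.
p_six = relative_frequency(rolls, lambda face: face == 6)
print(f"sixes: {rolls.count(6)} of {len(rolls)} rolls, "
      f"empirical probability = {p_six:.4f}")
```

The result will usually be near 1/6 without being exactly equal to it, which is the difference between an observed relative frequency and a theoretical probability.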
In statistical terms, the empirical probability is an estimate of a probability. In simple cases, where the result of each trial only determines whether a particular event has happened or not, the trials can be modelled using a binomial distribution, and the empirical probability is then the maximum likelihood estimate. If certain assumptions are made about the prior distribution of the probability, the corresponding estimate for the same case is a Bayesian estimate. If each trial yields more information, the empirical probability can be improved upon by adopting further assumptions in the form of a statistical model. Once such a model is fitted, it can be used to derive an estimate of the probability of the event in question.
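A rough sketch of the two estimates for the binomial case follows. The counts (18 sixes in 120 rolls) are hypothetical, and the Bayesian version assumes a Beta(a, b) prior with the posterior mean as the estimate, which is one common choice rather than the only one. The default a = b = 1 (a uniform prior) is likewise an assumption.

```python
def mle_estimate(m, n):
    """Maximum likelihood estimate under a binomial model: simply m/n."""
    return m / n

def beta_posterior_mean(m, n, a=1.0, b=1.0):
    """Posterior mean of the probability under an assumed Beta(a, b) prior.

    With m occurrences in n trials the posterior is Beta(a + m, b + n - m),
    whose mean is (a + m) / (a + b + n).
    """
    return (m + a) / (n + a + b)

m, n = 18, 120  # hypothetical counts: 18 sixes in 120 rolls
print(f"maximum likelihood estimate: {mle_estimate(m, n):.4f}")
print(f"Bayesian estimate (uniform prior): {beta_posterior_mean(m, n):.4f}")
```

With many trials the two figures are close. They differ most when n is small or when the event was never observed, in which case m/n is exactly 0 while the Bayesian estimate stays slightly above 0.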

Thus, probability may be broadly classified as

1. Theoretical Probability and
2. Empirical Probability

The probability of an event is always expressed as a number in the range 0 to 1. If the empirical probability of an event is zero (0), the event never occurred in the trials observed; if it is one (1), the event occurred in every trial. The closer the figure is to 1, the more likely the event is to happen, and the closer it is to 0, the less likely. As the number of experiments or trials increases, the empirical (experimental) probability tends to approach the theoretical probability more and more closely.
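The way the empirical figure approaches the theoretical one can be seen in a simulation. This is a sketch under assumptions: a fair simulated die, an arbitrary seed, and trial counts picked only for illustration.

```python
import random

def empirical_probability_of_six(num_rolls, rng):
    """Roll a simulated fair die num_rolls times; return the share of sixes."""
    sixes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)
    return sixes / num_rolls

rng = random.Random(0)  # arbitrary seed, for repeatability
theoretical = 1 / 6

for num_rolls in (12, 120, 1_200, 120_000):
    estimate = empirical_probability_of_six(num_rolls, rng)
    gap = abs(estimate - theoretical)
    print(f"{num_rolls:>7} rolls: empirical = {estimate:.4f}, "
          f"distance from 1/6 = {gap:.4f}")
```

The distance from 1/6 does not have to shrink at every step, but it tends to become smaller as the number of rolls grows.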

Theoretical Probability and Empirical Probability -- Pros and Cons

The advantage of estimating a probability empirically is that the estimate is based on actual experimental observation and is relatively free of assumptions.
Let us study an example to illustrate this.

Say we are required to find the probability that a man in a given population satisfies two conditions

     (i) He is above 6 feet in height, and
     (ii) He prefers strawberry jam to, say, pineapple jam

A direct estimate is obtained by actually counting the number of men who satisfy both conditions and dividing by the total number of men; this gives the empirical probability of the combined condition. Alternatively, an estimate may be obtained by multiplying the proportion of men who are more than 6 feet tall by the proportion of men who prefer strawberry jam to pineapple jam. But a word of caution: this second estimate relies on the assumption that the two conditions are statistically independent.
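The two estimates can be compared on a toy data set. The ten records below are invented purely for illustration; each record is a pair (is over 6 feet, prefers strawberry jam).

```python
# Hypothetical survey of 10 men: (is_over_six_feet, prefers_strawberry).
men = [
    (True, True), (True, False), (False, True), (False, True),
    (False, False), (True, True), (False, True), (False, False),
    (False, True), (False, False),
]
n = len(men)

# Direct estimate: count the men who satisfy BOTH conditions.
p_both_direct = sum(1 for tall, strawberry in men if tall and strawberry) / n

# Alternative estimate: multiply the two separate proportions.
# This is only valid if height and jam preference are independent.
p_tall = sum(1 for tall, _ in men if tall) / n
p_strawberry = sum(1 for _, strawberry in men if strawberry) / n
p_both_product = p_tall * p_strawberry

print(f"direct estimate:  {p_both_direct:.2f}")
print(f"product estimate: {p_both_product:.2f}")
```

In this made-up sample the direct count gives 0.20 and the product gives 0.18. The gap is small here, but it can be large when the two conditions are related.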

What are the Disadvantages?

The disadvantage of empirical probabilities shows up when the probability being estimated is either very close to zero (0) or very close to one (1). In such cases very large sample sizes are required to estimate the probability to a fair degree of accuracy. Statistical models can help here, depending on the context, and broadly speaking a model-based estimate is more accurate than the empirical probability, provided the assumptions built into the model actually hold.
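A rough feel for the sample sizes involved comes from the standard error of a relative frequency, sqrt(p(1 - p)/n), for n independent trials with true probability p. The sketch below expresses that error relative to p itself. The function names and the 10% target are assumptions made for the example, and the figures are a guide only.

```python
import math

def relative_standard_error(p, n):
    """Standard error of the relative frequency, sqrt(p(1-p)/n), divided by p."""
    return math.sqrt(p * (1 - p) / n) / p

def trials_needed(p, target_relative_error):
    """Smallest n whose relative standard error is at most the target.

    Obtained by solving sqrt((1 - p) / (n * p)) <= target for n.
    """
    return math.ceil((1 - p) / (p * target_relative_error ** 2))

# Compare a common event with increasingly rare ones (10% target, hypothetical).
for p in (0.5, 0.01, 0.001):
    n = trials_needed(p, 0.10)
    print(f"p = {p}: about {n} trials for a 10% relative standard error")
```

The required number of trials grows roughly in proportion to 1/p, so a rare event needs far more observations than a common one. By symmetry, the same difficulty arises for the chance of the event not happening when p is very close to 1.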

To understand this better, consider estimating the probability that, in a given area, the lowest of the daily maximum rainfall values recorded in the summer month of May in any one year is less than 20mm. An empirical estimate of this probability can be obtained from the record of such values in earlier years. Alternatively, a family of probability distributions can be chosen and fitted to the data set of past yearly values. The fitted distribution then yields an alternative estimate of the required probability. Note that this alternative method can give a non-zero estimate of the probability even if all the values in the record are more than 20mm, a case in which the empirical probability would be exactly zero.
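The contrast between the two estimates can be sketched as follows. The ten yearly values are invented, and the normal family is assumed only for simplicity; another family may well suit real data of this kind better.

```python
from statistics import NormalDist

# Hypothetical record: one value per year, in mm, all above the threshold.
yearly_values_mm = [24.1, 27.5, 22.3, 30.2, 25.8, 21.4, 28.9, 23.6, 26.7, 29.3]
threshold_mm = 20.0

# Empirical probability: share of recorded years below the threshold.
below = sum(1 for value in yearly_values_mm if value < threshold_mm)
empirical = below / len(yearly_values_mm)

# Model-based estimate: fit a normal distribution (an assumption) to the
# record and read off the probability of a value below the threshold.
fitted = NormalDist.from_samples(yearly_values_mm)
model_based = fitted.cdf(threshold_mm)

print(f"empirical estimate:   {empirical:.4f}")
print(f"model-based estimate: {model_based:.4f}")
```

The empirical estimate is zero because no recorded year fell below 20mm, while the fitted distribution still assigns the event a small positive probability. How trustworthy that figure is depends entirely on whether the assumed family fits the data.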

Mixed classification or Nomenclature

The phrase 'a-posteriori probability' is also used as an alternative to empirical probability or relative frequency. This usage is reminiscent of terms in Bayesian statistics, but it is not directly related to Bayesian inference, where the same phrase is sometimes used to refer to a posterior probability. That is a completely different thing, despite the confusingly similar name.

The phrase 'a-posteriori probability', when taken as equivalent to empirical probability, may be used in conjunction with 'a priori probability', which means an estimate of a probability that is based not on any observed and recorded data but on deductive reasoning.

Now suppose you are asked to judge whether, in a cricket tournament, the Pakistan team or the Australian team has the better chance of winning against the Indian team. There is no reliable way to settle this with a fair degree of accuracy by reasoning alone, because the chances of the two events cannot simply be deduced and compared. Questions of chance like this greatly fascinated the mathematicians of the age.

Card games, betting and gambling for stakes inevitably brought mathematics into the business of predicting the chances of winning or losing. It is said that a gambler in just such a situation once sought the help of a mathematician to improve his chances of winning; the story is usually told of Blaise Pascal, whose resulting correspondence with the famous mathematician Pierre de Fermat is credited with starting the development of the theory of probability and allied research into situations involving chance. Mathematicians went on to apply mathematics to forecasting the chances of events occurring, researching aspects of both theoretical and empirical probability. It is fair to say that a great many other methods have been tried to measure and predict the chances of an event happening, or of success and failure in an event, with little rational success. Mathematics remains the most logical and reliable tool in such cases.