# Introduction to Data

What is Data?

In everyday life, we often see and use data without noticing it. But can we say exactly what data is and why it matters in our lives? Since computers were invented, people have used the term “data” to refer to computer information that is either transmitted or stored. But this is not the only type of data; there are many different types. So how do we define them?

Data can be text or a number; it can be bytes or bits inside the memory of an electronic device. It can even be information or ideas inside a person’s mind.

Therefore, data can be defined as a collection of information that we gather by observing, measuring, researching, or analyzing. It usually consists of facts, numbers, names, figures, or even descriptions of things. There are different ways to organize data, such as graphs, charts, or tables. Scientists then analyze the organized data with methods such as data mining, which helps us understand our world.

Types of Data

Data may be classified as qualitative or quantitative. Once we understand the difference between the two, it becomes easy to know how and where to use each.

1. Qualitative Data: Qualitative data represents characteristics or attributes. It describes things that we can observe but cannot compute or calculate. For example, data based on attributes such as intelligence, wisdom, creativity, honesty, and cleanliness cannot be calculated, so it is classified as qualitative data. Qualitative data is generally more exploratory than conclusive in nature.

2. Quantitative Data: This is data that can be calculated or measured after observation, because it consists of numbers. For example, we can find the total number of students who play indoor games and the total number of students who play outdoor games. This information is purely numerical, and therefore we can call it quantitative data.

Data Collection

Before collecting any data, we first have to know the problem statement, i.e., why are we collecting the data? What kind of problem are we going to deal with? And how do we plan to solve it? Data collection is a systematic way of gathering relevant information from different sources.

On the basis of the source of data, we can classify data as primary data and secondary data.

1. Primary Data: We use primary data when we deal with a unique problem that has no previous research related to the topic. Primary data collection is therefore a completely new collection of data, gathered for the very first time. A basic example of primary data is the Census of India. As another example, suppose you want information about the average time employees spend in a cafeteria. No public data is available for this, so you will have to run a survey yourself. You can interview the employees or observe them to see how much time they spend in the cafeteria.

2. Secondary Data: This is data that comes from previous research, i.e., one or more people have already researched the topic and published the data on the internet or in articles, magazines, books, and so on. An example is data made available by the Government of India.

## Difference Between Primary Data And Secondary Data

| Primary Data | Secondary Data |
| --- | --- |
| New research has to be done, as no previous research is available. | The data will not be unique, as previous research work is available. |
| It is time-consuming. | It is less time-consuming. |
| Surveys, interviews, and observation are examples of primary data collection. | Books, articles, magazines, and the internet are examples of secondary data sources. |

Quality Checking of the Data

Once we have the data ready, we have to check its quality before analyzing it. This important step is often ignored, but poor-quality data can be misleading and can degrade our presentation of the data. A quality check is therefore essential before we represent the data.
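As a minimal sketch of what a quality check might look like, the Python snippet below scans a small list of survey records for missing values and impossible values. The records, the field names `name` and `minutes`, and the helper `find_quality_issues` are all hypothetical and exist only for illustration; a real check depends on the data at hand.

```python
# Hypothetical survey records: minutes each employee spent in the cafeteria.
records = [
    {"name": "A", "minutes": 25},
    {"name": "B", "minutes": None},  # missing value
    {"name": "C", "minutes": -5},    # impossible value
    {"name": "D", "minutes": 40},
]


def find_quality_issues(rows):
    """Return a list of (row index, problem description) pairs."""
    issues = []
    for index, row in enumerate(rows):
        minutes = row.get("minutes")
        if minutes is None:
            issues.append((index, "missing minutes"))
        elif minutes < 0:
            issues.append((index, "negative minutes"))
    return issues


issues = find_quality_issues(records)
for index, problem in issues:
    print(f"record {index}: {problem}")
```

Records flagged this way can then be corrected, collected again, or left out before the analysis begins.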

Exploratory Analysis of Data

After the quality check, we can finally analyze our data. Analysis helps us become more familiar with the topic so that we can extract useful insights. Skipping this step might produce inaccurate models, and we might select insignificant variables for our model.
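A first pass at exploratory analysis often means computing a few summary statistics. The sketch below uses Python's standard `statistics` module on made-up cafeteria times; the values and the variable name `minutes_spent` are assumptions for illustration, not real survey results.

```python
import statistics

# Hypothetical values: minutes that five employees spent in the cafeteria.
minutes_spent = [25, 30, 40, 20, 35]

# Collect a few basic summary statistics in one place.
summary = {
    "count": len(minutes_spent),
    "minimum": min(minutes_spent),
    "maximum": max(minutes_spent),
    "mean": statistics.mean(minutes_spent),
    "median": statistics.median(minutes_spent),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median, and looking at the minimum and maximum, gives a quick sense of whether any values stand out before a model is built.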

Representation of the Data

Now, this is the most interesting part. It’s like the cherry on top of a cake. The value of all the effort and time we have spent on our research depends on how we represent it. If we have all the important information but fail to represent it well, our data might come across as boring even if it is very informative. That is why the representation of the data is very important. There are many ways to represent data, such as bar graphs, pie charts, flow charts, and tables.

Solved Examples

Example 1) In class 8, 25 students are good at sports, 16 are good at art and craft, and 9 are good at drama. In class 9, 22 students are good at sports, 31 are good at art and craft, and 5 are good at drama. In class 10, 12 students are good at sports, 8 are good at art and craft, and only 3 are good at drama. Represent the data in a table.

Solution 1) There are 3 activities and 3 classes, so the class-wise distribution of students across the different activities can be represented as:

Number of Students Class-wise

| Class | Sports | Art and Craft | Drama |
| --- | --- | --- | --- |
| Class 8 | 25 | 16 | 9 |
| Class 9 | 22 | 31 | 5 |
| Class 10 | 12 | 8 | 3 |
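Once the data is in a table, totals are easy to compute. The sketch below stores the numbers from Example 1 in a Python dictionary and adds up each class and each activity. The variable names are illustrative choices, and the totals assume that each student is counted under one activity only.

```python
# Counts from Example 1: class -> activity -> number of students.
students = {
    "Class 8": {"Sports": 25, "Art and Craft": 16, "Drama": 9},
    "Class 9": {"Sports": 22, "Art and Craft": 31, "Drama": 5},
    "Class 10": {"Sports": 12, "Art and Craft": 8, "Drama": 3},
}
activities = ["Sports", "Art and Craft", "Drama"]

# Total students per class (row totals of the table).
class_totals = {name: sum(counts.values()) for name, counts in students.items()}

# Total students per activity (column totals of the table).
activity_totals = {
    activity: sum(counts[activity] for counts in students.values())
    for activity in activities
}

print(class_totals)
print(activity_totals)
```

Under that assumption, Class 8 has 50 students in the table, Class 9 has 58, and Class 10 has 23.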

Example 2) You are doing a survey on people's favorite type of movie and you get the following result:

| Comedy | Action | Romance | Drama | Scifi |
| --- | --- | --- | --- | --- |
| 4 | 5 | 6 | 1 | 4 |

Represent the data in a bar graph.

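Solution 2) The bar graph for the table above has one bar for each movie type, with the height of each bar equal to the number of votes: Comedy 4, Action 5, Romance 6, Drama 1, and Scifi 4.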
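A rough stand-in for a drawn graph is a horizontal text bar chart. The Python sketch below builds one row per movie type, with one `#` per vote. The function name `text_bar_chart` is a hypothetical helper chosen for this example, and a plotting library could be used to draw a proper graph instead.

```python
# Survey results from Example 2: movie type -> number of votes.
votes = {"Comedy": 4, "Action": 5, "Romance": 6, "Drama": 1, "Scifi": 4}


def text_bar_chart(data):
    """Build one line per category, with one '#' per vote."""
    # Pad every label to the longest label so the bars line up.
    width = max(len(label) for label in data)
    lines = []
    for label, count in data.items():
        lines.append(f"{label.ljust(width)} | {'#' * count} {count}")
    return lines


chart = text_bar_chart(votes)
print("\n".join(chart))
```

In the printed chart, Romance has the longest bar (6 votes) and Drama the shortest (1 vote), which matches the survey table.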