A scatter plot represents data points on a horizontal and a vertical axis to show how strongly one variable is related to another variable. Each row in the data table is represented by a point whose position depends on its values in the columns assigned to the X-axis and Y-axis.

A third variable can be encoded as the color or size of the points, adding another dimension to the plot.

The relationship between the two variables is known as correlation. If the points lie close to a straight line in the scatter plot, the two variables are said to have a high correlation. If the points are scattered evenly across the plot with no visible pattern, the correlation is said to be low, or zero. However, even when a correlation appears to be present, it does not necessarily mean that one variable influences the other. Both variables may be related to some third variable that drives their variation, or the apparent correlation may be a coincidence.
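To make "high" and "low" correlation concrete, the sketch below computes the Pearson correlation coefficient for two small datasets. The function name `pearson_r` and both datasets are hypothetical and chosen only for illustration: one set of points lies close to a straight line, the other shows no pattern.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: points close to a straight line
x = [1, 2, 3, 4, 5, 6]
y_line = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
# Hypothetical data: no visible pattern
y_scattered = [5, 1, 4, 4, 1, 5]

r_line = pearson_r(x, y_line)
r_scattered = pearson_r(x, y_scattered)
print(round(r_line, 3), round(r_scattered, 3))
```

The first coefficient comes out close to 1 and the second close to 0, matching what the two point clouds would look like on a scatter plot.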

The example below illustrates the concept of a scatter plot.

In the scatter plot below, sales are plotted on the X-axis against cost on the Y-axis for a number of different products (colored by product), showing a low positive correlation.

Image will be uploaded soon

Each product given in the above scatter graph is represented separately using trellising:

Image will be uploaded soon

A scatter graph (also known as a scatter plot, scatter diagram, or correlation chart) is a tool for analyzing the relationship between two variables and for determining how closely they are related. The values of one variable are plotted on the horizontal axis and the values of the other variable on the vertical axis. The pattern formed by the resulting points graphically represents the relationship.

A scatter plot graph is often used to investigate suspected cause-and-effect relationships. While the scatter plot graph shows relationships, it does not by itself prove that one variable causes the other. Hence, we can use a scatter plot graph to test theories about cause-and-effect relationships and to look for the root causes of an identified problem.

The scatter graph examples below will help you understand how to construct a scatter graph for given variables.

Construct the scatter graph for the given pair of variables and identify the type of correlation between them.

Solution:

Here, we take two variables, X and Y.

X represents the marks obtained out of 100.

Y represents the number of students.

As the values of X are given as class intervals (bins), we can use the midpoint of each class in the scatter diagram. The data points that we will plot for the given dataset are:

(45,12), (55,10), (65,8), (75,7), (85,5), (95,2)
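The midpoint step can be sketched in a few lines of Python. The class intervals below are an assumption: the original table is not shown here, so the intervals were chosen so that their midpoints match the plotted X-values 45, 55, ..., 95. The student counts are the Y-values listed above.

```python
# Assumed class intervals (not given in the text): chosen so that their
# midpoints match the plotted x-values 45, 55, ..., 95.
class_intervals = [(40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
# Number of students in each class, taken from the data points above.
student_counts = [12, 10, 8, 7, 5, 2]

# The midpoint of a class is the average of its lower and upper limits.
midpoints = [(low + high) / 2 for low, high in class_intervals]

# Pair each midpoint with its count to get the points to plot.
points = list(zip(midpoints, student_counts))
print(points)
```

Each tuple in `points` is one dot on the scatter diagram, with the class midpoint on the X-axis and the number of students on the Y-axis.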

The scatter plot for the above coordinates will look like this:

Image will be uploaded soon

As the number of wolves in the forest increases, the number of deer in the forest decreases. What are the two variables discussed here? Do the data have a positive, negative, or no correlation?

Solution: The two variables are the number of deer and the number of wolves. They are negatively correlated because as the value of one variable increases, the value of the other decreases. Hence, the deer surely find this correlation to be negative.

For each of the scatter graphs A, B, and C given below, state whether a correlation exists and, if so, state its strength and type.

Image will be uploaded soon

Solution: a. Scatter graph A indicates that when one variable increases the other decreases, and all the points lie close together along a straight line. So the graph shows a strong negative correlation.

b. Scatter graph B does not show any correlation because no clear pattern is formed.

c. Scatter graph C indicates that when one variable decreases the other increases, and the points are neither very close together nor very far apart. So the graph shows a moderate negative correlation.

### For the scatter plot drawn below, determine if the points are trying to form a line. If so, approximate the line of best fit.

Image will be uploaded soon

Solution: The points are not trying to form a line, so there is no line of best fit.
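When the points do follow a roughly linear trend, one common way to approximate the line of best fit is ordinary least squares. The sketch below is a minimal illustration of that technique, not the method used for the figure above; the function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the ratio of the co-deviation sum to the x-deviation sum
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical, roughly linear data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

For data with no linear pattern, the same formula still returns a line, but that line describes the points poorly, which is why the solution above reports no line of best fit.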

### If the scatter graph is drawn and the scatter points lie on a straight line then it indicates

Skewness

Perfect correlation

No correlation

None of the above

2. The value of the coefficient of correlation lies between

0 to 1

0 to -1

-1 to 1

1 to -10

FAQ (Frequently Asked Questions)

1. When to make use of scatter graphs or scatter diagrams?

You can use a scatter graph or scatter diagram in the following situations.

When you have paired numerical data.

When your dependent variable may have multiple values for each value of your independent variable.

When trying to determine whether two variables are related, such as:

When trying to identify the potential root causes of problems.

When determining whether two effects that appear to be related both occur with the same cause.

When testing for autocorrelation before constructing a control chart.

After brainstorming causes and effects with a fishbone diagram, to determine objectively whether a particular cause and effect are related.

2. What is the meaning of correlation?

Correlation is a measure of the strength of a linear relationship between two quantitative variables such as height and weight. When the series of data are strongly connected together we say that they have a high correlation.

Correlation is said to be positive when the values increase together.

Correlation is said to be negative when one value decreases as the other increases.

A negative correlation still means that the variables are related; one value simply goes down as the other goes up.

Correlation is assumed here to be linear (that is, the points follow a line).

Image will be uploaded soon

Correlation can have a value:

1 is a perfect positive correlation.

0 is no correlation (the values are not linked at all).

-1 is a perfect negative correlation.

The value shows how good the correlation is (not how steep the line formed by the points is) and whether the correlation is positive or negative.
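The point about steepness can be checked numerically. In the sketch below, three hypothetical datasets lie exactly on lines with very different slopes; the helper `pearson_r` is an illustrative implementation of the Pearson coefficient, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
steep = [10 * x for x in xs]        # steep rising line
shallow = [0.1 * x for x in xs]     # shallow rising line
falling = [-3 * x + 2 for x in xs]  # falling line

r_steep = pearson_r(xs, steep)
r_shallow = pearson_r(xs, shallow)
r_falling = pearson_r(xs, falling)
print(round(r_steep, 6), round(r_shallow, 6), round(r_falling, 6))
```

Both rising lines give a coefficient of 1 (up to floating-point rounding) regardless of slope, and the falling line gives -1: the sign reflects direction, and the magnitude reflects how tightly the points follow a line.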