What is a scatter plot and how is it used in Six Sigma?
A scatter plot, also known as a scatter diagram or scatter graph, is a graph used to visually display and compare a possible relationship between two or more sets of related data. Each observation is drawn as a point whose position is given by its coordinates on a horizontal and a vertical axis. A dot in the body of the chart represents the intersection of that observation's x and y values.
One advantage of a scatter plot is that it does not require a user to specify dependent or independent variables. Either type of variable can be plotted on either axis. Scatter plots do not imply any causation, but rather an association between two variables. Your scatter plot may show that a relationship exists, but it does not and cannot prove that one variable is causing the other. There could be a third factor involved which is causing both, some other systemic cause, or the apparent relationship could just be a fluke. Nevertheless, the scatter plot can give you a clue that two things might be related, and if so, how they move together.
The two axes are generally displayed as:
- Vertical axis: variable Y, the response variable
- Horizontal axis: variable X, a variable suspected of being related to the response
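To make the mapping from paired data to points concrete, here is a minimal sketch in plain Python that places each (X, Y) observation on a character grid, with X along the horizontal axis and Y along the vertical axis. The function name text_scatter, the grid dimensions, and the sample values are all hypothetical and chosen only for illustration. In practice a charting tool would draw the plot.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Render paired (x, y) observations as a character grid.

    X runs along the horizontal axis and Y along the vertical axis.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")

    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when every value on an axis is identical.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        # Row 0 is the top of the printout, so larger Y values get smaller rows.
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"

    body = "\n".join("|" + "".join(row) for row in grid)
    return body + "\n+" + "-" * width


# Hypothetical sample data: X is a suspected factor, Y is the response.
factor_x = [1, 2, 3, 4, 5, 6, 7, 8]
response_y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8]
print(text_scatter(factor_x, response_y))
```

Each asterisk is one observation. Swapping the two argument lists flips the axes, which reflects the point made earlier that either variable can be plotted on either axis.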
A scatter plot can show three basic relationships. These are positive (rising), negative (falling), and no relationship (varied). However, there are many variations of these, including:
1. No relationship
2. Strong linear (positive correlation)
3. Strong linear (negative correlation)
4. Exact linear (positive correlation)
5. Quadratic relationship
6. Exponential relationship
7. Sinusoidal relationship (damped)
8. Variation of Y doesn't depend on X (homoscedastic)
9. Variation of Y does depend on X (heteroscedastic)
If the points cluster in a band running from lower left to upper right, there is a positive correlation (if x increases, y increases). If the points cluster in a band from upper left to lower right, there is a negative correlation (if x increases, y decreases). Imagine drawing a straight line or curve through the data so that it "fits" as well as possible. The more the points cluster closely around the imaginary line of best fit, the stronger the relationship that exists between the two variables. If it is hard to see where you would draw a line, and if the points show no significant clustering, there is probably no correlation.
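The visual judgment described above is often backed up with a number. One common choice, not named in the text and introduced here as an assumption, is Pearson's correlation coefficient r, which ranges from -1 (points on a falling straight line) through 0 (no linear association) to +1 (points on a rising straight line). The sketch below computes it from scratch. The function name pearson_r and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a rising band, a falling band, and a symmetric curve.
rising = pearson_r([1, 2, 3, 4, 5], [2.0, 4.1, 5.9, 8.2, 9.8])
falling = pearson_r([1, 2, 3, 4, 5], [9.8, 8.2, 5.9, 4.1, 2.0])
curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(rising, falling, curved)
```

Note that r only measures how well a straight line fits. The third example is a perfect quadratic relationship, yet its r is zero, so the coefficient should be read alongside the plot rather than instead of it.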
The scatter plot is one of the seven basic tools of quality control used in methodologies such as Six Sigma. The other six are the histogram, Pareto chart, check sheet, control chart, cause-and-effect diagram, and flowchart.
Scatter plots can provide answers to the following questions:
1. Are variables X and Y related?
2. Are variables X and Y linearly related?
3. Are variables X and Y non-linearly related?
4. Does the variation in Y change depending on X?
5. Are there outliers?
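Questions 4 and 5 can also be checked numerically once a line has been fitted. The sketch below is one possible approach, not a prescribed Six Sigma procedure. It fits an ordinary least-squares line, flags points whose residual exceeds a chosen multiple of the residual standard deviation, and compares the residual spread in the lower and upper halves of X. The function names, the threshold of 2.0, and the sample data are all assumptions made for illustration.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all X values are identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(xs, ys, threshold=2.0):
    """Indices of points whose residual exceeds threshold * residual std dev."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = statistics.pstdev(residuals)
    if spread == 0:
        return []  # every point lies exactly on the fitted line
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


def residual_spread_by_half(xs, ys):
    """Residual std dev for the lower-X half and the upper-X half of the data."""
    slope, intercept = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))
    mid = len(pairs) // 2

    def spread(chunk):
        return statistics.pstdev([y - (slope * x + intercept) for x, y in chunk])

    return spread(pairs[:mid]), spread(pairs[mid:])


# Hypothetical data: y = 2x, except for one unusual reading at index 4.
xs = list(range(1, 11))
ys = [2 * x for x in xs]
ys[4] = 30
print(flag_outliers(xs, ys))

# Hypothetical data where the scatter in Y widens as X grows.
wide_xs = [1, 2, 3, 4, 5, 6, 7, 8]
wide_ys = [1.0, 2.0, 3.0, 4.0, 7.0, 4.0, 9.0, 6.0]
print(residual_spread_by_half(wide_xs, wide_ys))
```

A single large outlier also pulls the fitted line toward itself, so flagged points are candidates for investigation rather than automatic removal. A markedly larger spread in one half suggests that the variation in Y depends on X.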