What is a scatter plot and how is it used in Six Sigma?
A scatter plot is also known as a scatter diagram or scatter graph. It is a graph used to visually display and compare a possible relationship between two or more sets of related data by plotting points, each with a coordinate on a horizontal and a vertical axis. Each dot in the body of the chart marks where one observation's x value and y value intersect.
One advantage of a scatter plot is that it does not require you to specify dependent or independent variables. Either type of variable can be plotted on either axis. Scatter plots do not imply causation, only an association between two variables. Your scatter plot may show that a relationship exists, but it does not and cannot prove that one variable is causing the other. If there is an apparent relationship, a third factor could be driving both variables. The pattern could also be the result of some other systemic cause, or it could simply be a fluke. Nevertheless, the scatter plot can give you a clue that two things might be related and, if so, how they move together. The two axes are generally laid out as follows:
- Vertical axis: variable Y, the response variable
- Horizontal axis: variable X, a variable suspected of being related to the response
A scatter plot can show three basic relationships. These are positive (rising), negative (falling), and no relationship (varied). However, there are many variations of these, including:
1. No relationship
2. Strong linear (a positive correlation)
3. Strong linear (a negative correlation)
4. Exact linear (a positive correlation)
5. Quadratic relationship
6. Exponential relationship
7. Sinusoidal relationship (damped)
8. Variation of Y doesn't depend on X (homoscedastic)
9. Variation of Y does depend on X (heteroscedastic)
If the points cluster in a band running from lower left to upper right, there is a positive correlation. This is also described as "if x increases, y increases". If the points cluster in a band from upper left to lower right, there is a negative correlation. In other words, "if x increases, y decreases". A good way to look for a correlation is to draw an imaginary line or curve through the data where the points cluster most densely, so that it fits them as closely as possible. The more tightly the points cluster around this imaginary line of best fit, the stronger the relationship between the two variables. If it is hard to see where you would draw a line, and the points show no significant clustering, then there is probably no correlation.
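The "imaginary line" and the strength of the clustering described above can also be put into numbers. The following is a minimal sketch, not a prescribed Six Sigma procedure: it assumes a straight-line relationship and uses the Pearson correlation coefficient and an ordinary least-squares line. The function names and the sample data (`hours`, `defects`) are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Values near +1 mean a rising band, values near -1 a falling band,
    and values near 0 mean little or no linear relationship.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares (slope, intercept) of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: training hours (X) versus defects per batch (Y).
hours = [1, 2, 3, 4, 5, 6]
defects = [12, 10, 9, 7, 4, 3]

r = pearson_r(hours, defects)
slope, intercept = best_fit_line(hours, defects)
print(f"r = {r:.2f}, line: y = {slope:.2f} * x + {intercept:.2f}")
```

For this made-up data the coefficient comes out close to -1, which matches a tight band running from upper left to lower right. A coefficient near zero only rules out a straight-line pattern; a curved relationship can still be present, so the plot itself should always be inspected.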
The scatter plot is one of the seven basic tools of quality used in quality control methodologies such as Six Sigma. The other tools are the histogram, Pareto chart, check sheet, control chart, cause-and-effect diagram and flowchart.
Six Sigma uses this tool because executives often assume that measures vary together when they do not. Sometimes they assume that measures do not vary with one another when they actually do. Knowing how factors vary together is very important in improving forecasting accuracy, and improved forecasts can reduce decision risk.
Scatter plots can provide answers to the following questions:
1. Are variables X and Y related?
2. Are variables X and Y linearly related?
3. Are variables X and Y non-linearly related?
4. Does the variation in Y change depending on X?
5. Are there outliers?
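Questions 4 and 5 can be given a rough numerical check alongside the visual one. The sketch below is one possible approach under stated assumptions, not a standard test: it assumes a straight-line fit, flags points whose residual exceeds `k` standard deviations (the threshold `k = 2.0` is an arbitrary illustrative choice), and compares the residual spread in the low-X half of the data with the high-X half. All function names and data are hypothetical.

```python
from statistics import mean, pstdev


def residuals(xs, ys):
    """Vertical distance of each point from the least-squares line."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def flag_outliers(xs, ys, k=2.0):
    """Return the (x, y) points lying more than k residual standard deviations from the line."""
    res = residuals(xs, ys)
    spread = pstdev(res)
    return [(x, y) for x, y, r in zip(xs, ys, res) if abs(r) > k * spread]


def spread_by_half(xs, ys):
    """Residual spread for the low-X half and the high-X half of the data.

    Clearly different values hint that the variation in Y depends on X.
    """
    pairs = sorted(zip(xs, residuals(xs, ys)))
    half = len(pairs) // 2
    low = pstdev([r for _, r in pairs[:half]])
    high = pstdev([r for _, r in pairs[half:]])
    return low, high


# Hypothetical data: Y is roughly 2 * X, with one unusual reading at X = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 40, 12, 14, 16, 18, 20]
print(flag_outliers(xs, ys))
```

A single extreme point also pulls the fitted line toward itself, so a flagged point is a prompt to go back to the plot and the underlying data, not proof that the reading is wrong.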
In summary, scatter plots are a valuable tool used by many statisticians. Their main value lies in revealing correlations between different series of data. Six Sigma methodology uses scatter plots in many different processes.