What Is It?
A Scatter Plot is a common plot type that visualizes each row in your dataset as a single point in a 2D or 3D plot. It is most helpful for viewing how numerical features relate to each other.
Why Is This Important?
By visualizing relationships between numerical features, you can start to understand trends and patterns in your data. Scatter Plots can help you see things like how the number of days in a trial and the number of different drug regimens affect clinical trial outcomes, which can help researchers reduce the cost of producing new treatments. A shipping data analyst could also identify how cost and distance traveled affect delays, allowing them to improve on-time performance. By adding even more dimensions like Color, Size, and Shape to your Scatter Plots, you can recognize very complex relationships in your data.
How?
Steps for creating a Scatter Plot:
- Drag one feature from the Features list to the X dimension and one feature to the Y dimension.
- Click Apply to create a 2D Scatter Plot, the most basic option.
- You could also add another feature to the Z dimension to create a 3D Scatter Plot.
- Adding a feature to Color helps trends in the visualization stand out. The Color dimension is usually best suited to the feature you are most interested in.
- Additional dimensions that work well with Scatter Plots are Size, Shape, Group By, Playback, Halo, and Pulsation. Each of these dimensions has its own strengths; visit their respective documentation pages to learn more.
- If you are interested in visualizing categorical features, consider using a Histogram or Violin Plot.
Note: For very large datasets (in the millions of points), you may want to change the Shape Options to Point Cloud to improve performance.
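For readers who think in code, the mapping described in the steps above can be sketched in Python. This is only an illustration of the idea that each row becomes one point and each dimension (X, Y, Z, Color) reads from one numerical feature; it is not how the application itself is implemented. The `Point` class, the `rows_to_points` and `plot_points` helpers, the feature names, and the sample shipping rows are all hypothetical, and the plotting helper assumes matplotlib is installed.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass
class Point:
    """One dataset row, expressed as plot dimensions (hypothetical structure)."""
    x: float
    y: float
    z: Optional[float] = None      # set only when a Z feature is chosen (3D)
    color: Optional[float] = None  # set only when a Color feature is chosen


def rows_to_points(
    rows: Sequence[Mapping[str, float]],
    x: str,
    y: str,
    z: Optional[str] = None,
    color: Optional[str] = None,
) -> list:
    """Turn each row into exactly one Point by reading the chosen features."""
    return [
        Point(
            x=float(row[x]),
            y=float(row[y]),
            z=float(row[z]) if z is not None else None,
            color=float(row[color]) if color is not None else None,
        )
        for row in rows
    ]


def plot_points(points: Sequence[Point], path: str = "scatter.png") -> None:
    """Draw the points with matplotlib (assumed installed) and save an image."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    is_3d = all(p.z is not None for p in points)
    has_color = all(p.color is not None for p in points)

    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d" if is_3d else None)

    xs = [p.x for p in points]
    ys = [p.y for p in points]
    colors = [p.color for p in points] if has_color else None

    if is_3d:
        ax.scatter(xs, ys, [p.z for p in points], c=colors)
    else:
        ax.scatter(xs, ys, c=colors)
    fig.savefig(path)
    plt.close(fig)


# Made-up sample rows, loosely based on the shipping example above.
rows = [
    {"distance_km": 300.0, "cost": 120.0, "delay_hours": 1.5},
    {"distance_km": 850.0, "cost": 310.0, "delay_hours": 4.0},
    {"distance_km": 120.0, "cost": 60.0, "delay_hours": 0.0},
]

# X = distance, Y = cost, Color = delay: a 2D Scatter Plot with a Color dimension.
points = rows_to_points(rows, x="distance_km", y="cost", color="delay_hours")
```

Calling `plot_points(points)` would write the 2D plot to `scatter.png`. Passing a `z` feature name to `rows_to_points` makes every point carry a Z value, in which case the same helper draws a 3D plot instead.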