Pathfinder's Uncover Outliers feature lets you discover data points that differ markedly from the rest of your dataset.
To run it with the default settings, click the Uncover Outliers button. To adjust how outliers are identified, click View Advanced Options first:
- Adjust Outlier Threshold - Default is 1%. To identify outliers, Virtualitics Explore runs the Principal Component Analysis (PCA) AI routine to generate three principal components from the input columns. It then measures the distance of each point from the centroid of that three-dimensional PCA space and flags the N% of points farthest from the centroid as outliers. The threshold percentage selected here sets N.
- Columns to Consider - Default is all numerical columns. Here, you can remove any columns that you do not want considered for outlier detection.
Once you’ve adjusted any of these advanced options, click Uncover Outliers.
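The thresholding described under Adjust Outlier Threshold can be illustrated with a short Python sketch. This is not Virtualitics Explore's implementation; it is a minimal NumPy approximation of the same idea (project onto three principal components, measure distance from the centroid, flag the top N%). The function name `flag_outliers`, its parameters, and the column standardization step are assumptions made for illustration only.

```python
import numpy as np

def flag_outliers(X, threshold_pct=1.0, n_components=3):
    """Flag the top threshold_pct% of rows by distance from the PCA centroid.

    Hypothetical helper for illustration; not part of any Virtualitics API.
    """
    X = np.asarray(X, dtype=float)

    # Assumption: standardize each column so no single column dominates.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - X.mean(axis=0)) / std

    # PCA via SVD: the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T  # coordinates in PCA space

    # Euclidean distance of each point from the centroid of the PCA space.
    centroid = scores.mean(axis=0)
    dist = np.linalg.norm(scores - centroid, axis=1)

    # Keep the N% of points with the largest distance (at least one point).
    n_out = max(1, int(round(len(X) * threshold_pct / 100.0)))
    mask = np.zeros(len(X), dtype=bool)
    mask[np.argsort(dist)[-n_out:]] = True
    return mask

# Synthetic example: 500 rows, 6 numerical columns, 5 planted extreme rows.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 6))
data[:5] += 12.0
mask = flag_outliers(data, threshold_pct=1.0)
print(np.flatnonzero(mask))  # indices of the rows flagged as outliers
```

With a 1% threshold on 500 rows, exactly five rows are flagged. Raising `threshold_pct` flags more points, which mirrors the effect of increasing the Outlier Threshold in the advanced options.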
Pathfinder then populates with insights about the identified outliers, including Anomaly Result Overview and Key Drivers, which are generated automatically by the Anomaly Detection, Explainable AI (XAI), and Smart Mapping AI routines. A plot is also created automatically to visualize these results.
Results
Anomaly Result Overview
Anomaly Result Overview shows the number of outliers out of the total number of reviewed data points and the Threshold percentage selected. Also included in this section are Outlier Characteristics and Non-Outlier Characteristics.
- Outlier Characteristics highlight the key columns that separate the outliers from the rest of the data points. Click the three dots to view additional details and click the Outlier Characteristics box to highlight these points on the Plot.
- Non-Outlier Characteristics highlight the key columns that connect non-outliers. Click the three dots to view additional details and click the Non-Outlier Characteristics box to highlight these points on the Plot.
Key Drivers
Key Drivers identifies which columns had the biggest impact on identifying outliers within the data. These columns are automatically sorted by Most Impactful, and the level of impact is shown as a bar chart under each column name. You can change the sorting to see the Least Impactful by clicking the dropdown under Sort By and selecting that option.
Next Steps
To continue your guided exploration using Pathfinder, select one of the Next Steps. To reset your Pathfinder session instead, click the Menu icon in the top-right corner of the panel and click Reset.