Normalization

What Is It?

Normalization rescales a feature so that tightly clustered data points spread apart and extreme or skewed points are pulled in, which makes visualizations clearer and easier to interpret.


Why Is This Important?

Normalization transformations are important for creating visualizations that are clear and that let consumers read data quickly. For example, if your data contains a few outliers, the significant trends in the bulk of the data might be obscured until you apply a normalization. Outliers of this kind are often caused by data quality errors. Normalization is also very helpful when data is expected to be heavily skewed (such as household income), because it lets the user obtain insights from data that would otherwise be difficult to interpret.
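
To make the outlier effect concrete, here is a minimal sketch in Python. The income figures and the axis_positions helper are hypothetical and exist only for illustration; the sketch assumes a plot axis that scales values linearly between the minimum and maximum.

```python
import math

def axis_positions(values):
    """Min-max scale values to [0, 1], as positions along a plot axis."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical household incomes: four typical values and one extreme outlier.
incomes = [30_000, 45_000, 60_000, 80_000, 5_000_000]

linear = axis_positions(incomes)
logged = axis_positions([math.log10(v) for v in incomes])

# Share of the axis occupied by the four typical values.
bulk_linear = linear[3] - linear[0]
bulk_logged = logged[3] - logged[0]

print(f"linear axis: {bulk_linear:.1%} of the axis holds the typical values")
print(f"log10 axis:  {bulk_logged:.1%} of the axis holds the typical values")
```

On the linear axis the four typical incomes are squeezed into about 1% of the axis, while after a log10 transformation they span about 19% of it, so differences between them become visible.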


How?

Applying Normalization can be done in 4 ways:

  1. Click on the Normalization button in the toolbar, which allows you to apply the same Normalization method to all three axes at once.
  2. From the right-click Contextual Menu, you can select Axis Normalization. This will also apply the same method to all three axes at once.
  3. Click on the desired Dimension Icon in the Mapping panel. Select a Normalization from the “Norm” drop-down options, which will apply the selected method only to the selected feature.
  4. Create a new normalized feature by right-clicking a numerical feature and selecting an option from the Normalize section.
    • Note that options 1-3 are visual normalizations only (data values are not being changed). Option 4 creates a new column in the dataset with normalized values.



There are 4 types of normalization that you can apply:

  • Log10 – A common normalization type, log transformation changes the feature from linear to logarithmic. Cannot be used on features that contain a value of 0 or less.
  • IHST – Inverse hyperbolic sine transformation. Similar to Softmax, but can provide slightly more space between points for certain features.
  • Softmax – Useful for normalizing variables that contain values of 0 or less, making this a robust choice.
  • CDF – Cumulative distribution function. Another common normalization type; it maps each value to its cumulative probability, so results fall between 0 and 1.
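
The four types above can be sketched in Python using only the standard library. These are common textbook definitions, not necessarily the exact formulas the application uses: in particular, treating Softmax as a logistic function of the z-score and CDF as an empirical (rank-based) CDF are assumptions, and all function names are hypothetical.

```python
import bisect
import math
import statistics

def log10_norm(values):
    """Base-10 logarithm; defined only for values greater than 0."""
    if any(v <= 0 for v in values):
        raise ValueError("Log10 requires every value to be greater than 0")
    return [math.log10(v) for v in values]

def ihst_norm(values):
    """Inverse hyperbolic sine; also defined for 0 and negative values."""
    return [math.asinh(v) for v in values]

def softmax_norm(values):
    """Assumed form: logistic function of the z-score, giving values in (0, 1)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values identical: place every point at the midpoint.
        return [0.5 for _ in values]
    return [1.0 / (1.0 + math.exp(-(v - mean) / std)) for v in values]

def cdf_norm(values):
    """Assumed form: empirical CDF, the fraction of values <= each value."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect.bisect_right(ordered, v) / n for v in values]

# Hypothetical feature with a zero and a negative value.
feature = [-5.0, 0.0, 2.0, 40.0, 1000.0]
print(ihst_norm(feature))
print(softmax_norm(feature))
print(cdf_norm(feature))
```

Calling log10_norm(feature) on this sample raises a ValueError because the feature contains 0 and a negative value, which mirrors the restriction noted for Log10, while the other three functions accept it.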
