What Is It?
PageRank is a network analysis algorithm used to provide a ranking of nodes in terms of importance or influence across a network graph.
Why Is This Important?
PageRank shows which nodes in a network are the most popular. Consider the Airport Traffic sample dataset: PageRank ranks every node in the network by how strongly it is connected to the other nodes, so you can immediately find the biggest hubs across North America.
The output of the PageRank algorithm is a numeric value between 0 and 1 for each node. The algorithm adds a column to the existing dataset, and that column can be mapped to any dimension you choose.
Steps for running PageRank on a Network Graph:
- Right-click anywhere in the plot (not on a node) and hover over Network Analysis Tools, then select PageRank.
- The PageRank will be calculated and automatically applied to the Color dimension.
PageRank is an algorithm that was originally developed by the founders of Google as a way of ranking web pages in terms of importance and influence across the internet.
The PageRank algorithm works iteratively. Initially, all nodes in the network are assigned an equal amount of PageRank. In each iteration, each node distributes its current PageRank equally among its neighbors (if edges are weighted, a larger share goes to neighbors connected by edges with larger weights). After several iterations, this yields a metric in which nodes with high PageRank are generally connected to other nodes with high PageRank.
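The iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tool's actual implementation: the function name `pagerank`, the `edges` dictionary layout, the airport codes, and the edge weights are all hypothetical. It also assumes the damping factor (0.85) used in the commonly published form of PageRank, which the description above does not mention, and it assumes that a node with no outgoing edges spreads its PageRank over all nodes.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank sketch.

    edges maps each node to a dict of {neighbor: weight} for its outgoing edges.
    damping=0.85 is an assumption (the conventional default), not a documented setting.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    n = len(nodes)

    # Every node starts with an equal share of the total PageRank.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Baseline share each node receives regardless of its links.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = edges.get(node, {})
            total_weight = sum(targets.values())
            if total_weight == 0:
                # Assumption: a node with no outgoing edges shares with everyone.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
            else:
                # Share PageRank with neighbors in proportion to edge weight.
                for neighbor, weight in targets.items():
                    new_rank[neighbor] += damping * rank[node] * weight / total_weight
        rank = new_rank
    return rank


# Hypothetical toy network: one hub connected to three smaller airports.
edges = {
    "ATL": {"JFK": 1, "ORD": 1, "LAX": 1},
    "JFK": {"ATL": 1},
    "ORD": {"ATL": 1},
    "LAX": {"ATL": 1},
}
scores = pagerank(edges)
top_node = max(scores, key=scores.get)
```

In this toy network the scores sum to 1, so each one falls between 0 and 1, and the hub ends up with the largest score because every other node passes its PageRank to it.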
Network analytics metrics tend to follow a power-law distribution: a few nodes have very high values while most nodes have low ones. This is why we show color with a gradient and automatically normalize the Color dimension to get a good spread of colors.
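To illustrate why normalization helps with skewed values, here is a minimal sketch of one possible approach, rank-based normalization. The method the tool actually uses is not described here, so this is an assumption; the function name `rank_normalize` and the sample scores are hypothetical.

```python
def rank_normalize(scores):
    """Map each score to its relative rank in [0, 1]; tied scores share a position."""
    distinct = sorted(set(scores.values()))
    if len(distinct) == 1:
        # All values are identical, so there is nothing to spread out.
        return {node: 0.0 for node in scores}
    position = {value: i / (len(distinct) - 1) for i, value in enumerate(distinct)}
    return {node: position[value] for node, value in scores.items()}


# Hypothetical, heavily skewed scores: one hub and several small nodes.
raw = {"hub": 0.70, "a": 0.12, "b": 0.09, "c": 0.06, "d": 0.03}
colors = rank_normalize(raw)
```

If the raw values were mapped to a gradient directly, the four small nodes would all land near one end of it. After rank normalization they are spread evenly across the gradient, which makes the differences between them visible.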