What Is It?
Graph Distance in Explore runs three algorithms that calculate centrality metrics for a network. These metrics rank the importance of each node based on its position in the network.
Why Is This Important?
Centrality metrics solve some of the most important problems in network analysis. Some examples include:
- identifying the most influential person(s) in a social network
- identifying key infrastructure nodes in the Internet
- identifying superspreaders of a virus
Consider this example involving the Spotify Artists sample dataset. High Betweenness Centrality quickly identifies a few artists, such as Drake, Rihanna, and Red Hot Chili Peppers, that are widely listened to across many different genres.
Graph Distance generates three network centrality metrics, all based on the shortest paths between the nodes in the network:
- Betweenness Centrality - a metric that defines how often a node is found on a shortest path between two other nodes
- Closeness Centrality - a metric that defines how close a node is to other nodes
- Eccentricity - a measure that defines how far or “eccentric” a node is from other nodes
Steps for running Graph Distance on a Network Graph:
- Right-click anywhere in the plot (not on a node) and hover over Network Analysis Tools, then select Graph Distance.
- The Betweenness Centrality, Closeness Centrality and Eccentricity will all be calculated.
- The Betweenness Centrality will automatically be applied to the Color dimension.
- The Closeness Centrality and Eccentricity will be added to the Features panel under “Graphs”.
Closeness Centrality is a metric computed for each node that is based on the average shortest-path length from the given node to all other nodes: the shorter the average path, the higher the score. The value will range between 0 and 1. A value of 1 implies that this node was the closest (on average) to all other nodes along paths in the network. Nodes with high Closeness Centrality will typically be found near the center of the network visualization.
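As a rough illustration of this definition, the following Python sketch computes a closeness score on a small hypothetical five-node chain. It assumes one common normalization, (n - 1) divided by the sum of shortest-path distances, on an unweighted, connected, undirected network. Whether Explore uses exactly this formula is an assumption, and the names bfs_distances and closeness_centrality are illustrative only.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness_centrality(graph):
    """Assumed normalization: (n - 1) / sum of shortest-path distances."""
    n = len(graph)
    scores = {}
    for node in graph:
        total = sum(bfs_distances(graph, node).values())
        scores[node] = (n - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical chain: A - B - C - D - E
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

closeness = closeness_centrality(graph)
for node, score in closeness.items():
    print(node, round(score, 3))
```

In this chain the middle node C scores highest (about 0.667) and the end nodes A and E score lowest (0.4), which matches the intuition that central nodes have high Closeness Centrality.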
Betweenness Centrality is a metric computed for each node that represents the number of shortest paths between other nodes that pass through the given node. The value will range between 0 and 1. If a node has a Betweenness Centrality of 1, it means that the node lies along more shortest paths through the network than any other node. Nodes with high Betweenness Centrality frequently fall between communities and form bridges between major components of the network.
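The bridging behavior can be sketched in Python on a hypothetical network of two triangles joined by a single edge. The sketch uses Brandes' algorithm for unweighted, undirected graphs, then divides by the largest raw score so the top node gets 1. That scaling is an assumption inferred from the description here, not a confirmed detail of how Explore computes the value, and the function name is illustrative.

```python
from collections import deque


def betweenness_centrality(graph):
    """Raw betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical network: triangle A-B-C, triangle D-E-F, bridge C-D
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

raw = betweenness_centrality(graph)
peak = max(raw.values()) or 1.0          # avoid dividing by zero
scaled = {v: c / peak for v, c in raw.items()}
print(scaled)
```

Here the two bridge nodes C and D each lie on six shortest paths between other nodes and receive a scaled score of 1.0, while the remaining nodes receive 0.0.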
Eccentricity is a metric computed for each node that represents the largest of the shortest-path distances from that node to any other node. In a connected network with more than one node, the value is always 1 or greater. Nodes with large eccentricity values will be positioned along the periphery of the network visualization.
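A minimal Python sketch of this definition, assuming an unweighted, connected, undirected network and using a hypothetical five-node chain, takes the largest breadth-first-search distance from each node. The function names are illustrative and not part of Explore.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def eccentricity(graph):
    """Largest shortest-path distance from each node to any other node."""
    return {node: max(bfs_distances(graph, node).values()) for node in graph}


# Hypothetical chain: A - B - C - D - E
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

ecc = eccentricity(graph)
print(ecc)
```

The end nodes A and E have the largest eccentricity (4) and the middle node C has the smallest (2), consistent with high-eccentricity nodes sitting on the periphery.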