Networks are all around us. Social media platforms like Facebook, X (formerly known as Twitter), and Instagram are digital social networks. The entire internet (which comes from the words “interconnected network”) is a very large network of computing devices (computers, phones, etc.).
The roads you travel on form a vast network of intersections (the nodes) and roads (the edges). Key infrastructures such as electricity grids, water distribution systems, finance/trade policies, and air traffic control can all be visualized with a network graph. Visualizing these networks will drive understanding across the mass of data being collected in the 21st century.
Key Terms
Nodes represent objects of interest such as people, places or things, or they can go even further to represent abstract ideas, concepts and topics.
Edges represent the relationships between nodes. Anytime that an edge is created between two nodes, this indicates that the nodes are related or interact in some way.
Edge Weight is an attribute of an edge that signifies the strength of the relationship between two nodes.
Network Graphs are a way of visualizing the relationships that exist between nodes.
Network Visualization and Structure
Virtualitics Explore’s Network Visualization and Structure determines how Nodes and Edges are spatialized and grouped within a network visualization.
Network graphs are created using relationship data that designates how certain nodes interact with one another. The visualization and spatialization of these entities and their interactions with one another is dependent on the software you use to visualize it.
When a network dataset is loaded into Virtualitics Explore, we automate the computation of ‘structure’ (community detection, 3D spatial layout, etc.) These will be represented as a few different Network features:
- 3D Spatialization - Network (1) X/Y/Z
- Community Detection - Segments
- Edges / Weighting - Degree / Weighted Degree
Each of these topics is explained more in-depth below.
3D Spatialization
In most data visualization practices, we map a column of data to a Spatial Dimension. However, in a network dataset, there is only information about how the nodes are connected. As a result, all the positioning information is relative and layout (literally, where to position each node) must be computed based on the information about the network structure. There are many known algorithms for computing network layout (also called spatialization) and the truth is that there is no “right” answer; network spatialization is an art and a science.
ForceAtlas3D Algorithm
The ForceAtlas3D Algorithm was developed by Virtualitics and is a proprietary improvement on the ForceAtlas2 algorithm. ForceAtlas3D is a force-directed layout algorithm in which nodes apply a repulsive force on each other while edges serve as an attractive spring force between pairs of nodes. Furthermore, a gravitational force is applied to the overall system to maintain a reasonable scale. These 3 forces (repulsive, attractive, and gravitational) are computed and applied iteratively until the network reaches a stable configuration. ForceAtlas3D leverages GPU technology to get a substantial speedup in run-time.
When the algorithm has finished running, we generate 3 columns: Network X, Network Y, Network Z which are automatically mapped to the X, Y, and Z-axes so that the network can be rendered in 3D immediately.
Community Detection
Community Detection is an essential tool in analyzing networks; by default, we place Community Detection results (called Segments) on Color. Segments are crucial in that they partition the network into a human-interpretable number of elements. There are many ways to partition the network into segments, but some methods are more efficient than others.
Virtualitics uses a hybrid of Markov Clustering and Louvain Modularity to rapidly produce community detection results. Ultimately, this hybrid approach ensures stable, reasonable, and quick results that can be used to enable the rest of the analytical tools in Virtualitics Explore.
Louvain Modularity
Modularity is a metric (between -1 and 1) that measures how well a network has been partitioned into modules or communities. The Louvain Modularity Algorithm is a community detection algorithm that aims to iteratively increase modularity. The algorithm works by initially placing each node in a unique community and then iteratively moving nodes from their current community to the neighboring community that would create the biggest increase in modularity. This algorithm is dependent on the structure of the network, and not based on the layout of the nodes in the visualization.
This algorithm is a hierarchal partitioning algorithm; as a result, the initial steps can be slow in reducing the total number of communities but is quick in the final steps for determining the final labels.
Markov Clustering
The Markov Clustering Algorithm leverages linear algebra operations to identify clusters/communities of nodes that have a heavy level of interconnectivity. This algorithm uses a weighted adjacency matrix representation of the network data and alternatively multiplies the matrix with itself and subsequently prunes weaker edges from the matrix. After several iterations, the community structures emerge.
This algorithm quickly reduces the size of the network (in terms of the number of communities) but can take a long time to stabilize on final community labels (each iteration has diminishing returns).
Virtualitics Hybrid Community Detection
The Virtualitics Community Detection algorithm uses the algorithms for what they are most effective at doing. Specifically, we use the Markov Clustering algorithm to rapidly limit the size of the network: we compute the initial community labels based on the results from the first few iterations of the Markov Clustering algorithm. Then we pass the initial labels to our Louvain Modularity algorithm to handle the fine-tuning of the final community detection results, which are stored in the Segments column.
The minimum number of segments is 1 and the maximum number of segments is the same as the total number of nodes in the network.
Edges/Weighting
Networks typically have significantly more edges than nodes; this makes sense given that a node can have several edges connected to it.
In fact, it is common to see 10-100 times more edges than nodes. As a result, edges can sometimes block or obstruct the view of nodes, which are crucial to the analysis. This issue is called occlusion and is addressed in Virtualitics Explore by our Edge Sampling options. Virtualitics Explore uses a heuristic model to show a subset of edges while still providing users with a view of the overall network structure.
Next Article |