What Is It?
Network Graphs (sometimes referred to as networks or graphs) are a way of depicting entities and the relationships that exist between them.
Why Is This Important?
Networks are all around us. Social media platforms like Facebook, Twitter, and Instagram are digital social networks. The internet (a name derived from “interconnected network”) is a very large network of computing devices such as computers and phones. The roads you travel on form a vast network of intersections (the nodes) and roads (the edges). Key infrastructure such as electricity grids, water distribution systems, and air traffic control, as well as finance and trade policies, can all be visualized with a network graph. Visualizing these networks makes it easier to understand the large volumes of connected data being collected today.
Nodes represent things: anything from tangible people, places, and objects to completely abstract ideas, concepts, and topics. Edges represent the relationships between nodes. An edge between two nodes indicates that the nodes are related or interact in some way. Edge weight is an attribute of an edge that signifies the strength of the relationship between the two nodes it connects.
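To make these terms concrete, here is a minimal Python sketch of nodes, edges, and edge weights. The `Edge` class and the example names are hypothetical and purely illustrative; they are not part of Explore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A relationship between two nodes (hypothetical helper, not an Explore type)."""
    source: str
    target: str
    weight: float = 1.0  # strength of the relationship; 1.0 when no weight is given


# Made-up example data: three people and how strongly they interact.
edges = [
    Edge("Alice", "Bob", weight=3.0),
    Edge("Bob", "Carol", weight=1.0),
    Edge("Alice", "Carol"),  # no weight supplied, so it defaults to 1.0
]

# The node set is simply every name that appears at either end of an edge.
nodes = sorted({name for edge in edges for name in (edge.source, edge.target)})
```

Here the edge between Alice and Bob carries the highest weight, so it would be read as the strongest relationship of the three.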
In the image above, we can visualize airport traffic in the US where nodes represent individual airports and edges represent possible flight routes. You will notice that a few points are much larger than the others:
- Hartsfield-Jackson Atlanta International Airport
- Chicago O'Hare International Airport
- Denver International Airport
These points are much larger because they represent major airports that function as key flight hubs. These airports have many flight routes and therefore a higher number of edges in the network. Nodes with more edges tend to have a higher Weighted Degree, which is represented by the size of the data points.
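The idea behind Weighted Degree can be sketched in a few lines of Python. This is a simplified illustration under the assumption that Weighted Degree is the sum of the weights of all edges touching a node; the routes and weights below are made up and do not come from the airport dataset.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum the weights of every edge touching each node (undirected network)."""
    totals = defaultdict(float)
    for source, target, weight in edges:
        totals[source] += weight
        totals[target] += weight
    return dict(totals)


# Hypothetical routes: (airport, airport, illustrative weight).
routes = [
    ("ATL", "ORD", 5.0),
    ("ATL", "DEN", 4.0),
    ("ATL", "BNA", 1.0),
    ("ORD", "DEN", 3.0),
]

degrees = weighted_degree(routes)
# The node with the largest total would be drawn as the largest point.
hub = max(degrees, key=degrees.get)
```

In this toy example `hub` is `"ATL"`, because it has the most routes and the heaviest ones.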
Steps to load a Network Graph into Explore:
- Using the “Local” button on the left, navigate to the file you want to load.
- After the data loads, a data preview pane appears. Click Load at the bottom of the screen.
- Depending on the network data format, a dataset preview may not be shown.
- A spreadsheet showing your data will be displayed. Click Show at the bottom of the screen.
- Your network data will automatically be displayed in 3D. You will find a few key network features on the axes in the Mapping panel:
- X - Network X
- Y - Network Y
- Z - Network Z
- These three features provide the spatial dimensions for our network to be visualized in 3D
- Color - Louvain Community
- The Louvain Community represents groupings of nodes based on the density of edges between them.
- Size - Weighted Degree
- Represents the number of edges connected to a node, factoring in any edge weights in the input data.
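To build some intuition for what "groupings based on the density of edges" means, the sketch below scores a grouping with modularity, the quantity that the Louvain method is generally described as optimizing. Whether Explore computes its communities in exactly this way is an assumption; the function and the toy network are illustrative only.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Score a grouping: higher means denser connections inside groups."""
    total_weight = sum(weight for _, _, weight in edges)
    inside = defaultdict(float)  # edge weight that stays within each community
    degree = defaultdict(float)  # summed weighted degree of each community
    for u, v, weight in edges:
        degree[community_of[u]] += weight
        degree[community_of[v]] += weight
        if community_of[u] == community_of[v]:
            inside[community_of[u]] += weight
    return sum(
        inside[c] / total_weight - (degree[c] / (2 * total_weight)) ** 2
        for c in degree
    )


# Toy network: two triangles joined by a single bridge edge (c-d).
toy_edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

score_two = modularity(toy_edges, two_groups)
score_one = modularity(toy_edges, one_group)
```

Splitting the toy network into its two triangles scores higher than lumping every node together, which matches the intuition that each triangle is a densely connected community.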
What Should I Do Next?
- Analyze the communities within your network. What trends do you find across the Louvain Communities?
- To better understand what characteristics make up each community, use Explainable AI to generate some explanations.
- Analyze the nodes in your network with the highest Weighted Degree. Do they agree with your understanding of the data?
- Try exploring some Network Analysis Tools or Network Insights to see what else you can learn from your network!
Supported Data Formats
JSON File (.json)
A JSON file is the most powerful of these options. It contains data stored as attribute–value pairs and arrays. Importantly, JSON files enable analysis of node attributes. Explore automatically recognizes this data format when the file is loaded, as long as the file is structured in this way:
Edge List (.csv)
Edge lists are a common data format for simple networks without node attributes. They can be loaded from a tabular format (such as a .csv) by specifying “Edge List” in the Data Format drop-down, as long as the data contains only a source column, a target column, and an optional weight column, as shown below.
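If you need to produce an edge list yourself, a short Python sketch like the following writes one as CSV text. The header names `source`, `target`, and `weight` mirror the columns described above, but the exact header text Explore expects is an assumption, and the rows are made up.

```python
import csv
import io

# Hypothetical edges: (source, target, weight). The weight column is optional.
rows = [
    ("ATL", "ORD", 5),
    ("ATL", "DEN", 4),
    ("ORD", "DEN", 3),
]

# Write to an in-memory buffer; swap in open("edges.csv", "w", newline="")
# to write a real file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["source", "target", "weight"])  # assumed header names
writer.writerows(rows)

csv_text = buffer.getvalue()
```

Each line after the header describes exactly one edge, which is why this format cannot carry node attributes.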
Make sure that the Edge List option is selected for Data Format so that this data is loaded in properly:
Adjacency Matrix (.csv)
An adjacency matrix can be loaded from any kind of tabular data file (such as .csv or Excel). It must follow the format below, where the columns represent the nodes and each row represents one node's interactions with the other nodes in the network. Adjacency matrices can be weighted or unweighted.
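To see how an adjacency matrix relates to a list of edges, here is a minimal Python sketch that builds one. It assumes an undirected network, so the matrix is symmetric; the helper name and the data are hypothetical, and the exact file layout Explore expects (for example, how row labels are stored) is not covered here.

```python
def adjacency_matrix(edges):
    """Build a symmetric weighted adjacency matrix from (source, target, weight) edges."""
    nodes = sorted({name for u, v, _ in edges for name in (u, v)})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for u, v, weight in edges:
        # Undirected assumption: record the weight in both directions.
        matrix[index[u]][index[v]] = weight
        matrix[index[v]][index[u]] = weight
    return nodes, matrix


# Made-up example: A-B with weight 2, B-C with weight 1, no A-C edge.
labels, matrix = adjacency_matrix([("A", "B", 2.0), ("B", "C", 1.0)])
```

A zero entry means the two nodes are not connected. For an unweighted matrix, every connected pair would simply hold a 1.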
To load in this network data format properly, select the adjacency matrix in the dataset preview screen: