Suppose you have a sales log dataset for your pet store business, stored in a familiar tabular format, that contains the username, university, pet type, and spending amount (in USD) for each sales transaction. This data can be represented as a 3-partite network graph in which usernames are blue nodes, universities are orange nodes, and pet types are green nodes.
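As a rough illustration of that representation, the sketch below builds a 3-partite adjacency map from a handful of rows using plain Python. The rows, the column names (`username`, `university`, `pet_type`, `spend_usd`), and the helper `build_tripartite_graph` are all hypothetical, and the sketch assumes edges connect each username to the university and pet type found in the same row.

```python
from collections import defaultdict

# Hypothetical sales log rows; every value here is invented for illustration.
sales_log = [
    {"username": "alice", "university": "UCLA", "pet_type": "cat", "spend_usd": 40.0},
    {"username": "bob", "university": "UCLA", "pet_type": "dog", "spend_usd": 25.0},
    {"username": "carol", "university": "MIT", "pet_type": "cat", "spend_usd": 60.0},
]


def build_tripartite_graph(rows):
    """Link each username to the university and pet type in the same row.

    Nodes are (column, value) pairs so that values from different columns
    never collide. No edge joins two nodes from the same column, which is
    what makes the graph 3-partite.
    """
    adjacency = defaultdict(set)
    for row in rows:
        user = ("username", row["username"])
        for column in ("university", "pet_type"):
            other = (column, row[column])
            adjacency[user].add(other)
            adjacency[other].add(user)
    return dict(adjacency)


graph = build_tripartite_graph(sales_log)
```

Here `graph[("university", "UCLA")]` holds the username nodes for alice and bob, since both appear in rows with that university.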
Understanding customer segments based on transactional trends has business utility (marketing, promotions, distribution, etc.). One way to find those segments is to generate a network graph of the customer nodes. As we have seen in the previous sections, N-partite projection can be used to generate the desired network. The final figure shows how a network graph can be extracted from this very standard-looking sales dataset.
VIP implements the concepts behind N-partite projection; the following section maps the abstract idea of N-partite projection to the Network Extractor tool in VIP.
Network Extractor in VIP
After loading a standard tabular dataset from a file, from a database connection, or through the Virtualitics API, users can now leverage the Network Extractor to compute an N-partite projection of the data and, ultimately, visualize a network graph representation of the dataset in VIP.
The Network Extractor in VIP requires users to specify which column of data to treat as the source of nodes; we refer to this column as the “node column.” Users also select a set of categorical columns from the dataset to use as “associative columns.” Each associative column supplies another set of values with which the node set may form a bipartite graph: when a node and an associative value occur in the same row of data, an edge exists between the two. At least one associative column must be provided.
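To make the roles of the node column and the associative columns concrete, here is a minimal sketch of a projection in plain Python. It is not VIP's implementation or its API: the function `extract_network`, the sample rows, and the choice to weight an edge by the number of associative values two nodes share are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def extract_network(rows, node_column, associative_columns):
    """Project rows onto the node column (hypothetical helper, not VIP's API).

    Two nodes are connected when they share at least one value in an
    associative column. The edge weight counts the shared values, which is
    an assumed weighting scheme.
    """
    if not associative_columns:
        raise ValueError("at least one associative column must be provided")

    # Bipartite step: group node values under each (column, value) pair.
    members = defaultdict(set)
    for row in rows:
        for column in associative_columns:
            members[(column, row[column])].add(row[node_column])

    # Projection step: connect every pair of nodes that share a value.
    weights = defaultdict(int)
    for nodes in members.values():
        for a, b in combinations(sorted(nodes), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical rows; every value here is invented for illustration.
rows = [
    {"username": "alice", "university": "UCLA", "pet_type": "cat"},
    {"username": "bob", "university": "UCLA", "pet_type": "dog"},
    {"username": "carol", "university": "MIT", "pet_type": "cat"},
    {"username": "dave", "university": "UCLA", "pet_type": "dog"},
]

edges = extract_network(rows, "username", ["university", "pet_type"])
```

In this toy data, bob and dave share both a university and a pet type, so their edge carries the largest weight, while alice and carol are linked only through pet type.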
The Network Extractor can also be accessed using the Python API, as detailed in the notebook at the bottom of our Example Notebooks page.