In [1]:
from virtualitics import api
import pandas as pd
In [2]:
data = pd.read_csv('../data/eCommerce.csv')

Exploring eCommerce data

In this notebook, we analyze an eCommerce dataset. Each row describes one visitor to the website, who may or may not have spent money on the products it sells. Suppose we are a marketing agency tasked with identifying cohorts of users to incentivize to visit the site again.

In [3]:
vip = api.VIP()
Setting up WebSocket connection to: ws://localhost:12345/api
Connection Successful! Initializing session.
In [4]:
vip.load_data(data, "monthly_website_sales")
In [5]:
data.head(n=3)
Out[5]:
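|   | User ID | Location | Hobby | Language | Gender | Age | Married | Kids | Pets | First visit to site (months ago) | Last visit to site (months ago) | Device used | Page visits | Number of visits last month | Household Income (USD) | Time spent (seconds) | Amount spent (USD) | Number of products viewed | Frequency of visits last week | Sources |
|---|---------|----------|-------|----------|--------|-----|---------|------|------|----------------------------------|---------------------------------|-------------|-------------|-----------------------------|------------------------|----------------------|--------------------|---------------------------|-------------------------------|---------|
| 0 | 413613 | 1 | 1 | 3 | Female | 17 | No | Yes | Yes | 10 | 2 | Tablet | 7 | 219 | 38407 | 2205 | 686.966000 | 12 | 6 | App |
| 1 | 463272 | 1 | 5 | 2 | Female | 15 | Yes | Yes | Yes | 30 | 7 | Desktop | 15 | 120 | 51905 | 2949 | 2224.381362 | 32 | 30 | App |
| 2 | 437160 | 4 | 5 | 3 | Female | 48 | No | No | No | 27 | 3 | Tablet | 10 | 62 | 78967 | 443 | 353.346000 | 2 | 4 | App |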

Smart Mapping - key drivers of 'Amount Spent'

Let's use VIP's Smart Mapping routine to identify the key drivers of how much users spend on the website.

In [6]:
# Run Smart Mapping with the target set to "Amount spent (USD)".
# Rather than listing the features to include as input, we list the
# features to exclude.
vip.smart_mapping(data["Amount spent (USD)"], 
                  exclude=["User ID"])
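|   | SmartMapping Rank | Feature | Correlated Group |
|---|-------------------|---------|------------------|
| 0 | 1 | Sources | None |
| 1 | 2 | Married | None |
| 2 | 3 | Frequency of visits last week | None |
| 3 | 4 | Number of products viewed | None |
| 4 | 5 | Time spent (seconds) | None |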
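
As a rough sanity check on the ranking above, we can rank the numeric columns ourselves with plain pandas. The sketch below is not how Smart Mapping works internally (that is not described in this notebook); it only computes the absolute Pearson correlation of each numeric feature with the target. The helper name `rank_numeric_features` and the toy DataFrame are made up for illustration. Categorical columns such as Sources and Married, which top the Smart Mapping ranking, are skipped by this approach, so the two rankings should not be expected to agree.

```python
import pandas as pd


def rank_numeric_features(df, target, exclude=()):
    """Rank numeric columns by absolute Pearson correlation with `target`.

    Hypothetical helper for illustration only; assumes `target` is numeric.
    """
    numeric = df.select_dtypes(include="number")
    # Drop the target itself and any excluded columns (e.g. identifiers).
    candidates = numeric.drop(columns=[target, *exclude], errors="ignore")
    scores = candidates.corrwith(numeric[target]).abs()
    return scores.sort_values(ascending=False)


# Toy frame with made-up values, used only to show the call shape.
toy = pd.DataFrame({
    "User ID": [1, 2, 3, 4, 5],
    "Time spent (seconds)": [100, 200, 300, 400, 500],
    "Page visits": [3, 1, 4, 1, 5],
    "Amount spent (USD)": [10.0, 20.0, 30.0, 40.0, 50.0],
})

ranking = rank_numeric_features(toy, "Amount spent (USD)", exclude=["User ID"])
print(ranking)
```

On the real dataset, the equivalent call would be `rank_numeric_features(data, "Amount spent (USD)", exclude=["User ID"])`, mirroring the `exclude` argument passed to `vip.smart_mapping` above.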