This document is a guided walkthrough of the “Hello, World!” tutorial Flow. It is recommended that you open this Flow in Virtualitics Predict and progress through it as you follow this walkthrough.
Only a subset of the code used to build the Flow can be found in this walkthrough; the full script can be downloaded from the bottom of this page.
The data for this walkthrough can be downloaded from the bottom of this page.
Before starting this walkthrough, it is important to have some understanding of the class structure and hierarchy within the Virtualitics SDK. Please review our in-depth documentation on creating a Flow and the Virtualitics SDK.
Overall, it is important to keep in mind the following:
- Elements contain the content that will be displayed to users (e.g. Infographic, Scatterplot).
- Cards are card-style UI elements and contain one or more Elements.
- Sections contain Cards and Elements and can be used to display and separate groups with their own titles and descriptions.
- Pages must contain at least one Section and display everything a user will see at the conclusion of a Step.
- Steps orchestrate all of the backend and frontend tasks associated with a Page.
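The containment relationships in the list above can be pictured with a small stand-in model. The sketch below uses plain Python dataclasses rather than the Virtualitics SDK. Every class name prefixed with `Toy` is hypothetical, and the classes only mirror the hierarchy described above, not the SDK's actual constructors or behavior.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ToyElement:
    """Stand-in for content shown to the user (e.g. a plot or infographic)."""
    title: str


@dataclass
class ToyCard:
    """Stand-in for a card-style container holding one or more Elements."""
    title: str
    content: List[ToyElement]

    def __post_init__(self):
        if not self.content:
            raise ValueError("a Card holds one or more Elements")


@dataclass
class ToySection:
    """Stand-in for a titled group of Cards and Elements."""
    title: str
    content: List[Union[ToyCard, ToyElement]] = field(default_factory=list)


@dataclass
class ToyPage:
    """Stand-in for a Page, which needs at least one Section."""
    title: str
    sections: List[ToySection]

    def __post_init__(self):
        if not self.sections:
            raise ValueError("a Page must contain at least one Section")


@dataclass
class ToyStep:
    """Stand-in for a Step, which is tied to exactly one Page."""
    title: str
    page: ToyPage


def element_titles(step: ToyStep) -> List[str]:
    """Walk Step -> Page -> Sections -> Cards and collect Element titles."""
    titles = []
    for section in step.page.sections:
        for item in section.content:
            if isinstance(item, ToyCard):
                titles.extend(element.title for element in item.content)
            else:
                titles.append(item.title)
    return titles


card = ToyCard("Plots", [ToyElement("Scatterplot")])
section = ToySection("Results", [card, ToyElement("Infographic")])
step = ToyStep("Data Visualization", ToyPage("Visualize", [section]))
print(element_titles(step))
```

Reading the model from the bottom up (Step, Page, Section, Card, Element) matches the order in which a Flow script typically assembles a Page.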
Constructing the Flow
When creating Flow scripts, we recommend organizing your file in the following order:
- Imports
- Step Definitions
- Flow and Pages Creation
This sequential organization can be seen in the full script for this Flow provided as a download below. However, for the purpose of this walkthrough, we will go through each Step and its corresponding Page together as it will make for easier reading.
Imports
Every Flow begins by importing the key classes, which include Card, Section, Page, and Step. It is also important to import the various Elements to be used.
You will also need to import:
- Element and plot types from the Virtualitics SDK
- Packages you’ll use for data processing, such as NumPy and Pandas
# Import key flow classes
from predict_backend.flow.flow import Flow
from predict_backend.flow.step import Step, StepType
from predict_backend.page import Page, PageType, Section, Card
from predict_backend.store.store_interface import StoreInterface
Creating the Flow Object
Here, the actual Flow Object is created and assigned an image to be associated with it on the Virtualitics Predict Home page.
# Instantiate Flow and assign image
flow_image_link = "https://predict-tutorials.s3-us-gov-west-1.amazonaws.com/hello_world_tile.jpeg"
hello_world = Flow("Hello, World!", "A basic introduction to the design and function of Flows.", flow_image_link)
If set up successfully, this will show the Flow and the relevant tile image on the Virtualitics Predict Home page once the Flow has been uploaded.
Building the Flow
Steps are the primary building blocks of a Flow. They orchestrate all backend and frontend tasks performed on each page. We will now go through each of this Flow’s Steps and their corresponding Pages:
- Data Upload
- Data Query
- Data Visualization
- Additional Elements
- Saving Assets
1. Data Upload
First, an empty Section is created that is then placed into a Page with the title "Getting Data Into the Platform."
Next, the DataUpload Step is instantiated and applied to the Page.
# Build data upload step
data_upload_section = Section("", [])
data_upload_page = Page("Getting Data Into the Platform", PageType.INPUT, {}, [data_upload_section])
data_upload_step = DataUpload(
    title="Data Upload",
    description="Upload the S&P 500 data.",
    parent="Inputs",
    type=StepType.INPUT,
    page=data_upload_page
)
In the Step definition, a DataSource Element is added to a Card, and the Card is placed in the empty Section. The Page containing that Section is fetched using StoreInterface.
The Section is also given a title and subtitle within the Step.
# This step has the user upload the dataset we'll be using
class DataUpload(Step):
    def run(self, flow_metadata):
        # Get store_interface and then current page and section
        store_interface = StoreInterface(**flow_metadata)
        page = store_interface.get_page()
        section = page.get_section_by_title("")
        # Set section title
        upload_section_title = "Loading Data from Databases, Datastores, and Data Lakes"
        section.title = upload_section_title
        # Create DataUpload Card
        guide_link = "https://docs.virtualitics.com/hc/en-us/articles/24831125452307-Configure-a-Connection"
        data_link = "https://docs.virtualitics.com/hc/en-us/articles/24831066190995-Flow-Building-Guide-1-Hello-World"
        # Adjacent string literals are joined, so the long message can span two lines
        upload_subtitle1 = (
            "To learn more about how to load data from databases, datastores, and data lakes, "
            f"check out this link: {guide_link}"
        )
        upload_subtitle2 = f"You can find the relevant data for this tutorial here: {data_link}"
        upload_subtitle = upload_subtitle1 + "\n \n" + upload_subtitle2
        data_upload_card = Card(
            title="Data Source Title (Optional): Upload the stock ticker data!",
            content=[DataSource(title="S&P 500 Dataset", options=["csv"], description=upload_subtitle,
                                show_title=False)],
        )
        # Add card to section and update page
        page.add_card_to_section(data_upload_card, "")
        store_interface.update_page(page)
Altogether, this creates a Page which prompts the user to upload specified data in CSV format.
2. Data Query
Once again, a Section and Page are created first and then the defined Step is instantiated.
In this Step, Elements are created to collect user input. Since some of these inputs will be used to query the data uploaded in the previous step, the data needs to be accessed to determine the bounds to be set within the relevant input Elements.
This can be done by using the StoreInterface get_element_value() method and specifying the Step name and Element title of the DataSource Element to return a Pandas DataFrame.
# Fetch input data and extract query boundaries for stock name and date
data = store_interface.get_element_value(data_upload_step.name, "S&P 500 Dataset")
stock_names = sorted(data['Name'].unique())
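As a minimal illustration of deriving query bounds from a DataFrame, the sketch below applies the same `unique()` then `sorted()` pattern to a small made-up table. The sample rows are invented for illustration. Only the `Name` column comes from the snippet above; the `date` column name is an assumption.

```python
import pandas as pd

# Made-up rows standing in for the uploaded S&P 500 data
data = pd.DataFrame({
    "Name": ["MSFT", "AAPL", "MSFT", "GOOG"],
    "date": ["2017-01-04", "2017-01-03", "2017-01-03", "2017-01-05"],  # assumed column
})

# Unique stock names, sorted alphabetically, become the dropdown options
stock_names = sorted(data["Name"].unique())

# Date bounds, parsed to timestamps so min/max compare chronologically
dates = pd.to_datetime(data["date"])
min_date, max_date = dates.min(), dates.max()

print(stock_names)
print(min_date.date(), max_date.date())
```

Computing the bounds from the uploaded data, rather than hard-coding them, keeps the input Elements valid for whatever file the user provides.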
Now, data input Elements, such as a Dropdown selection, can be built using the bounds that have been calculated.
# Create dropdown card
stock_dropdown_title = "Dropdown Title (Optional): Which stock do you want to do an individual analysis on?"
stock_dropdown = Dropdown(
    options=stock_names,
    title=stock_dropdown_title,
    description="Description for dropdown (Optional)"
)
This creates a Dropdown selection, which prompts the user to choose the stock they want to analyze.
As this Step contains some data processing, the StoreInterface method update_progress() is used to keep the user updated on the progression of the Step as it runs.
# This step updates the user on the progress of the step as it processes
store_interface.update_progress(5, "Creating query selection options")
The percent completion and message used will be displayed at the bottom of the Page in real time.
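The pattern of reporting progress at intervals during a longer task can be sketched without the SDK. In the example below, `report` is a hypothetical callback standing in for `store_interface.update_progress`, assumed (as in the snippet above) to take a percentage and a message. The stage names and the loop itself are invented for illustration.

```python
from typing import Callable, List, Tuple


def process_in_stages(stages: List[str], report: Callable[[int, str], None]) -> None:
    """Run each named stage and report percent complete after it finishes."""
    total = len(stages)
    for index, stage in enumerate(stages, start=1):
        # ... the real work for this stage would happen here ...
        percent = round(100 * index / total)
        report(percent, stage)


# Collect the updates in a list instead of sending them to a UI
updates: List[Tuple[int, str]] = []
process_in_stages(
    ["Loading data", "Creating query selection options", "Building cards", "Updating page"],
    lambda percent, message: updates.append((percent, message)),
)
print(updates)
```

Passing the reporter in as a callback keeps the processing code testable on its own; inside a Step, the callback would simply forward to the StoreInterface.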
3. Data Visualization
In this Step, the query parameters previously input by the user are retrieved by, again, using the StoreInterface get_element_value() method.
# Fetch query parameters from previous step
stock_dropdown_title = "Dropdown Title (Optional): Which stock do you want to do an individual analysis on?"
stock_analyzing = store_interface.get_element_value(data_query_step.name, stock_dropdown_title)
The data can then be queried according to the user-specified parameters and used to create visualizations such as a ScatterPlot.
# Create scatter plot
scatter_plot = create_scatter_plot(
    data[data.ticker == stock_analyzing],
    "volume",
    "pct_change",
    color_by="change_type",
    plot_title=f"Scatter Plot: {stock_analyzing} Volume vs % Change in Daily Price"
)
The visualization will be displayed on the Page in real time.
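The filtering in the snippet above is ordinary pandas boolean indexing. The sketch below shows it on made-up rows. The way `pct_change` and `change_type` are derived here is an assumption for illustration, since this walkthrough does not show how the full script computes them, and the `close` column is hypothetical.

```python
import pandas as pd

# Made-up rows; the "close" column and all values are assumptions
data = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "AAPL", "MSFT", "MSFT"],
    "close": [100.0, 110.0, 99.0, 50.0, 55.0],
    "volume": [1000, 1500, 1200, 800, 900],
})

# Assumed derivation: percent change versus the previous row of the same ticker
previous_close = data.groupby("ticker")["close"].shift(1)
data["pct_change"] = (data["close"] - previous_close) / previous_close * 100


def label_change(pct: float) -> str:
    """Label a percent change for coloring the plot."""
    if pct > 0:
        return "increase"
    if pct < 0:
        return "decrease"
    return "flat"  # also reached for NaN; those rows are filtered out below


data["change_type"] = data["pct_change"].apply(label_change)

# Keep only the selected stock, skipping each ticker's first row (no prior price)
stock_analyzing = "AAPL"
selected = data[(data.ticker == stock_analyzing) & data["pct_change"].notna()]
print(selected[["volume", "pct_change", "change_type"]])
```

The resulting `selected` frame has the three columns that the scatter plot call above expects: one for each axis and one for the color grouping.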
Additionally, plots can be placed in a Dashboard. Dashboards are great for displaying Elements in a row-wise and column-wise manner.
# Create a dashboard and add our plot elements to it
plots_dashboard = Dashboard([Column([scatter_plot, line_plot, bar_plot])])
# Add the dashboard to card
plots_card = Card("Plots", [plots_dashboard])
4. Additional Elements
In this next Step, which can be viewed in detail in the downloadable script, two more robust capabilities in Virtualitics Predict are explored: Infographics and the ability to create powerful, multi-dimensional visualizations.
Infographics are an excellent Element for displaying KPIs, progress statuses, and other summarizing information.
Virtualitics Predict can also connect with Explore, Virtualitics' AI-powered data visualization and exploration tool. This allows users to render multi-dimensional visualizations and display them in a Page within a Flow. It also enables users to open an active Flow’s data in a project with which the user can immediately interact. This is accomplished using a CustomEvent.
5. Saving Assets
Assets, such as Datasets and Models, can be saved in a user’s Virtualitics Predict workspace to either be downloaded or accessed later on.
Saving an Asset is as simple as using the StoreInterface save_asset() method and specifying the Asset to be saved, including a label and name.
# Retrieve preprocessed dataframe from previous step and save as an asset
data = store_interface.get_input("Preprocessed Data")
dataset = Dataset(dataset=data, label="My Dataset", name="sp500 dataset")
store_interface.save_asset(dataset)
Assets can be found by clicking Assets in the left-side navigation bar in Virtualitics Predict.
Completing a Flow
The final step in constructing a Flow involves chaining together each Step in the Flow Object that was created at the start.
To do this, an ordered list of the Steps is passed to the Flow’s chain() method.
# Chain together steps of flow
hello_world.chain([
    data_upload_step,
    data_query_step,
    data_visualization_step,
    additional_elements_step,
    save_assets_step
])
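Conceptually, chaining fixes the order in which the Steps run. The toy classes below are hypothetical stand-ins, not the SDK's `Flow` and `Step`. They only illustrate that the ordered list passed to a `chain()` method determines the execution order.

```python
from typing import List


class ToyStep:
    """Hypothetical stand-in for a Step; running it just records its title."""

    def __init__(self, title: str):
        self.title = title

    def run(self, log: List[str]) -> None:
        log.append(self.title)


class ToyFlow:
    """Hypothetical stand-in for a Flow that runs its Steps in chained order."""

    def __init__(self, name: str):
        self.name = name
        self.steps: List[ToyStep] = []

    def chain(self, steps: List[ToyStep]) -> None:
        """Record the steps in the order given."""
        self.steps = list(steps)

    def run_all(self) -> List[str]:
        """Run every step in order and return the titles as they executed."""
        log: List[str] = []
        for step in self.steps:
            step.run(log)
        return log


flow = ToyFlow("Hello, World!")
flow.chain([ToyStep(title) for title in [
    "Data Upload", "Data Query", "Data Visualization",
    "Additional Elements", "Saving Assets",
]])
order = flow.run_all()
print(order)
```

Because later Steps read values produced by earlier ones (for example, the query Step reads the uploaded data), the list order matters.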
Uploading a Flow
After completing this final step, the Flow file can be uploaded to Virtualitics Predict using the Create Flow button at the top right of the Virtualitics Predict Home page.
You’ll be alerted if any errors are detected in your Flow. If no errors are detected, your Flow will appear on the Virtualitics Predict Home page.