This demo notebook shows how to explore stock data with VIP visualizations. It focuses on two areas: visualizing stock prices and fitting models for stock price prediction.
import time
from virtualitics import api
import pandas as pd
import numpy as np
import pandas_datareader.data as web
from datetime import datetime
import warnings

warnings.filterwarnings('ignore')
vip = api.VIP()
Setting up WebSocket connection to: ws://localhost:12345/api
Connection Successful! Initializing session.
Set the tickers and timeline of interest
tickers = ['XOM', 'WTI', 'AAPL', 'GOOG', 'F', 'INTC', 'MSFT', 'TSLA',
           'FB', 'WMT', 'AMZN', 'BP', 'HP', 'NYT', 'TYO', 'SPY']

# Get information for dates between start and end
start = datetime(2015, 1, 1)
end = datetime(2018, 12, 1)
df = pd.DataFrame()
for ticker in tickers:
    f = web.DataReader(ticker, 'iex', start, end)
    f['Series Name'] = ticker
    df = pd.concat([df, f])
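The loop above grows `df` with one `pd.concat` call per ticker. A common alternative is to collect the per-ticker frames in a list and concatenate once at the end. The sketch below illustrates that pattern on synthetic data so that it runs without network access. `make_fake_prices` is a hypothetical stand-in for `web.DataReader`, the ticker symbols are made up, and the lowercase `open`/`high`/`low`/`close`/`volume` column names are assumed from the way later cells use them.

```python
import numpy as np
import pandas as pd


def make_fake_prices(n_days=5, seed=0):
    """Hypothetical stand-in for web.DataReader: returns synthetic OHLCV rows."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2015-01-01", periods=n_days, freq="D")
    close = 100 + rng.normal(0, 1, n_days).cumsum()
    return pd.DataFrame(
        {
            "open": close - 0.5,
            "high": close + 1.0,
            "low": close - 1.0,
            "close": close,
            "volume": rng.integers(1_000, 5_000, n_days),
        },
        index=dates,
    )


demo_tickers = ["AAA", "BBB", "CCC"]  # made-up symbols, not real tickers

frames = []
for i, ticker in enumerate(demo_tickers):
    f = make_fake_prices(seed=i)
    f["Series Name"] = ticker  # tag each row with its ticker, as in the loop above
    frames.append(f)

# A single concat keeps the rows grouped by ticker, in list order
demo_df = pd.concat(frames)
```

The result has the same layout as `df`: a date index that repeats once per ticker, with `Series Name` identifying which ticker each row belongs to.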
The following cells engineer new features from the basic columns returned by the pandas data reader. These include the average price, returns, price changes, and moving averages, computed over different time intervals for easier comparison.
# Calculate average price based on open, close, high, low
cols = df.loc[:, "open":"close"]
df['avg_price'] = cols.mean(axis=1)
df['date'] = df.index

# Compute amount sold per day
df['amount_sold'] = df['avg_price'] * df['volume']
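Note that `df.loc[:, "open":"close"]` is a label slice: it includes both endpoints and every column positioned between them, so the result depends on column order. The sketch below uses a made-up two-row frame to show this, and compares it with naming the four price columns explicitly. The column order `open, high, low, close, volume` is an assumption about what the data reader returns.

```python
import pandas as pd

# Made-up prices; the column order is an assumption about the reader's output
toy = pd.DataFrame(
    {
        "open": [10.0, 20.0],
        "high": [12.0, 24.0],
        "low": [8.0, 18.0],
        "close": [10.0, 22.0],
        "volume": [100, 200],
    }
)

# Label slices include both endpoints and everything between them
sliced = toy.loc[:, "open":"close"]
print(list(sliced.columns))

# Listing the columns explicitly does not depend on column order
price_cols = ["open", "high", "low", "close"]
toy["avg_price"] = toy[price_cols].mean(axis=1)

# Average price times volume, as in the cell above
toy["amount_sold"] = toy["avg_price"] * toy["volume"]
print(toy[["avg_price", "amount_sold"]])
```

If the columns arrived in a different order, the slice could silently pick up `volume` or drop a price column, whereas the explicit list would either give the same result or raise a `KeyError`.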
# We can also look at the change per day
df['price_change_per_day'] = df['close'] - df['open']

# Make sure we don't use open and close values from different companies at the boundaries
mask = df['Series Name'] != df['Series Name'].shift(1)
# Assign through df.loc to avoid chained assignment, which may not modify df
df.loc[mask, 'price_change_per_day'] = np.nan
# Compute price change per 30 days.
mask30 = df['Series Name'] != df['Series Name'].shift(periods=30)
df['price_change_per_30_days'] = df['close'] - df['open'].shift(periods=30)
# Assign through df.loc to avoid chained assignment, which may not modify df
df.loc[mask30, 'price_change_per_30_days'] = np.nan

# Compute percentage change per 30 days
df['perc_price_change_per_30_days'] = (df['close'] - df['open'].shift(periods=30)) / df['avg_price'].shift(1)
df.loc[mask30, 'perc_price_change_per_30_days'] = np.nan

# Let's also look at a simple metric for the returns using the ratio of avg_price/initial price.
# This gives a better idea of the profitability of the stock.
# Divide each ticker's series by its first value (x / x would always give 1).
transf = lambda x: x / x.iloc[0]
df['returns'] = df.groupby('Series Name')['avg_price'].transform(transf)
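The boundary masks above guard against mixing rows from different tickers after a shift. An alternative that needs no mask is to shift within each group. The sketch below demonstrates the idea on a made-up frame with two tickers, using a 1-row shift instead of 30 to keep the data small. It illustrates the technique only and is not a replacement for the cells above; the column name `price_change_per_1_row` is hypothetical.

```python
import pandas as pd

# Made-up values for two made-up tickers
toy = pd.DataFrame(
    {
        "Series Name": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
        "open": [10.0, 11.0, 12.0, 50.0, 55.0, 60.0],
        "close": [11.0, 12.0, 15.0, 55.0, 60.0, 40.0],
        "avg_price": [10.0, 12.0, 15.0, 50.0, 60.0, 40.0],
    }
)

# Shift within each ticker: the first row of every group becomes NaN,
# so values never leak across the boundary between tickers
prev_open = toy.groupby("Series Name")["open"].shift(1)
toy["price_change_per_1_row"] = toy["close"] - prev_open

# Returns relative to each ticker's first average price
toy["returns"] = toy.groupby("Series Name")["avg_price"].transform(
    lambda x: x / x.iloc[0]
)

print(toy)
```

With a group-wise shift, the first rows of each ticker are `NaN` by construction, which is the same effect the explicit masks produce.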
Load the data into VIP
Let's first visualize the average stock prices for the various companies. Each color corresponds to a different company. Go into VIP to get more information on the plot.
vip.plot(plot_type='line', x='date', y='avg_price', color='Series Name', y_normalization='log10', x_scale=1.5)