Configure a Connection


When is this applicable?

When you would like to make a live connection between a database and your Virtualitics AI Platform Flow.

 

How-To

For security and convenience, all connections used in Virtualitics AI Platform Flows should be stored in and retrieved from the connection store. To create a connection, encrypt it, and store it in the connection store for use in a Flow, make a POST request to the controller/connection/create/ endpoint.

 

Required Fields

All connections require a connection_name and a connection_type.

A connection name may be any UTF-8 string. The currently valid connection_type values are:

  • 'postgresql'
  • 'mysql'
  • 'mssql+pyodbc'
  • 'mssql+pymysql'
  • 'oracle'
  • 'sqlalchemy-redshift'
  • 'databricks'

 

RDBMS-Like Connections

  • connection_type - 'postgresql', 'mysql', 'mssql+pyodbc', 'mssql+pymysql', 'oracle', 'sqlalchemy-redshift'
  • user_name - the database user name
  • password - the corresponding password; must be passed as {"value": "mypassword", "encrypt": True}
  • host - the database host name
  • port - the port the database accepts connections on
  • dbname - the name of the schema/database to create the connection in
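
Most of these connection_type values follow SQLAlchemy's dialect/driver naming, so the fields above correspond to the familiar pieces of a SQLAlchemy-style connection URL. The sketch below is only an illustration of how the fields relate; the helper name and URL layout are assumptions, and the platform assembles the actual connection for you from the payload.

def to_connection_url(connection_data, plain_password):
    # Illustration only: dialect://user:password@host:port/dbname
    return (
        f"{connection_data['connection_type']}://"
        f"{connection_data['user_name']}:{plain_password}@"
        f"{connection_data['host']}:{connection_data['port']}/"
        f"{connection_data['dbname']}"
    )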

 

Databricks

  • connection_type - "databricks"
  • server_hostname - the Databricks cloud host name, e.g. "123456-789.cloud.databricks.com"
  • access_token - a valid access token for the Databricks cluster; must be passed as {"value": "myaccesstoken", "encrypt": True}
  • http_path - the HTTP path value; you can find this in the cluster's advanced details under the JDBC section (e.g. /sql/protocolv1/o/…)

 

Create the Payload

Examples are shown in Python, but they can be translated to any language or tool (curl, Postman, JavaScript, etc.).

Databricks

connection_name = "my-databricks-connection"
connection_data = {
    "connection_type": "databricks",
    "server_hostname": "dbc-6b897c4b-c971.cloud.databricks.com",
    "access_token": {
        "value": "asdfasdfg802yur98ehbgvasdfg",
        "encrypt": True
    },
    "http_path": "ql/protocolv1/o/2198091692694528/0517-12345-asdasd"
}

RDBMS Like (postgres, mysql, mssql, etc.)

connection_name = "my-postgres-connection"
connection_data = {
   "connection_type": "postgresql",
   "user_name": "ben",
   "password": {
       "value": "secret-password",
       "encrypt": True
   },
   "host": "localhost",
   "port": 5432
}

From a Python Shell

This method is more convenient for local environments.

from predict_backend.store.connection_store import connection_store
from copy import deepcopy

connection_store.create_connection(name=connection_name,
                                   item=deepcopy(connection_data))
print(connection_store.get_connection(connection_name))

Using the REST API

Open a console tab in your browser for the deployment to which you are adding a connection. With the Network tab open in the console, load the Flows page from the left side menu. You will see a number of network requests being made. Find one of these requests, right click on it, and select Copy as cURL.

You’ll have something like this:

curl 'http://localhost:5000/state/flows/' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0' \
-H 'Accept: application/json, text/plain, */*' -H 'Accept-Language: en-US,en;q=0.5' \
-H 'Accept-Encoding: gzip, deflate, br' \
-H 'Origin: http://localhost:3000' \
-H 'DNT: 1' \
-H 'Connection: keep-alive' \
-H 'Referer: http://localhost:3000/' \
-H 'Cookie: session=eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|3a9bd078|b0bb826d021f3777166d602fef83298a|1658851384' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: same-site'

The only relevant values are the cookies. Firefox provides a single compound session cookie, while Chrome usually provides several cookies. You'll need to convert these cookies to a Python dictionary (to pass as cookies in the request). The example below has only one cookie, but yours may have two or more with different names.

cookies = {
   "session": "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|asdf312|123123123|1658851384"
}

Chrome users may have something like this instead:

cookies = {
   'route': "2|asdf312|123123123|1658851384",
   '__Host-session': "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4",
}
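
If you would rather not build the dictionary by hand, a small helper like the one below can split the raw Cookie header string into one entry per cookie. This is a convenience sketch only, not part of the platform; the function name and the truncated header value are illustrative.

def cookies_from_header(raw_cookie_header):
    # Split "name1=value1; name2=value2" into {"name1": "value1", "name2": "value2"}
    cookies = {}
    for pair in raw_cookie_header.split("; "):
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies

cookies = cookies_from_header("session=eyJ1c2VyIjoi...; _xsrf=2|asdf312|123123123|1658851384")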

You will also need the API endpoint (http://localhost:5000 in this example). Now you have all of the data you need to make a request through the authenticated endpoint.

Combine everything and make a POST request to the controller/connection/create/ endpoint.

import json
import requests

url = 'http://localhost:5000/controller/connection/create/'  # <-- host copied from the browser console, plus the connection create path
headers = {
   'Accept': 'application/json, text/plain, */*',
   'Accept-Language': 'en-US,en;q=0.9',
   'Connection': 'keep-alive',
   'Sec-Fetch-Dest': 'empty',
   'Sec-Fetch-Mode': 'cors',
   'Sec-Fetch-Site': 'same-site',
   'Content-Type': 'application/json'
}
cookies = {"session": "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|asdf312|123123123|1658851384"}

connection_name = "my-postgres-connection"
connection_data = {
   "connection_type": "postgresql",
   "user_name": "ben",
   "password": {"value": "secret-password", "encrypt": True},
   "host": "localhost",
   "port": 5432
}
payload = json.dumps({"name": connection_name, "connection_data": connection_data})
response = requests.request("POST", url, cookies=cookies, headers=headers, data=payload)

print(response)
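
A 2xx status code indicates the connection was stored. The exact response body is not documented here, so the check below only inspects the status; treat anything beyond that as an assumption.

if response.ok:
    print("Connection created:", response.status_code)
else:
    print("Connection creation failed:", response.status_code, response.text)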

 

What to Expect (Validation)

Example usage

Once the connection has been created, you can refer to it within a Flow. This connection creation step only needs to run the first time a cluster is deployed. If you are running locally, you may need to recreate the connection after bringing your cluster down and back up (this applies in particular to the postgres / store container).

DATABRICKS_CONN_ID = 'databricks-default'
device_data = odbc_to_pandas("select * from datasets.gearbox_device_data",
                             DATABRICKS_CONN_ID)

If you receive an error such as "No connection found with name: databricks-default. Please check your connection name and verify that a connection with that name exists in the connection store." when attempting to run a Flow that uses a Databricks connection, you will need to create a connection to the Databricks cluster (this is a one-time operation per deployment). A quick way to confirm whether the connection already exists is shown below.
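
If you have access to a Python shell on the deployment, you can check for the connection with the same connection store interface used earlier. The name "databricks-default" matches the example above; the exact behavior when the name is missing (returning None versus raising an error) is an assumption, so treat this as a quick sanity check.

from predict_backend.store.connection_store import connection_store

# Look up the connection by name; adjust the name to match your own connection.
print(connection_store.get_connection("databricks-default"))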
