For security and convenience, all connections used in Virtualitics Predict Flows should be stored in and retrieved from the connection store.
To create a connection, encrypt it, and store it in the connection store for use in a Flow, make a POST request to the controller/connection/create/ endpoint.
Required Fields
All connections require a connection_name and a connection_type. A connection name may be any UTF-8 string. The currently valid connection_type values are:
- 'postgresql'
- 'mysql'
- 'mssql+pyodbc'
- 'mssql+pymysql'
- 'oracle'
- 'sqlalchemy-redshift'
- 'databricks'
RDBMS-Like Connections
- connection_type - 'postgresql', 'mysql', 'mssql+pyodbc', 'mssql+pymysql', 'oracle', 'sqlalchemy-redshift'
- user_name - the database user name
- password - the corresponding password; must be passed as {"value": "mypassword", "encrypt": True}
- host - the database host name
- port - the port the database receives connections on
- dbname - the name of the schema/db to create the connection in
Databricks
- connection_type - "databricks"
- server_hostname - the Databricks cloud host name, e.g. "123456-789.cloud.databricks.com"
- access_token - a valid access token to the Databricks cluster; must be passed as {"value": "myaccesstoken", "encrypt": True}
- http_path - the http_path value, which can be found in the cluster's advanced details under the JDBC section (e.g. /sql/protocolv1/0/…)
Create the Payload
Databricks
connection_name = "my-databricks-connection"
connection_data = {
"connection_type": "databricks",
"server_hostname": "dbc-6b897c4b-c971.cloud.databricks.com",
"access_token": {
"value": "asdfasdfg802yur98ehbgvasdfg",
"encrypt": True
},
"http_path": "ql/protocolv1/o/2198091692694528/0517-12345-asdasd"
}
RDBMS-Like (Postgres/MySQL, MSSQL, etc.)
connection_name = "my-postgres-connection"
connection_data = {
"connection_type": "postgresql",
"user_name": "ben",
"password": {
"value": "secret-password",
"encrypt": True
},
"host": "localhost",
"port": 5432
}
From a Python Shell
This method is more convenient for local environments.
from predict_backend.store.connection_store import connection_store
from copy import deepcopy

# Store the connection, then read it back to verify it was created
connection_store.create_connection(name=connection_name,
                                   item=deepcopy(connection_data))
print(connection_store.get_connection(connection_name))
Using the REST API
Open a console tab in your browser for the deployment to which you are adding a connection. With the network tab open in the console, load the Flows page from the left side menu.
You will see a number of network requests being made. Find one of these requests, right-click on it, and select Copy → Copy as cURL.
You'll have something like this:
curl 'http://localhost:5000/state/flows/' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Firefox/102.0' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Accept-Language: en-US,en;q=0.5' \
  -H 'Accept-Encoding: gzip, deflate, br' \
  -H 'Origin: http://localhost:3000' \
  -H 'DNT: 1' \
  -H 'Connection: keep-alive' \
  -H 'Referer: http://localhost:3000/' \
  -H 'Cookie: session=eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|3a9bd078|b0bb826d021f3777166d602fef83298a|1658851384' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-site'
The only relevant values are the cookie values (Firefox provides a single compound session cookie, while Chrome usually provides several cookies).
You'll need to convert these cookies into a Python dictionary to pass as cookies in the request. The example below has only one cookie, but yours may have two or more with different names.
cookies = {
    "session": "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|asdf312|123123123|1658851384"
}
Chrome users may see something like this:
cookies = {
    'route': "2|asdf312|123123123|1658851384",
    '__Host-session': "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4",
}
You will also need the API endpoint (http://localhost:5000 in this example). Now you have all of the data you need to make a request to the authenticated endpoint.
Combine everything and make a POST request to the controller/connection/create/ endpoint.
import json
import requests

url = 'http://localhost:5000/controller/connection/create/'  # <-- base URL copied from the browser console, plus the create endpoint path
headers = {
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
    'Content-Type': 'application/json'
}
cookies = {"session": "eyJ1c2VyIjoiYmVuLmVAdmlydHVhbGl0aWNzLmNvbSIsIl9asdfJtYW5lbnQiOnRydWUasdfmxvZ2dlZF9pbiI6dHJ1ZX0.YuCCFw.S-tgiqJQYxYdDOx5asdfQZ0Ffod4; _xsrf=2|asdf312|123123123|1658851384"}

connection_name = "my-postgres-connection"
connection_data = {
    "connection_type": "postgresql",
    "user_name": "ben",
    "password": {"value": "secret-password", "encrypt": True},
    "host": "localhost",
    "port": 5432
}

payload = json.dumps({"name": connection_name, "connection_data": connection_data})
response = requests.request("POST", url, cookies=cookies, headers=headers, data=payload)
print(response)
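To confirm the request succeeded, you can inspect the response status and body. The exact response payload depends on your deployment, so treat the check below as a sketch:

# Sketch: verify the connection was created (exact response body depends on your deployment)
if response.ok:
    print("Connection created:", response.text)
else:
    print("Request failed with status", response.status_code, response.text)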
What to Expect
Once the connection has been created, you can now refer to it within a Flow.
This connection creation step only needs to run the first time a cluster is deployed. If you are running locally, you may need to recreate the connection after bringing your cluster down and back up (this is particularly applicable to the postgres / store container).
# Refer to the stored connection by name from within a Flow
DATABRICKS_CONN_ID = 'databricks-default'
device_data = odbc_to_pandas("select * from datasets.gearbox_device_data",
                             DATABRICKS_CONN_ID)
Databricks Error Resolution
If you receive this error:
No connection found with name: databricks-default. Please check your connection name and verify that a connection with that name exists in the connection store.
or something similar when attempting to run a Flow that uses a Databricks connection, you will need to create a connection to the Databricks cluster (this is a one-time, per-deployment operation).
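To check whether the connection already exists before running the Flow again, you can query the connection store from a Python shell. This is a minimal sketch reusing the connection_store API shown above; the behavior when a connection is missing (returning None versus raising an exception) may differ in your version, so the broad exception handling here is an assumption.

from predict_backend.store.connection_store import connection_store

# Sketch: check whether the Databricks connection already exists in the store.
# The missing-connection behavior (None vs. an exception) is an assumption here.
try:
    existing = connection_store.get_connection("databricks-default")
except Exception:
    existing = None

if existing is None:
    print("No 'databricks-default' connection found; create it using the steps above.")
else:
    print("Found connection:", existing)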