Creating a Trino cluster

Define an insecure cluster (testing)

Create an insecure single-node Trino cluster for testing. It can be accessed with the UI or the CLI via HTTP, without user/password credentials or authorization.

For testing purposes we use the Trino CLI.

First, ensure all necessary operators have been deployed:

stackablectl operator install \
  secret commons hive trino
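One way to confirm that the operators are in place is to check that their custom resource definitions are registered. The sketch below assumes that kubectl is configured for the target cluster and that the CRD names follow the usual plural-name.group pattern for the resource kinds used on this page; the helper name missing_crds is hypothetical.

```shell
# Hypothetical helper: prints every expected CRD that is not yet registered.
# Assumes kubectl is configured for the target cluster.
missing_crds() {
  for crd in \
    secretclasses.secrets.stackable.tech \
    hiveclusters.hive.stackable.tech \
    trinoclusters.trino.stackable.tech \
    trinocatalogs.trino.stackable.tech
  do
    # `kubectl get crd NAME` exits non-zero when the CRD does not exist.
    kubectl get crd "$crd" >/dev/null 2>&1 || echo "$crd"
  done
}
```

If calling missing_crds prints nothing, all four definitions were found.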

The Trino cluster can now be deployed:

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1
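The manifests above could be applied with kubectl, for example. This is a minimal sketch that assumes they were saved to a file (trino-insecure.yaml is a made-up name) and that kubectl targets the intended cluster and namespace; the helper name deploy_manifest is hypothetical.

```shell
# Hypothetical helper: apply a manifest file, then list the pods so that
# startup progress can be observed.
deploy_manifest() {
  manifest="$1"
  kubectl apply -f "$manifest" && kubectl get pods
}

# Example call (the file name is an assumption):
#   deploy_manifest trino-insecure.yaml
```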

This defines a single catalog, hive, backed by a Hive metastore that uses an embedded database (Derby).

To interact with Trino, first obtain the host and port for the Trino coordinator service (in this and following examples, https://172.18.0.3:31748):

stackablectl services list

PRODUCT  NAME               NAMESPACE  ENDPOINTS                                     EXTRA INFOS

hive     simple-hive-derby  default    hive                172.18.0.4:32186
                                       metrics             172.18.0.4:30109

trino    simple-trino       default    coordinator-metrics 172.18.0.3:32123
                                       coordinator-https   https://172.18.0.3:31748
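The endpoint can also be picked out of that output in a script. The sketch below works on a sample copied from the listing above (the exact column spacing is an assumption); the helper name coordinator_endpoint and the variable TRINO_SERVER are hypothetical.

```shell
# Sample output in the shape shown above; in practice, pipe the real
# `stackablectl services list` output into the helper instead.
services_output='PRODUCT  NAME               NAMESPACE  ENDPOINTS                                     EXTRA INFOS
hive     simple-hive-derby  default    hive                172.18.0.4:32186
                                       metrics             172.18.0.4:30109
trino    simple-trino       default    coordinator-metrics 172.18.0.3:32123
                                       coordinator-https   https://172.18.0.3:31748'

# Hypothetical helper: print the value that follows the coordinator-https label.
coordinator_endpoint() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "coordinator-https") print $(i + 1) }'
}

TRINO_SERVER=$(printf '%s\n' "$services_output" | coordinator_endpoint)
echo "$TRINO_SERVER"
```

Matching on the endpoint label rather than on a column position keeps the helper independent of how the table is padded.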

Next, download the Trino CLI tool (this can be obtained from the Stackable repository, as shown below):

curl --output trino.jar https://repo.stackable.tech/repository/packages/trino-cli/trino-cli-396-executable.jar

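The downloaded file may first need to be made executable (for example with chmod +x trino.jar). Then run some CLI commands to verify operation, such as listing the names of all catalogs. Note the --insecure flag, which skips validation of the server certificate: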

./trino.jar --insecure --debug --server https://172.18.0.3:31748 --user=admin --execute "SHOW CATALOGS" --output-format=CSV_UNQUOTED
hive
system

Define a secure cluster (production)

For secure connections the following steps must be taken:

  1. Enable authentication

  2. Enable TLS between the clients and coordinator

  3. Enable internal TLS for communication between coordinators and workers

Via authentication

If authentication is enabled, TLS for the coordinator must be configured, along with a shared secret for internal communications (the secret is base64-encoded, not encrypted).

Securing the Trino cluster disables all HTTP ports, including the web interface on the HTTP port. In the definition below, authentication uses the trino-users Secret, and TLS communication uses a certificate signed by the Secret Operator (indicated by autoTls).

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    tls:
      secretClass: trino-tls # (1)
    authentication:
      method:
        multiUser:
          userCredentialsSecret:
            name: trino-users # (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino # (3)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls # (1)
spec:
  backend:
    autoTls: # (4)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users # (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1

(1) The name of (and reference to) the SecretClass
(2) The name of (and reference to) the Secret
(3) TrinoCatalog reference
(4) TLS mechanism
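The value stored for admin in the trino-users Secret is a bcrypt hash of the password. One way to produce such a hash is the htpasswd utility from the Apache HTTP Server tools, as sketched below. This assumes htpasswd is installed; the helper name bcrypt_hash is hypothetical, and the cost of 10 simply mirrors the $2y$10$ prefix of the hash shown above.

```shell
# Hypothetical helper: print a bcrypt hash (cost 10) for a user/password pair.
# Assumes htpasswd (Apache HTTP Server utilities) is installed.
bcrypt_hash() {
  user="$1"
  password="$2"
  # -n prints to stdout, -b takes the password from the command line,
  # -B selects bcrypt, -C sets the bcrypt cost.
  # htpasswd prints "user:hash"; sed keeps only the hash and drops blank lines.
  htpasswd -nbBC 10 "$user" "$password" | sed -n 's/^[^:]*://p'
}

# Example call:
#   bcrypt_hash admin admin
```

Passing the password as an argument makes it visible in the process list and shell history, so this form is best kept to test credentials.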

The CLI now requires that a path to the keystore and a password be provided:

./trino.jar --debug --server https://172.18.0.3:31748 --user=admin --keystore-path=<path-to-keystore.p12> --keystore-password=<password>

Via TLS only

This will disable the HTTP port and UI access and encrypt client-server communications.

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    tls:
      secretClass: trino-tls # (1)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino # (2)
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-tls # (1)
spec:
  backend:
    autoTls: # (3)
      ca:
        secret:
          name: secret-provisioner-trino-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1

(1) The name of (and reference to) the SecretClass
(2) TrinoCatalog reference
(3) TLS mechanism

The CLI is then called as follows:

./trino.jar --debug --server https://172.18.0.3:31748 --keystore-path=<path-to-keystore.p12> --keystore-password=<password>

Via internal TLS

Internal TLS provides encrypted and authenticated communication between coordinators and workers. Since this applies to all data sent and processed between these processes, it may reduce performance significantly.

---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive
  labels:
    trino: simple-trino
spec:
  connector:
    hive:
      metastore:
        configMap: simple-hive-derby
---
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  image:
    productVersion: 396
    stackableVersion: 0.3.0
  config:
    internalTls:
      secretClass: trino-internal-tls # (1)
    authentication:
      method:
        multiUser:
          userCredentialsSecret:
            name: trino-users # (2)
  catalogLabelSelector:
    matchLabels:
      trino: simple-trino
  coordinators:
    roleGroups:
      default:
        replicas: 1
  workers:
    roleGroups:
      default:
        replicas: 1
---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: trino-internal-tls # (1)
spec:
  backend:
    autoTls: # (3)
      ca:
        secret:
          name: secret-provisioner-trino-internal-tls-ca
          namespace: default
        autoGenerate: true
---
apiVersion: v1
kind: Secret
metadata:
  name: trino-users # (2)
type: kubernetes.io/opaque
stringData:
  # admin:admin
  admin: $2y$10$89xReovvDLacVzRGpjOyAOONnayOgDAyIS2nW9bs5DJT98q17Dy5i
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: simple-hive-derby
spec:
  image:
    productVersion: 3.1.3
    stackableVersion: 0.2.0
  clusterConfig:
    database:
      connString: jdbc:derby:;databaseName=/tmp/metastore_db;create=true
      user: APP
      password: mine
      dbType: derby
  metastore:
    roleGroups:
      default:
        replicas: 1

(1) The name of (and reference to) the SecretClass
(2) The name of (and reference to) the Secret
(3) TLS mechanism

Since Trino has internal and external communications running over a single port, this will enable the HTTPS port but not expose it. Cluster access is only possible via HTTP.

./trino.jar --debug --server http://172.18.0.3:31748 --user=admin

S3 connection specification

You can specify S3 connection details directly inside the TrinoCatalog specification or by referring to an external S3Connection custom resource.

To specify S3 connection details directly as part of the TrinoCatalog resource, you add an inline connection configuration as shown below:

s3: # (1)
  inline:
    host: test-minio # (2)
    port: 9000 # (3)
    pathStyleAccess: true # (4)
    secretClass: minio-credentials # (5)
    tls:
      verification:
        server:
          caCert:
            secretClass: minio-tls-certificates # (6)

(1) Entry point for the connection configuration
(2) Connection host
(3) Optional connection port
(4) Optional flag indicating whether path-style URLs should be used; this defaults to false, which means virtual hosted-style URLs are used.
(5) Name of the Secret object expected to contain the following keys: accessKey and secretKey
(6) Optional TLS settings for encrypted traffic. The secretClass can be provided by the Secret Operator or by yourself.

A self-provided S3 TLS secret can be specified like this:

---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: minio-tls-certificates
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}
---
apiVersion: v1
kind: Secret
metadata:
  name: minio-tls-certificates
  labels:
    secrets.stackable.tech/class: minio-tls-certificates
data:
  ca.crt: <your-base64-encoded-ca>
  tls.crt: <your-base64-encoded-public-key>
  tls.key: <your-base64-encoded-private-key>
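Values under data in a Kubernetes Secret must be base64-encoded. As a minimal sketch, assuming the certificate and key are available as local files (the file names in the comments are made up), a small helper can produce the single-line encoding; the helper name b64_file is hypothetical.

```shell
# Hypothetical helper: base64-encode a file onto a single line, as expected
# for values under `data:` in a Kubernetes Secret.
b64_file() {
  # Removing newlines keeps the output on one line regardless of how the
  # local base64 implementation wraps its output.
  base64 < "$1" | tr -d '\n'
}

# Example calls (file names are assumptions):
#   b64_file ca.crt
#   b64_file tls.crt
#   b64_file tls.key
```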

It is also possible to configure the bucket connection details as a separate Kubernetes resource and only refer to that object from the TrinoCatalog specification like this:

s3:
  reference: my-connection-resource # (1)

(1) Name of the connection resource with connection details

The resource named my-connection-resource is then defined as shown below:

---
apiVersion: s3.stackable.tech/v1alpha1
kind: S3Connection
metadata:
  name: my-connection-resource
spec:
  host: test-minio
  port: 9000
  accessStyle: Path
  credentials:
    secretClass: minio-credentials

This has the advantage that the connection configuration can be shared across applications, so the details only need to be updated in one place.