Installation and Setup (Kubernetes)

Hyperscale Compliance is designed to run and is supported on any Certified Kubernetes platform (https://www.cncf.io/certification/software-conformance) that supports Helm (https://helm.sh/docs/topics/kubernetes_distros/). MicroK8s has been explicitly tested by Delphix and is recommended for use. The product is also OCI-compliant and may use any container runtime within a certified Kubernetes platform that implements the OCI Runtime Specification, including CRI-O, Docker, and Podman.
Delphix regularly tests against a range of popular Kubernetes platforms to cover a representative sample of implementations. The following Kubernetes platforms have been explicitly tested by Delphix and are recommended for use: MicroK8s, AWS EKS, Azure AKS, and OpenShift on VMware vSphere.

Installation Requirements

To deploy Hyperscale Compliance via Kubernetes, you need a running Kubernetes cluster, the kubectl command line tool to interact with the cluster, and HELM for deployment onto the cluster.

| Requirement | Recommended Version | Comments |
|---|---|---|
| Kubernetes Cluster | 1.25 or above | If you want to install MicroK8s, follow the steps mentioned in the MicroK8s on Linux (online mode). |
| HELM | 3.9.0 or above | HELM installation should support HELM v3. More information on HELM can be found at https://helm.sh/docs/. To install HELM, follow the installation instructions at https://helm.sh/docs/intro/install/. The installation also requires access to the HELM repository from where Hyperscale charts can be downloaded. The HELM repository URL is https://dlpx-helm-hyperscale.s3.amazonaws.com. |
| kubectl | 1.25.0 or above | |
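
The minimum versions listed above can be checked before installing. The following is a minimal sketch, not part of the product: the helper name version_ge is hypothetical, and it assumes a sort that supports -V (GNU coreutils). The HELM check only runs when helm is on the PATH.

```shell
# version_ge INSTALLED MINIMUM
# Succeeds when INSTALLED is greater than or equal to MINIMUM.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed HELM client against the recommended minimum.
if command -v helm >/dev/null 2>&1; then
  helm_version="$(helm version --template '{{.Version}}' | sed 's/^v//')"
  if version_ge "$helm_version" 3.9.0; then
    echo "HELM $helm_version meets the 3.9.0 minimum"
  else
    echo "HELM $helm_version is older than 3.9.0" >&2
  fi
fi
```

The same helper can be applied to the kubectl and cluster versions once their version strings have been extracted.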

If an intermediate HELM repository is to be used instead of the default Delphix HELM repository, then the repository URL, username, and password to access this repository need to be configured in the values.yaml file under the imageCredentials section.
HELM will internally refer to the kubeconfig file to connect to the Kubernetes cluster. The default kubeconfig file is present at location: ~/.kube/config.
If the kubeconfig file needs to be overridden while running HELM commands, set the KUBECONFIG environment variable to the location of the kubeconfig file.
Oracle Load doesn’t support Object Identifiers (OIDs).
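
As an illustration of the kubeconfig override described in the notes above, the sketch below shows which file applies in the current shell. The helper name effective_kubeconfig and the path hyperscale-cluster.config are hypothetical, and the helper treats KUBECONFIG as a single path for simplicity.

```shell
# Print the kubeconfig path that applies in the current shell:
# the KUBECONFIG environment variable if set, otherwise ~/.kube/config.
effective_kubeconfig() {
  printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

# Override the default for this shell session only (hypothetical path).
export KUBECONFIG="$HOME/.kube/hyperscale-cluster.config"
effective_kubeconfig
```

HELM and kubectl commands run later in the same shell session pick up the exported value.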

Installation

Download the HELM charts

The latest version of the chart can be pulled locally with the following command (where x.x.x should be changed to the version of Hyperscale being installed):

curl -XGET https://dlpx-helm-hyperscale.s3.amazonaws.com/hyperscale-helm-x.x.x.tgz -o hyperscale-helm-x.x.x.tgz

This command will download a file with the name hyperscale-helm-x.x.x.tgz in the current working directory. The downloaded file can be extracted using the following command (where x.x.x should be changed to the version of Hyperscale being installed):

tar -xvf hyperscale-helm-x.x.x.tgz

This will extract into the following directory structure:

hyperscale-helm
    ├── Chart.yaml
    ├── README.md
    ├── templates
    │   └── <all template files>
    ├── tools
    │   └── <all tool files>
    ├── values-file-connector.yaml
    ├── values-mongo.yaml
    ├── values-snowflake.yaml
    ├── values-mssql.yaml
    ├── values-oracle.yaml
    └── values.yaml

Verify the authenticity of the downloaded HELM charts

The SHA-256 hash sum of the downloaded helm chart tarball file can be verified as follows:

1. Execute the following command and note the digest value for version x.x.x (where x.x.x should be changed to the version of Hyperscale being installed):
   curl https://dlpx-helm-hyperscale.s3.amazonaws.com/index.yaml
2. Execute the sha256sum command (or equivalent) on the downloaded file hyperscale-helm-x.x.x.tgz (where x.x.x should be changed to the version of Hyperscale being installed):
   sha256sum hyperscale-helm-x.x.x.tgz

The value generated by the sha256sum utility in step 2 must match the digest value noted in step 1.
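
The comparison can also be scripted so that a mismatch is not missed by eye. This is a minimal sketch under stated assumptions: the helper name verify_chart is hypothetical, and the expected digest is still copied by hand from index.yaml.

```shell
# verify_chart FILE EXPECTED_DIGEST
# Computes the SHA-256 digest of FILE and compares it with EXPECTED_DIGEST.
# Returns 0 on a match and 1 on a mismatch.
verify_chart() {
  file="$1"
  expected="$2"
  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: digest matches for $file"
  else
    echo "MISMATCH for $file: expected $expected, got $actual" >&2
    return 1
  fi
}
```

For example, run verify_chart hyperscale-helm-x.x.x.tgz followed by the digest noted from index.yaml, and proceed with the installation only if it reports a match.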

Configure Registry Credentials for Docker Images

To pull the Docker images from the registry, permanent credentials associated with your Delphix account must be configured in the values.yaml file. To get these permanent credentials, visit the Hyperscale Compliance Download page and log in with your credentials. Once logged in, select the Hyperscale HELM Repository link and accept the Terms and Conditions. Once accepted, credentials for the Docker image registry will be presented. Note them down and edit the imageCredentials.username and imageCredentials.password properties in the values.yaml file as shown below:

# Credentials to fetch Docker images from Delphix internal repository
imageCredentials:
  # Username to log in to the Docker registry
  username: <username>
  # Password to log in to the Docker registry
  password: <password>

Delphix will delete unused credentials after 30 days and inactive (but previously used) credentials after 90 days.

Helm chart configuration files

hyperscale-helm is the name of the folder that was extracted in the previous step. In the above directory structure, there are essentially two files that come into play while attempting to install the helm chart:

A values.yaml configuration file that contains configurable properties, common to all the services, with their default values.
A values-[connector-type].yaml configuration file that contains configurable properties, applicable to the services of the specific connector, with their default values.

The following sections talk about some of the important properties that will need to be configured correctly for a successful deployment. A full list of the configurable properties can be found on the Configuration Settings page.

(Mandatory) Configure the staging area volume

A volume will need to be mounted, via persistent volume claims, inside the pods to provide access to the staging area for the Hyperscale Compliance services. By default, a persistent volume claim, using the default storage class, will be requested on the cluster. This can be configured, for some or all services, in one of the following ways, each of which involves setting/overriding properties in the values.yaml configuration file:

Set the value for stagingstorageType as MOUNT (for NFS Server), AWS_S3 (for S3 bucket) or AZURE_BLOB_STORAGE (for Azure Blob Containers).
If stagingstorageType is MOUNT, set values for the following properties if the cluster needs to mount an NFS shared path from an NFS server. For information about setting up and configuring an NFS server for the staging area volume, refer to NFS Server Installation.

nfsStorageHost
nfsStorageExportPath
nfsStorageMountOption
nfsStorageMountType

If stagingstorageType is AWS_S3, set values for the following properties as required if the cluster needs to bind the pods to a persistent volume using the S3 CSI driver. Refer to Configuring AWS S3 bucket as staging area for more details.

authMechanism
awsBucketName
awsBucketRegion
awsBucketPrefix
awsBucketDelimiter
awsAccessKey
awsSecretKey
stagingAwsS3SecretName

If stagingstorageType is AZURE_BLOB_STORAGE, set values for the following properties as required if the cluster needs to bind the pods to a persistent volume using the Azure Blob CSI driver. Refer to Configuring Azure Blob containers as staging area for more details.

blobContainerName
blobContainerPrefix
blobContainerDelimiter
blobSecretKey
stagingAzureBlobSecretName

Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster-admin or have the privileges to create persistent volumes on the cluster. Otherwise, one of the following sets of properties must also be set:

stagePvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either statically provisioned or dynamically provisioned based on the storage class associated with the PVC.
stagePvName and stageStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC managed by Helm will be created. Note: stageStorageClass is not required for AWS S3.

The following properties are supporting/optional properties that can be overridden along with the above properties:

nfsStorageMountOption: If nfsStorageHost and nfsStorageExportPath have been set, set the appropriate mount option if you would like the cluster to mount with an option other than the default option of nfsvers=4.2.
stageAccessMode and stageStorageSize: Persistent Volume claims can request specific storage capacity size and access modes.
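
As an illustration of the NFS option above, the properties can be collected in a separate override file instead of editing values.yaml directly. This is only a sketch: the file name staging-nfs-overrides.yaml, the host name, and the export path are hypothetical, and it assumes the chart reads these properties as top-level keys, so check values.yaml for the actual nesting before using it.

```shell
# Write a small override file for an NFS-backed staging area.
# Assumption: the keys below are top-level in the chart's values; verify
# the nesting against values.yaml. Host and export path are placeholders.
cat > staging-nfs-overrides.yaml <<'EOF'
stagingstorageType: MOUNT
nfsStorageHost: nfs.example.com
nfsStorageExportPath: /exports/hyperscale-staging
nfsStorageMountOption: nfsvers=4.2
EOF
```

HELM accepts multiple -f options and gives priority to the right-most file, so an override file like this can be passed after the connector-specific values file at install time.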

(Mandatory for Oracle) Configure the instantclient volume

A volume will need to be mounted, via persistent volume claims, inside the Oracle load service to provide access to Oracle’s instantclient binaries. This can be configured in one of the following ways, each of which involves setting/overriding some properties in the values-oracle.yaml configuration file:

nfsInstantClientHost and nfsInstantClientExportPath: Set values for these properties if the cluster needs to mount an NFS shared path from an NFS server.

Note: Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster-admin or should have the privileges to be able to create persistent volume on the cluster.

instantClientPvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with the PVC.
instantClientPvName and instantClientStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC managed by Helm will be created.

The following properties are supporting/optional properties that can be overridden along with the above properties:

instantClientMountOption: If nfsInstantClientHost and nfsInstantClientExportPath have been set, set the appropriate mount option if you would like the cluster to mount with an option other than the default option of nfsvers=4.2.
instantClientAccessMode and instantClientStorageSize: Persistent Volume claims can request specific storage capacity size and access modes.

(Mandatory for File Connector) Configure the source and target connector type and (optionally) the source and target volumes

unloadFSMount and loadFSMount (earlier unloadStorageType=FS and loadStorageType=FS):

To use the filesystem source and target connector types, you will need to configure persistent volumes using the nfsUnloadStorage options, then uncomment and set the values of unloadFSMount and loadFSMount to true. If these values are set to true, a volume will need to be mounted, via persistent volume claims, inside the file-connector unload service to provide access to the source file location and inside the load service to provide access to the target file location.

unloadHadoopMount and loadHadoopMount (earlier unloadStorageType=Hadoop and loadStorageType=Hadoop):

To use the Hadoop source and target connector types, you will need to configure persistent volumes using the nfsHadoopStorage options, then uncomment and set the values of unloadHadoopMount and loadHadoopMount to true. If these values are set to true, a volume will need to be mounted, via persistent volume claims, inside the file-connector unload and load services to provide access to the Hadoop configuration files.

These can be configured in one of the following ways, each of which involves setting/overriding some properties in the values-file-connector.yaml configuration file:

nfsUnloadStorageHost, nfsUnloadStorageExportPath, nfsLoadStorageHost, and nfsLoadStorageExportPath: Set values for these properties if the cluster needs to mount an NFS shared path from an NFS server.

Note: Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster admin or should have the privileges to be able to create persistent volume on the cluster.

unloadStoragePvcName and loadStoragePvcName: Set these properties if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with the PVC.
unloadStoragePvName, unloadStorageClass, loadStoragePvName, and loadStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC managed by Helm will be created.

The following properties are supporting/optional properties that can be overridden along with the above properties:

unloadStorageMountOption and loadStorageMountOption: If nfsUnloadStorageHost, nfsUnloadStorageExportPath, nfsLoadStorageHost, and nfsLoadStorageExportPath are configured, set the mount option that you would like the cluster to use when mounting the storage. To use nfsvers=4.2, uncomment the corresponding line.
unloadStorageSize, unloadStorageAccessMode, loadStorageSize, and loadStorageAccessMode: Persistent Volume claims can request specific storage capacity size and access modes.

Optionally, if you would like to use PySpark as the data writer type, you may configure it under the unload and load service property values by uncommenting the line and setting dataWriterType: pyspark.

To enable the staging push feature for unload and load services, set skipUnloadWriters and skipLoadWriters to true. Alternatively, you can provide the format file instead of using the staging push feature. To enable this option, set userProvidedFormatFile to true.

Note: Configurations such as dataWriterType, skipLoadWriters and userProvidedFormatFile can now be configured independently for each job using the source_configs and target_configs in job configuration.

(Optional) Configure the service database volumes

A volume will need to be mounted, via persistent volume claims, inside the pods that will provide the storage for the service databases for each hyperscale compliance service. By default, a persistent volume claim, using the default storage class, will be requested on the cluster. This can be configured, for some or all services, in one of the following ways that involves setting/overriding properties in the values.yaml configuration file:

[service-name].dbPvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with the PVC. The service database names default to controller-db, unload-db, masking-db, and load-db for the controller, unload, masking, and load services respectively.
[service-name].databaseStorageSize: Set this property if the cluster should request a PVC with a storage size other than the pre-configured size.
storageClassName: Set this property if the cluster should request a PVC using a specific storage class.

(Optional) Configure the cluster node for each service

By default, pods will be scheduled on the node(s) determined by the cluster. Set a node name under the [service-name].nodeName property for the service(s) if you would like to request the cluster to schedule pods on particular node(s).

(Optional) Set resource requests and limits

Some users may have default container settings as part of their Kubernetes or OpenShift infrastructure management. Sometimes, it is important to alter those settings for Hyperscale containers. You can configure resource requests and limits for each Hyperscale container like the following:

controller:
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "500m"

The above example is only for the controller service. You can configure properties for the other services (load, unload, and masking) in the same way. Note that the example above uses sample values, and you may need to contact your infrastructure team to determine the values appropriate for your environment.

Install the Helm Chart

Once the desired properties have been set/overridden, proceed to install the helm chart by running:

helm install hyperscale-helm <directory path of the extracted chart> -f values-[connector-type].yaml
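
The connector-specific file name in the command above can be derived from the connector type. The sketch below is illustrative only: the helper name values_file_for is hypothetical, and the accepted connector types are taken from the values files listed in the extracted chart directory.

```shell
# values_file_for CONNECTOR
# Prints the connector-specific values file name, or fails for a connector
# type that has no values file in the extracted chart directory.
values_file_for() {
  case "$1" in
    file-connector|mongo|snowflake|mssql|oracle)
      printf 'values-%s.yaml\n' "$1"
      ;;
    *)
      echo "unknown connector type: $1" >&2
      return 1
      ;;
  esac
}
```

For example, when run from inside the extracted hyperscale-helm directory, helm install hyperscale-helm . -f "$(values_file_for oracle)" would install the chart with the Oracle connector values.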

Check for the Successful Installation

After installing the helm chart, check the status of the helm chart and the pods using the following commands:

$ helm list
NAME             NAMESPACE  REVISION  UPDATED                                  STATUS    CHART                   APP VERSION
hyperscale-helm  default    1         2023-04-17 05:38:17.639357049 +0000 UTC  deployed  hyperscale-helm-18.0.0

$ kubectl get pods --namespace=hyperscale-services
NAME                                  READY   STATUS    RESTARTS   AGE
controller-service-65575b6458-2q9b4   1/1     Running   0          125m
load-service-5c644b9cc8-g9fs8         1/1     Running   0          125m
masking-service-7ddfd49c8f-5j2q5      1/1     Running   0          125m
proxy-5bd8d8f589-gkx8g                1/1     Running   0          125m
unload-service-55b5bd8cc8-7z95b       1/1     Running   0          125m
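
The pod listing above can also be checked by a script rather than by eye. The following is a minimal sketch, with a hypothetical helper name pods_not_running, that reads the default kubectl get pods output on standard input and assumes the column layout shown above (NAME, READY, STATUS).

```shell
# Read `kubectl get pods` output on stdin and print the name of every pod
# that is not in the Running state or whose READY count is incomplete.
# Prints nothing when all listed pods are running and ready.
pods_not_running() {
  awk 'NF >= 3 && $1 != "NAME" {
         split($2, ready, "/")
         if ($3 != "Running" || ready[1] != ready[2]) print $1
       }'
}
```

For example, kubectl get pods --namespace=hyperscale-services | pods_not_running prints nothing for the listing shown above, since every pod is 1/1 and Running.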

Creating ingress controller and ingress resource

After successfully deploying the Hyperscale Compliance services on the Kubernetes cluster, the final step involves creating an ingress route to manage external traffic to the services efficiently. For instructions, follow the steps as documented in the Ingress Setup.