Installation and Setup (Kubernetes)
Hyperscale Compliance is designed to run and is supported on any Certified Kubernetes platform (https://www.cncf.io/certification/software-conformance) that supports Helm (https://helm.sh/docs/topics/kubernetes_distros/). The product is also OCI-compliant and may use any container runtime within a certified Kubernetes platform that implements the OCI Runtime Specification, including CRI-O, Docker, and Podman.
Delphix regularly tests against a range of popular Kubernetes platforms to cover a representative sample of implementations. The following Kubernetes platforms have been explicitly tested by Delphix and are recommended for use: MicroK8s, AWS EKS, and OpenShift on VMware vSphere.
Installation Requirements
To deploy Hyperscale Compliance via Kubernetes, you need a running Kubernetes cluster, the kubectl command line tool to interact with the cluster, and HELM to deploy onto the cluster.
Requirement | Recommended Version | Comments |
---|---|---|
Kubernetes Cluster | 1.25 or above | If you want to install MicroK8s, follow the steps mentioned in the MicroK8s on Linux (online mode). |
HELM | 3.9.0 or above | HELM installation should support HELM v3. More information on HELM can be found at The installation also requires access to the HELM repository from where Hyperscale charts can be downloaded. The HELM repository URL is |
kubectl | 1.25.0 or above | |
If an intermediate HELM repository is to be used instead of the default Delphix HELM repository, then the repository URL, username, and password to access this repository need to be configured in the values.yaml file under the imageCredentials section.
HELM will internally refer to the kubeconfig file to connect to the Kubernetes cluster. The default kubeconfig file is present at location: ~/.kube/config.
If the kubeconfig file needs to be overridden while running HELM commands, set the KUBECONFIG environment variable to the location of the kubeconfig file.
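As a minimal sketch of the kubeconfig override described above, the helper below runs a single command with KUBECONFIG set only for that command. The function name with_kubeconfig and the example path are hypothetical and are not part of the product.

```shell
# Hypothetical helper: run one command against an alternate kubeconfig
# without changing KUBECONFIG for the rest of the shell session.
with_kubeconfig() {
  kubeconfig_path="$1"
  shift
  # The assignment prefix applies only to the command that follows it.
  KUBECONFIG="$kubeconfig_path" "$@"
}

# Example (the path below is an assumption; substitute your own):
# with_kubeconfig "$HOME/.kube/hyperscale-config" helm list
```

Exporting KUBECONFIG in the shell achieves the same result for every subsequent command; the per-command form is shown only because it avoids affecting other clusters you may be working with.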
Oracle Load doesn’t support Object Identifiers (OIDs).
Installation
Download the HELM charts
The latest version of the chart can be pulled locally with the following command (where x.x.x should be changed to the version of Hyperscale being installed):
curl -XGET https://dlpx-helm-hyperscale.s3.amazonaws.com/hyperscale-helm-x.x.x.tgz -o hyperscale-helm-x.x.x.tgz
This command will download a file with the name hyperscale-helm-x.x.x.tgz in the current working directory. The downloaded file can be extracted using the following command (where x.x.x should be changed to the version of Hyperscale being installed):
tar -xvf hyperscale-helm-x.x.x.tgz
This will extract into the following directory structure:
hyperscale-helm
├── Chart.yaml
├── README.md
├── templates
│-<all templates files>
├── tools
│-<all tool files>
├── values-file-connector.yaml
├── values-mongo.yaml
├── values-mssql.yaml
├── values-oracle.yaml
└── values.yaml
Verify the authenticity of the downloaded HELM charts
The SHA-256 hash sum of the downloaded helm chart tarball file can be verified as follows:
1. Execute the below command and note the digest value for version x.x.x (where x.x.x should be changed to the version of Hyperscale being installed):
curl https://dlpx-helm-hyperscale.s3.amazonaws.com/index.yaml
2. Execute the sha256sum command (or equivalent) on the downloaded file hyperscale-helm-x.x.x.tgz (where x.x.x should be changed to the version of Hyperscale being installed):
sha256sum hyperscale-helm-x.x.x.tgz
3. The value generated by the sha256sum utility in step 2 must match the digest value noted in step 1.
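The comparison in the final step can be scripted rather than checked by eye. The following is a sketch only: verify_chart_digest is a hypothetical helper name, and the expected digest must still be copied by hand from the index.yaml entry for your version.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_chart_digest <tarball> <digest noted from index.yaml>
verify_chart_digest() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{ print $1 }')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: digest matches for $file"
  else
    echo "MISMATCH: expected $expected but got $actual" >&2
    return 1
  fi
}

# Example (replace x.x.x and the digest placeholder with real values):
# verify_chart_digest hyperscale-helm-x.x.x.tgz "<digest from index.yaml>"
```

A non-zero return status signals a mismatch, so the function can gate later steps in an install script.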
Configure Registry Credentials for Docker Images
For pulling the Docker images from the registry, permanent credentials associated with your Delphix account need to be configured in the values.yaml file. To get these permanent credentials, visit the Hyperscale Compliance Download page and log in with your credentials. Once logged in, select the Hyperscale HELM Repository link and accept the Terms and Conditions. Once accepted, credentials for the docker image registry will be presented. Note them down and edit the imageCredentials.username and imageCredentials.password properties in the values.yaml file as shown below:
# Credentials to fetch Docker images from Delphix internal repository
imageCredentials:
  # Username to login to docker registry
  username: <username>
  # Password to login to docker registry
  password: <password>
Delphix will delete unused credentials after 30 days and inactive (but previously used) credentials after 90 days.
Helm chart configuration files
hyperscale-helm is the name of the folder that was extracted in the previous step. In the above directory structure, there are essentially two files that come into play while attempting to install the helm chart:
A values.yaml configuration file that contains configurable properties, common to all the services, with their default values.
A values-[connector-type].yaml configuration file that contains configurable properties, applicable to the services of the specific connector, with their default values.
The following sections talk about some of the important properties that will need to be configured correctly for a successful deployment. A full list of the configurable properties can be found on the Configuration Settings page.
(Mandatory) Configure the staging area volume
A volume will need to be mounted, via persistent volume claims, inside the pods to provide access to the staging area for the Hyperscale Compliance services. In a Kubernetes deployment of Hyperscale, either an NFS server or an AWS S3 bucket can be used for the staging area volume. This is configured by setting/overriding the following properties in the values.yaml configuration file:
Set the value for stagingstorageType as MOUNT (for NFS Server) or AWS_S3 (for S3 bucket).
If stagingstorageType is MOUNT, set values for the following properties if the cluster needs to mount an NFS shared path from an NFS server. For information about setting up and configuring an NFS server for the staging area volume, refer to NFS Server Installation.
nfsStorageHost
nfsStorageExportPath
nfsStorageMountOption
nfsStorageMountType
If stagingstorageType is AWS_S3, set values for the following properties as required if the cluster needs to bind the pods to a persistent volume using the S3 CSI driver. Refer to Configuring AWS S3 bucket as staging area for more details.
authMechanism
awsBucketName
awsBucketRegion
awsBucketPrefix
awsBucketDelimiter
awsAccessKey
awsSecretKey
stagingAwsS3SecretName
Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster-admin or should have the privileges to create a persistent volume on the cluster. Otherwise, one of the following sets of properties must also be set:
stagePvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either statically provisioned or dynamically provisioned based on the storage class associated with the PVC.
stagePvName and stageStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC will be created that is managed by HELM. Note: stageStorageClass is not required for AWS S3.
The following properties are supporting/optional properties that can be overridden along with the above properties:
nfsStorageMountOption: If nfsStorageHost and nfsStorageExportPath have been set, set the appropriate mount option if you would like the cluster to mount with an option other than the default option of nfsvers=4.2.
stageAccessMode and stageStorageSize: Persistent Volume claims can request specific storage capacity size and access modes.
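To illustrate the NFS case, the sketch below writes a small override file instead of editing values.yaml in place. It rests on assumptions that should be checked against the values.yaml shipped with your chart: that these properties are top-level keys, and that the host and export path shown (both placeholders) are replaced with real values. The file name staging-overrides.yaml is also hypothetical.

```shell
# Assumption: the staging properties are top-level keys in values.yaml.
# All values below are placeholders, not defaults from the product.
overrides_file="staging-overrides.yaml"

cat > "$overrides_file" <<'EOF'
stagingstorageType: MOUNT
nfsStorageHost: nfs.example.com
nfsStorageExportPath: /exports/hyperscale-staging
nfsStorageMountOption: nfsvers=4.2
EOF

# The file could then be passed after the connector values file, for example:
# helm install hyperscale-helm ./hyperscale-helm \
#   -f ./hyperscale-helm/values-oracle.yaml -f "$overrides_file"
```

When several -f files are given, Helm applies them in order and the last one takes precedence, which keeps local overrides separate from the files delivered in the chart.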
(Mandatory for Oracle) Configure the instantclient volume
A volume will need to be mounted, via persistent volume claims, inside the Oracle load service to provide access to Oracle’s instantclient binaries. This can be configured in one of the following ways, each of which involves setting/overriding some properties in the values-oracle.yaml configuration file:
nfsInstantClientHost and nfsInstantClientExportPath: Set values for these properties if the cluster needs to mount an NFS shared path from an NFS server.
Note: Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster-admin or should have the privileges to be able to create persistent volume on the cluster.
instantClientPvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that until this PVC is bound to a backing PV, the pods will not start getting created and as such, the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with PVC.
instantClientPvName and instantClientStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC will be created that is managed by HELM.
The following properties are supporting/optional properties that can be overridden along with the above properties:
instantClientMountOption: If nfsInstantClientHost and nfsInstantClientExportPath have been set, set the appropriate mount option if you would like the cluster to mount with an option other than the default option of nfsvers=4.2.
instantClientAccessMode and instantClientStorageSize: Persistent Volume claims can request specific storage capacity size and access modes.
(Mandatory for File Connector) Configure the source and target connector type and (optionally) the source and target volumes
unloadFSMount and loadFSMount (earlier unloadStorageType = FS and loadStorageType = FS):
To use the filesystem source and target connector types, you will need to configure persistent volumes using the nfsUnloadStorage options, then uncomment and set the values of unloadFSMount and loadFSMount to true. If these values are set to true, a volume will need to be mounted, via persistent volume claims, inside the file-connector unload service that will provide access to the source file location and inside the load service that will provide access to the target file location.
unloadHadoopMount and loadHadoopMount (earlier unloadStorageType = Hadoop and loadStorageType = Hadoop):
To use the Hadoop source and target connector types, you will need to configure persistent volumes using the nfsHadoopStorage options, then uncomment and set the values of unloadHadoopMount and loadHadoopMount to true. If these values are set to true, a volume containing the Hadoop configuration files will need to be mounted, via persistent volume claims, inside the file-connector unload and load services.
These can be configured in one of the following ways, each of which involves setting/overriding some properties in the values-file-connector.yaml configuration file:
nfsUnloadStorageHost, nfsUnloadStorageExportPath, nfsLoadStorageHost, and nfsLoadStorageExportPath: Set values for these properties if the cluster needs to mount an NFS shared path from an NFS server.
Note: Installing the helm chart with these properties set will create a persistent volume on the cluster. As such, the user installing the helm chart should either be a cluster admin or should have the privileges to be able to create persistent volume on the cluster.
unloadStoragePvcName and loadStoragePvcName: Set these properties if the cluster needs to bind the pods to a persistent volume claim. Note that until this PVC is bound to a backing PV, the pods will not start getting created and as such, the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with PVC.
unloadStoragePvName, unloadStorageClass, loadStoragePvName, and loadStorageClass: Set these properties if the cluster needs to bind the pods to a persistent volume with the associated storage class name. Once the helm chart installation starts, a PVC will be created that is managed by HELM.
The following properties are supporting/optional properties that can be overridden along with the above properties:
unloadStorageMountOption and loadStorageMountOption: If nfsUnloadStorageHost, nfsUnloadStorageExportPath, nfsLoadStorageHost, and nfsLoadStorageExportPath are configured, set the appropriate mount option that you would like the cluster to use to mount the storage. Uncomment the line for nfsvers=4.2.
unloadStorageSize, unloadStorageAccessMode, loadStorageSize, and loadStorageAccessMode: Persistent Volume claims can request specific storage capacity size and access modes.
Optionally, if you would like to use PySpark as the data writer type, you may configure it under the unload and load service property values by uncommenting the line and setting dataWriterType: pyspark.
To enable the staging push feature for unload and load services, set skipUnloadWriters and skipLoadWriters to true. Alternatively, you can provide the format file instead of using the staging push feature. To enable this option, set userProvidedFormatFile to true.
Note: Configurations such as dataWriterType, skipLoadWriters and userProvidedFormatFile can now be configured independently for each job using the source_configs and target_configs in job configuration.
(Optional) Configure the service database volumes
A volume will need to be mounted, via persistent volume claims, inside the pods to provide the storage for the service databases for each Hyperscale Compliance service. By default, a persistent volume claim, using the default storage class, will be requested on the cluster. This can be configured, for some or all services, in one of the following ways, each of which involves setting/overriding properties in the values.yaml configuration file:
[service-name].dbPvcName: Set this property if the cluster needs to bind the pods to a persistent volume claim. Note that the pods will not be created until this PVC is bound to a backing PV, so the cluster admin should ensure that the backing PV is either manually provisioned or dynamically provisioned based on the storage class associated with the PVC. The service database names default to controller-db, unload-db, masking-db, and load-db for the controller, unload, masking, and load services respectively.
[service-name].databaseStorageSize: Set this property if the cluster should request a PVC with a storage size other than the pre-configured size.
storageClassName: Set this property if the cluster should request a PVC using a specific storage class.
(Optional) Configure the cluster node for each service
By default, pods will be scheduled on the node(s) determined by the cluster. Set a node name under the [service-name].nodeName property for the service(s) if you would like to request the cluster to schedule pods on particular node(s).
(Optional) Set resource requests and limits
Some users may have default container settings as part of their Kubernetes or OpenShift infrastructure management. Sometimes, it is important to alter those settings for Hyperscale containers. You can configure resource requests and limits for each Hyperscale container like the following:
controller:
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "512Mi"
      cpu: "500m"
The above example is only for the controller service. You can configure properties for other services (load, unload, and masking) in the same way. Note that the example above includes sample values, and you may need to contact your infrastructure team to determine these values.
Install the Helm Chart
Once the desired properties have been set/overridden, proceed to install the helm chart by running:
helm install hyperscale-helm <directory path of the extracted chart> -f values-[connector-type].yaml
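As one way to fill in the placeholders in the command above, the sketch below maps a connector type to the matching values file shipped in the extracted chart directory. The helper name values_file_for and the chart location ./hyperscale-helm are assumptions made for illustration.

```shell
# Hypothetical helper: map a connector type to its values file name,
# based on the files listed in the extracted chart directory.
values_file_for() {
  case "$1" in
    oracle|mssql|mongo|file-connector)
      echo "values-$1.yaml"
      ;;
    *)
      echo "unknown connector type: $1" >&2
      return 1
      ;;
  esac
}

# Example, assuming the chart was extracted to ./hyperscale-helm:
# chart_dir="./hyperscale-helm"
# helm install hyperscale-helm "$chart_dir" -f "$chart_dir/$(values_file_for oracle)"
```

Rejecting unknown connector types up front avoids passing a non-existent file to helm install.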
Check for the Successful Installation
After installing the helm chart, check the status of the helm chart and the pods using the following commands:
$ helm list
NAME             NAMESPACE  REVISION  UPDATED                                  STATUS    CHART                   APP VERSION
hyperscale-helm  default    1         2023-04-17 05:38:17.639357049 +0000 UTC  deployed  hyperscale-helm-18.0.0
$ kubectl get pods --namespace=hyperscale-services
NAME                                 READY  STATUS   RESTARTS  AGE
controller-service-65575b6458-2q9b4  1/1    Running  0         125m
load-service-5c644b9cc8-g9fs8        1/1    Running  0         125m
masking-service-7ddfd49c8f-5j2q5     1/1    Running  0         125m
proxy-5bd8d8f589-gkx8g               1/1    Running  0         125m
unload-service-55b5bd8cc8-7z95b      1/1    Running  0         125m
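For a scripted check, the pod listing can be filtered for anything that is not yet running. This is a rough sketch: pods_not_running is a hypothetical helper, and it assumes the default column layout of kubectl get pods shown above, with STATUS as the third column.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the names of pods whose STATUS column is not "Running".
pods_not_running() {
  # Skip the header row (NR > 1), then compare the third column.
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Example:
# kubectl get pods --namespace=hyperscale-services | pods_not_running
```

Empty output means every listed pod reports Running; any names printed are candidates for kubectl describe pod.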
Creating ingress controller and ingress resource
After successfully deploying the Hyperscale Compliance services on the Kubernetes cluster, the final step involves creating an ingress route to manage external traffic to the services efficiently. For instructions, follow the steps as documented in the Ingress Setup.