
Upgrade kdb Insights Enterprise on Azure

This page explains how to upgrade kdb Insights Enterprise and your Kubernetes version on Azure.

KX Managed

If you deployed kdb Insights Enterprise using the Azure Marketplace, the Insights Managed Service (IMS) team will reach out to you to arrange a maintenance window for upgrades. IMS takes care of upgrading kdb Insights Enterprise, its third-party dependencies, and the Kubernetes version of your AKS cluster.

If you have any questions, please raise a ticket.

Rook-Ceph

If you deployed kdb Insights Enterprise through the Azure Marketplace, whether using the Managed or Unmanaged offering, upgrading Rook-Ceph from version 1.17 to 1.18 or later may result in known issues. Refer to the upgrade guidance at the bottom of this page before proceeding.

Azure Marketplace Monitoring Stack

Azure Marketplace deployments may use their own monitoring stack, relying on Azure Monitor and Azure Log Analytics. We recommend using Prometheus and Grafana instead, which can be installed using the kxi install monitoring command. When switching monitoring stacks, be sure to disable the Azure monitoring add-on by executing:

az aks disable-addons -a monitoring -n "$CLUSTER_NAME" -g "$RESOURCE_GROUP"

Upgrade AKS cluster

Rook-Ceph

If you have deployed rook-ceph, do not use the Azure AKS rolling upgrade feature; there are known issues with it. Reach out to the KX Support team for assistance.

Upgrade Kubernetes on kdb Insights Enterprise on Azure

This section explains how to upgrade your Kubernetes version, while keeping your kdb Insights Enterprise running on Azure.

The Kubernetes version can be upgraded automatically or manually. Learn more about upgrading your AKS cluster.
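As a hedged sketch of the automatic option, AKS clusters can be enrolled in an auto-upgrade channel with the az CLI. The resource names below are placeholders, and the live call requires an authenticated CLI, so it is shown commented out:

```shell
# Hedged sketch: enable automatic Kubernetes upgrades on an AKS cluster.
# RESOURCE_GROUP and CLUSTER_NAME are placeholders for your own values.
RESOURCE_GROUP="my-resource-group"
CLUSTER_NAME="my-insights-cluster"

enable_auto_upgrade() {
  # "stable" tracks the latest supported minor Kubernetes version.
  az aks update -g "$1" -n "$2" --auto-upgrade-channel stable
}

# Live call (requires an authenticated az CLI):
#   enable_auto_upgrade "$RESOURCE_GROUP" "$CLUSTER_NAME"
```

Whichever channel you choose, the manual prerequisite steps below still apply for kdb Insights Enterprise 1.12.0 or higher.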

Prerequisites

To perform a Kubernetes upgrade for kdb Insights Enterprise version 1.12.0 or higher, the following manual steps must be performed, regardless of whether you choose an automated or manual upgrade.

  • Check that you have sufficient CPU quota available in your Azure subscription to create one new node per node pool.
  • Scale the Istio ReplicaSet from 1 to 3 replicas:

    1. Set the Azure subscription containing your kdb Insights Enterprise cluster.

      az account set --subscription <SUBSCRIPTION>
      
    2. Get Azure cluster credentials.

      az aks get-credentials --resource-group <RESOURCE GROUP> --name <KXI KUBERNETES CLUSTER NAME> --overwrite-existing
      
    3. Set the minReplicas value to 3 for the Istio HorizontalPodAutoscaler (HPA).

      kubectl patch hpa istiod -n istio-system --type=merge -p '{"spec":{"minReplicas":3}}'
      
  • To prevent the upgrade from becoming stuck, increase the Pod disruption budgets (PDBs) from 0 to 1 where applicable:

    1. Set the MODIFIED_PDBS_FILE environment variable to specify the file path where all patched PDBs are stored. If this variable is not set, the default path of /tmp/modified_pdbs.txt is used.

    2. Download the patch_pdb.sh script file which updates PDBs to allow pod disruption during the upgrade.

    3. Navigate to the folder where the script was downloaded, make it executable, and run it.

      chmod +x patch_pdb.sh
      ./patch_pdb.sh
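
The MODIFIED_PDBS_FILE step above can be sketched as follows; the directory path is illustrative, and the default /tmp/modified_pdbs.txt applies when the variable is unset:

```shell
# Hedged example: keep the patched-PDB record in a custom location instead
# of the default /tmp/modified_pdbs.txt (the path below is illustrative).
export MODIFIED_PDBS_FILE="${HOME}/insights-upgrade/modified_pdbs.txt"
mkdir -p "$(dirname "$MODIFIED_PDBS_FILE")"
echo "Patched PDBs will be recorded in $MODIFIED_PDBS_FILE"
```

Run patch_pdb.sh from the same shell so the script picks up the exported variable.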
      

Upgrade

  1. If the cluster requires a manual update, do the following:

    • Read the documentation: Upgrade an AKS cluster.
    • Access the Azure portal.
    • Navigate to kdb Insights Enterprise Azure Kubernetes Service > Settings > Cluster Configuration.
    • Click Upgrade version to trigger the manual update of Kubernetes.
  2. Monitor the progress of the upgrade.

  3. Check that all nodes run the desired version:

    kubectl get nodes -o json | jq -r '.items[] | "\(.status.nodeInfo.kubeletVersion) \(.metadata.name)"'
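
For clusters updated from the command line rather than the portal, a hedged sketch using the az CLI (resource names are placeholders; the live call is shown commented out because it requires an authenticated CLI):

```shell
# Hedged CLI alternative to the portal upgrade steps above.
# RESOURCE_GROUP and CLUSTER_NAME are placeholders for your own values.
RESOURCE_GROUP="my-resource-group"
CLUSTER_NAME="my-insights-cluster"

upgrade_cluster() {
  # Show the Kubernetes versions this cluster can move to.
  az aks get-upgrades -g "$1" -n "$2" -o table
  # Trigger the upgrade to the chosen version.
  az aks upgrade -g "$1" -n "$2" --kubernetes-version "$3"
  # Poll overall progress; "Succeeded" means the upgrade finished.
  az aks show -g "$1" -n "$2" --query provisioningState -o tsv
}

# Live call (pick a version reported by 'az aks get-upgrades'):
#   upgrade_cluster "$RESOURCE_GROUP" "$CLUSTER_NAME" "<target version>"
```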
    

Post-Upgrade (optional)

  1. Set the minReplicas value back to 1 for the Istio HPA:

    kubectl patch hpa istiod -n istio-system --type=merge -p '{"spec":{"minReplicas":1}}'
    
  2. If the post-upgrade tasks are run in a different shell than the prerequisite tasks, set the MODIFIED_PDBS_FILE environment variable to the same value that was used before. If this variable is not set, the default path of /tmp/modified_pdbs.txt is used.

  3. Download the revert_pdb.sh script file which reverts the PDBs to disallow pod disruption.

  4. Navigate to the folder where the script was downloaded, make it executable, and run it.

    chmod +x revert_pdb.sh
    ./revert_pdb.sh
    

Upgrade cert-manager on Azure using Helm

  1. Get the recommended version of cert-manager from the kdb Insights Enterprise release page.

  2. Upgrade cert-manager using Helm:

    CERT_MANAGER_VERSION="<Your Version>"
    
    helm upgrade -n cert-manager --install "cert-manager" "oci://quay.io/jetstack/charts/cert-manager" --version "$CERT_MANAGER_VERSION" --reuse-values
    
  3. Verify that the cert-manager pods are running:

    kubectl get pods -l app.kubernetes.io/instance=cert-manager
    

Upgrade ingress-nginx on Azure using Helm

  1. Get the recommended version of ingress-nginx from the kdb Insights Enterprise release page.

  2. Add the ingress-nginx Helm repo to your Helm repositories:

    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo update
    
  3. Upgrade using Helm:

    INGRESS_NGINX_VERSION="<Your Version>"
    
    helm upgrade -n ingress-nginx --install "ingress-nginx" "ingress-nginx/ingress-nginx" --version "$INGRESS_NGINX_VERSION" --reuse-values
    
  4. Verify that the pods are running:

    kubectl get pods -l app.kubernetes.io/instance=ingress-nginx
    

Upgrade nginx-community on Azure using Helm

  1. Get the recommended version of nginx-community from the kdb Insights Enterprise release page.

  2. Add the nginx-stable Helm repo to your Helm repositories:

    helm repo add nginx-stable https://helm.nginx.com/stable
    helm repo update
    
  3. Upgrade using Helm:

    NGINX_VERSION="<Your Version>"
    
    helm upgrade -n nginx-community --install "nginx-community" "nginx-stable/nginx-ingress" --version "$NGINX_VERSION" --reuse-values
    
  4. Verify that the pods are running:

    kubectl get pods -l app.kubernetes.io/instance=nginx-community
    

Upgrade Istio on Azure

Istio and its version are managed by the encryption-in-flight management service task. To upgrade it, upgrade the insights-on-k8s Helm chart version on your cluster:

  1. Install the kxi-cli. Refer to the CLI install instructions.

  2. Connect to your cluster using the CLI. Follow the configuration and authentication guides.

  3. Upgrade insights-on-k8s to the latest:

    kxi --debug install upgrade --version <your current insights version>
    

    To specify a particular insights-on-k8s version:

    kxi --debug install upgrade --version <your current insights version> --insights-on-k8s-version <specify insights on k8s version>
    
  4. The encryption-in-flight install task is triggered automatically during installation. Verify that it completed successfully:

    kubectl get tasks --namespace kxi-management
    

    Look for the most recent encryption-in-flight task with status TaskFinished, for example:

    deploy-encryption-in-flight-task-3   encryption-in-flight   deploy            TaskFinished   3m
    
  5. Istio is upgraded to the latest supported version via the encryption-in-flight task. Confirm the Istio version:

    helm ls -n istio-system
    

Upgrade rook-ceph on Azure via Helm

For rook-ceph upgrades, follow the official upgrade guide.

Note

The steps below apply generally, but always follow any version-specific instructions in the official documentation.

  1. Add the rook Helm repo to your Helm repositories:

    helm repo add rook https://charts.rook.io/release
    
  2. Get the recommended version of rook-ceph from the kdb Insights Enterprise release notes, then set it along with the chart path:

    DESIRED_ROOK_VERSION=<version>
    ROOK_CHARTPATH=rook/rook-ceph
    
  3. Set the namespace:

    kubectl config set-context --current --namespace=rook-ceph
    
  4. Upgrade rook-ceph:

    helm upgrade --install "rook-ceph" "$ROOK_CHARTPATH" --version "$DESIRED_ROOK_VERSION" --reuse-values
    
  5. Verify that all pods are running by listing all pods in the rook-ceph namespace:

    kubectl get pods -n rook-ceph
    

Upgrade Rook-Ceph to 1.18+ on Azure

When upgrading Rook-Ceph alongside the main upgrade, the following issue may occur and require manual intervention.

Issue: The CSI driver tolerations may not be properly configured after upgrading Rook-Ceph, potentially causing scheduling issues on certain nodes.

Resolution: Apply the following patch to the Rook-Ceph CSI driver:

kubectl patch driver.csi.ceph.io rook-ceph.cephfs.csi.ceph.com -n rook-ceph --type='merge' -p '
{
  "spec": {
    "nodePlugin": {
      "tolerations": [
        {
          "effect": "NoSchedule",
          "key": "CriticalAddonsOnly",
          "operator": "Exists"
        }
      ]
    }
  }
}'

This command adds the required toleration to allow the CSI node plugin to run on nodes with the CriticalAddonsOnly taint.

Switch to F5 Ingress from ingress-nginx

Because ingress-nginx is being retired, you should migrate to the F5 NGINX Ingress Controller, starting with version 1.17.7. From version 1.19.0 onwards, F5 NGINX is the default ingress controller for all new installations and is the recommended option going forward.

F5 NGINX on 1.18

If you have switched to F5 ingress, future upgrades must be to version 1.18.3 or later; versions 1.18.0, 1.18.1, and 1.18.2 do not support F5 ingress.

Prerequisites

Before beginning the migration, ensure that:

  • The helm and yq (v4.45.1 or above) utilities are installed
  • You have sufficient cluster permissions to create namespaces, deploy resources, and uninstall existing controllers
  • Your current installation uses ingress-nginx as the ingress controller
  • Your cluster has been upgraded to kdb Insights Enterprise 1.17.7 or above
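
A quick preflight for the prerequisites above can be scripted; this sketch only checks that the tools are on PATH (kubectl is assumed because the later steps use it) and leaves the version comparison as a manual check:

```shell
# Hedged preflight: confirm the required utilities are installed before
# starting the migration. kubectl is an assumption, used by later steps.
missing=0
for tool in helm yq kubectl; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
    missing=1
  fi
done
# yq must be v4.45.1 or above; print the version for a manual check.
command -v yq >/dev/null 2>&1 && yq --version || true
[ "$missing" -eq 0 ] && echo "preflight passed" || echo "install missing tools first"
```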

Migration Steps

The migration procedure varies based on your installation configuration.

  • For deployments installed with a public IP, follow the steps in the Public IP section.
  • For deployments using a private IP, follow the Private IP section.

First, add the nginx-stable Helm repository to your Helm repositories:

helm repo add nginx-stable https://helm.nginx.com/stable
helm repo update

Public IP

  1. Create the base configuration file f5-values-default.yaml with the recommended settings for the F5 NGINX Ingress Controller:

    controller:
      kind: daemonset
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 512Mi
      ingressClass:
        create: true
        name: nginx-community
        setAsDefaultIngress: true
      enableSnippets: true
      nodeSelector:
        agentpool: system
    
  2. Extract and preserve the relevant service and toleration configurations from your existing ingress-nginx deployment. This ensures that node affinity settings and service configurations are carried over to the new controller:

    helm get values -n ingress-nginx ingress-nginx | yq '{"controller": {"service": .controller.service, "tolerations": .controller.tolerations}}' > f5-overlay.yaml
    
  3. Deploy the F5 NGINX Ingress Controller using Helm. The controller will be installed in a dedicated namespace and will coexist with ingress-nginx during the transition period:

    helm upgrade --install nginx-community nginx-stable/nginx-ingress \
    --version 2.4.1 \
    -n nginx-community \
    -f ./f5-values-default.yaml \
    -f ./f5-overlay.yaml \
    --create-namespace
    
  4. Update insights-values to configure kdb Insights Enterprise to use the new F5 NGINX ingress controller. This modifies the global ingress settings to point to the newly deployed controller:

    helm get values insights -n insights -o yaml | \
    yq '. * {"global": {"ingress": {"controllerType": "f5-nginx", "class": "nginx-community"}}}' \
    > insights-vals.yaml
    
  5. Execute the kdb Insights Enterprise upgrade with the updated configuration:

    kxi install upgrade --version <your insights version> --skip-packages -f ./insights-vals.yaml
    
  6. Create a backup of your existing ingress-nginx configuration before proceeding. This allows for recovery if any issues arise during the migration:

    helm get values -n ingress-nginx ingress-nginx > ingress-nginx-values-backup.yaml
    
  7. Once the kdb Insights Enterprise upgrade has completed successfully, remove the legacy ingress-nginx controller from the cluster:

    helm uninstall ingress-nginx -n ingress-nginx
    
  8. Allow time for the F5 NGINX Ingress Controller to fully take over ingress traffic. This transition may take up to 5 minutes. Monitor progress by checking the services in the nginx-community namespace; ensure all services reach a stable running state and that none remain in a pending state.

  9. After the deployment has stabilized, access kdb Insights Enterprise and redeploy your packages. The CLI outputs the required commands upon completion of the upgrade process:

    Packages and assemblies were not automatically re-applied.
    To manually re-apply them, run: kxi pm deploy
    Packages: ['example-package']
    
  10. If you have deployed the Monitoring Stack and a Grafana ingress, also follow these migration steps.
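
The wait in step 8 can be scripted. A minimal sketch that polls until no service reports Pending; the kubectl usage shown in the comment is illustrative and requires cluster access:

```shell
# Poll a status command until its output no longer contains "Pending",
# up to a fixed number of attempts. Any non-Pending output is treated
# as settled.
wait_until_settled() {
  cmd=$1
  tries=${2:-30}   # 30 attempts x 10s = ~5 minutes, matching step 8
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    out=$($cmd)
    case "$out" in
      *Pending*) i=$((i + 1)); sleep "$delay" ;;
      *) echo "settled"; return 0 ;;
    esac
  done
  echo "timed out"
  return 1
}

# Illustrative live call (requires cluster access):
#   wait_until_settled "kubectl get svc -n nginx-community" 30 10
```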

Private IP

  1. Create the base configuration file external-f5-values-default.yaml with the recommended settings for the F5 NGINX Ingress Controller:

    controller:
      kind: daemonset
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 512Mi
      ingressClass:
        create: true
        name: external-nginx-community
        setAsDefaultIngress: false
      enableSnippets: true
      nodeSelector:
        agentpool: system
    
  2. Extract and preserve the relevant service and toleration configurations from your existing external-ingress-nginx deployment. This ensures that node affinity settings and service configurations are carried over to the new controller:

    helm get values -n external-ingress-nginx external-ingress-nginx | yq '{"controller": {"service": .controller.service, "tolerations": .controller.tolerations}}' > external-f5-overlay.yaml
    
  3. Deploy the F5 NGINX Ingress Controller using Helm. The controller will be installed in a dedicated namespace and will coexist with external-ingress-nginx during the transition period:

    helm upgrade --install external-nginx-community nginx-stable/nginx-ingress \
    --version 2.4.1 \
    -n external-nginx-community \
    -f ./external-f5-values-default.yaml \
    -f ./external-f5-overlay.yaml \
    --create-namespace
    
  4. Create the base configuration file f5-values-default.yaml with the recommended settings for the F5 NGINX Ingress Controller:

    controller:
      kind: daemonset
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 512Mi
      ingressClass:
        create: true
        name: nginx-community
        setAsDefaultIngress: true
      enableSnippets: true
      nodeSelector:
        agentpool: system
    
  5. Extract and preserve the relevant service and toleration configurations from your existing ingress-nginx deployment. This ensures that node affinity settings and service configurations are carried over to the new controller:

    helm get values -n ingress-nginx ingress-nginx | yq '{"controller": {"service": .controller.service, "tolerations": .controller.tolerations}}' > f5-overlay.yaml
    
  6. Deploy the F5 NGINX Ingress Controller using Helm. The controller will be installed in a dedicated namespace and will coexist with ingress-nginx during the transition period:

    helm upgrade --install nginx-community nginx-stable/nginx-ingress \
    --version 2.4.1 \
    -n nginx-community \
    -f ./f5-values-default.yaml \
    -f ./f5-overlay.yaml \
    --create-namespace
    
  7. Update your insights-values to configure kdb Insights Enterprise to use the new F5 NGINX ingress controller. This modifies the global ingress settings to point to the newly deployed controller:

    helm get values insights -n insights -o yaml | \
    yq '. * {"global": {"ingress": {"controllerType": "f5-nginx", "class": "nginx-community"}}}' \
    > insights-vals.yaml
    
  8. Execute the kdb Insights Enterprise upgrade with the updated configuration:

    kxi install upgrade --version <your insights version> --skip-packages -f ./insights-vals.yaml
    
  9. Create a backup of your existing ingress-nginx configuration before proceeding. This allows for recovery if any issues arise during the migration:

    helm get values -n ingress-nginx ingress-nginx > ingress-nginx-values-backup.yaml
    helm get values -n external-ingress-nginx external-ingress-nginx > external-ingress-nginx-values-backup.yaml
    
  10. Once the kdb Insights Enterprise upgrade has completed successfully, remove the legacy ingress-nginx controller from the cluster:

    helm uninstall ingress-nginx -n ingress-nginx
    helm uninstall external-ingress-nginx -n external-ingress-nginx
    
  11. Allow time for the F5 NGINX Ingress Controller to fully take over ingress traffic. This transition may take up to 5 minutes. Monitor progress by checking the services in the nginx-community and external-nginx-community namespaces; ensure all services reach a stable running state and that none remain in a pending state.

  12. After the deployment has stabilized, access kdb Insights Enterprise and redeploy your packages. The CLI outputs the required commands upon completion of the upgrade process:

    Packages and assemblies were not automatically re-applied.
    To manually re-apply them, run: kxi pm deploy
    Packages: ['example-package']
    
  13. If you have deployed the Monitoring Stack and a Grafana ingress, also follow these migration steps.

Rollback to Ingress-Nginx

If you need to revert to Ingress-Nginx for any reason, follow these steps:

  1. Reinstall Ingress-Nginx using the backup values.

    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm repo update
    
    helm upgrade --install \
        -f "ingress-nginx-values-backup.yaml" \
        --namespace "ingress-nginx" \
        "ingress-nginx" "ingress-nginx/ingress-nginx" --version "4.11.5" \
        --create-namespace
    
  2. If your cluster uses a private IP address, also perform the following:

    helm upgrade --install \
        -f "external-ingress-nginx-values-backup.yaml" \
        --namespace "external-ingress-nginx" \
        "external-ingress-nginx" "ingress-nginx/ingress-nginx" --version "4.11.5" \
        --create-namespace
    
  3. Reconfigure your Insights instance:

    helm get values insights -n insights -o yaml | \
        yq '. * {"global": {"ingress": {"controllerType": "community-nginx", "class": "nginx"}}}' \
        > insights-vals.yaml
    
    
    kxi install upgrade --version <your insights version> --skip-packages -f ./insights-vals.yaml
    
  4. Once the configuration is updated, run:

    helm uninstall nginx-community -n nginx-community
    
  5. Allow time for the Ingress-Nginx Controller to fully take over ingress traffic. This transition may take up to 5 minutes. Monitor progress by checking the services in the ingress-nginx namespace; ensure all services reach a stable running state and that none remain in a pending state.

  6. Once the deployment has stabilized, access kdb Insights Enterprise and redeploy your packages. The CLI outputs the required commands at the end of the upgrade process:

    Packages and assemblies were not automatically re-applied.
    To manually re-apply them, run: kxi pm deploy
    Packages: ['example-package']
    
  7. If you have deployed the Monitoring Stack and a Grafana ingress, also follow these migration steps.