Backup and restore
The database stores all of the business-critical data in your kdb Insights application. It is recommended that backups are taken of the database periodically so that in the event of system failure, the database can be recovered. This page is a guide for creating snapshots of your database content for future recovery.
Data layout
The Storage Manager is responsible for storing data within different tiers based on your database's configuration. To create a backup of your entire system, a backup must be taken of each non-memory tier (for example, IDB and HDB).
Backup logs
Ensure that you back up your streaming logs in combination with your database backup. If you are using Reliable Transport, ensure that your archiver is configured to create appropriate backups. If you are using kdb-tick, you can follow this white paper for configuring appropriate log backups. Retain enough logs to replay all data between the last database backup and the point of failure.
Offline backup
Backing up deployment
In kdb Insights Enterprise, the CLI can be used to perform a backup and restore of a full deployment. This guide specifically provides the instructions and related caveats for backing up a database only. In this case, "database" refers to the IDB and HDB tiers of your database. RDB data cannot be backed up through this method because it does not remain static. However, RDB data is still recoverable because it is fully captured in RT logs that can be replayed to restore the data.
The simplest and most comprehensive form of backup is to perform an offline backup when your system is not ingesting any data. This ensures a completely static database for a consistent backup. An offline backup involves stopping your running system at the appropriate time during data processing (once all EOI and EOD operations are complete), creating a copy of your data folder on each tier, and then restarting the system. The one exception to this is batch ingest, which could mutate the data on disk outside of an EOD operation. Do not run any batch ingests during a backup.
Running an offline backup may not be an option for systems that have 24/7 up-time requirements. See online backup below for further details on using the snapshot functionality.
Stopping after an end of day writedown
To ensure that all data has been written to disk and all tiers are in a consistent state, take the backup after an EOD completes. To confirm completion, review the EOI and EOD process logs for the message EOD complete, elapsed <time>. Waiting for this message means that the maximum amount of data has been written to disk, which minimizes the work needed during recovery. The exact timing of the backup does not matter, but it is best to take it once the end-of-day writedown has completed. To keep the backup consistent, it must take less time than the interval between two consecutive EOD operations.
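The log check described above can be automated. The following is a minimal sketch that scans log text for the EOD complete, elapsed message. The function name and the sample log lines are hypothetical, and the format of the elapsed time is an assumption, so the pattern only relies on the fixed part of the message.

```python
import re

# Matches the fixed part of the completion message quoted above. The format of
# the elapsed time that follows is not assumed, so it is captured loosely.
EOD_COMPLETE = re.compile(r"EOD complete, elapsed\s*(?P<elapsed>\S*)")


def find_eod_completions(log_text: str) -> list[str]:
    """Return the elapsed-time token from every 'EOD complete' message found."""
    return [m.group("elapsed") for m in EOD_COMPLETE.finditer(log_text)]


# Hypothetical log excerpt, for illustration only.
sample = "\n".join([
    "INFO starting EOD",
    "INFO EOD complete, elapsed 0D00:01:12.5",
    "INFO waiting for next interval",
])
completions = find_eod_completions(sample)
print("EOD completed" if completions else "EOD not yet complete")
```

An empty result means no completion message was found, in which case the backup should wait.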
Begin by stopping the running assembly that you want to create a backup of.
kubectl delete asm $ASSEMBLY_NAME
With the system offline, we can now create a backup of the data for each tier. This example uses the following mounts and tier configuration:
Mounts
mounts:
  rdb:
    type: stream
    baseURI: file://stream
    partition: none
  idb:
    type: local
    baseURI: file:///mnt/data/db/idb
    partition: ordinal
  hdb:
    type: local
    baseURI: file:///mnt/data/db/hdb
    partition: date
Tiers
tiers:
  - name: stream
    mount: rdb
  - name: idb
    mount: idb
    schedule:
      freq: 0D00:10:00 # every 10 minutes
  - name: hdb1
    mount: hdb
    schedule:
      freq: 1D00:00:00 # every day
      snap: 01:35:00 # at 1:35 AM
    retain:
      time: 2 days
  - name: hdb2
    mount: hdb
    store: file:///mnt/data/db/hdbtier2
    retain:
      time: 5 weeks
  - name: hdb3
    mount: hdb
    store: file:///mnt/data/db/hdbtier3
    retain:
      time: 3 months
Once the assembly has been stopped, a backup can be taken of each mount in the tiers configuration.
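To enumerate what needs to be copied, the example configuration can be reduced to a list of on-disk locations. The sketch below is illustrative only. It hard-codes the example mounts and tiers as Python dictionaries, and it rests on two assumptions that are not confirmed by this page: that a mount of type stream is the in-memory tier and is skipped, and that a tier with a store entry keeps its data there while other tiers use their mount's baseURI. The function name is hypothetical.

```python
# The example mounts and tiers from above, transcribed as Python data.
mounts = {
    "rdb": {"type": "stream", "baseURI": "file://stream"},
    "idb": {"type": "local", "baseURI": "file:///mnt/data/db/idb"},
    "hdb": {"type": "local", "baseURI": "file:///mnt/data/db/hdb"},
}
tiers = [
    {"name": "stream", "mount": "rdb"},
    {"name": "idb", "mount": "idb"},
    {"name": "hdb1", "mount": "hdb"},
    {"name": "hdb2", "mount": "hdb", "store": "file:///mnt/data/db/hdbtier2"},
    {"name": "hdb3", "mount": "hdb", "store": "file:///mnt/data/db/hdbtier3"},
]


def backup_locations(mounts: dict, tiers: list) -> dict:
    """Map each on-disk tier name to the path that would need a backup."""
    locations = {}
    for tier in tiers:
        mount = mounts[tier["mount"]]
        if mount["type"] == "stream":
            continue  # assumed in-memory tier: recovered from logs, not copied
        # Assumption: an explicit store overrides the mount's baseURI.
        uri = tier.get("store", mount["baseURI"])
        locations[tier["name"]] = uri.removeprefix("file://")
    return locations


for name, path in backup_locations(mounts, tiers).items():
    print(name, path)
```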
To create a backup, we need to mount the backing PVCs to create a copy of the data. The sample pod configuration below mounts the PVC of a single tier so that a backup can be created. Modify the example so that claimName points to the correct persistent volume claim.
Tier claim name
In kdb Insights Enterprise, your default tier claim name is the name of your assembly concatenated with the tier name, separated by a hyphen. For example, if your assembly is named finance and it has a tier called idb, the IDB tier claim name is finance-idb.
apiVersion: v1
kind: Pod
metadata:
  name: backup-pod
spec:
  containers:
    - name: ubuntu
      image: ubuntu
      tty: true
      stdin: true
      volumeMounts:
        - mountPath: /data
          name: database-volume
  volumes:
    - name: database-volume
      persistentVolumeClaim:
        claimName: myasm-idb # Set this to be the PVC of the tier you want to create a backup of
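Since one backup pod is needed per tier, it can help to generate the configuration rather than edit it by hand. The following is a minimal sketch that applies the default claim naming rule described above and fills it into the same pod definition. Both function names are hypothetical, and the sketch assumes the default claim names are in use.

```python
import json


def tier_claim_name(assembly: str, tier: str) -> str:
    """Default claim name: assembly name and tier name joined by a hyphen."""
    return f"{assembly}-{tier}"


def backup_pod_manifest(claim_name: str) -> dict:
    """Build the sample backup pod definition for the given PVC name."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "backup-pod"},
        "spec": {
            "containers": [{
                "name": "ubuntu",
                "image": "ubuntu",
                "tty": True,
                "stdin": True,
                "volumeMounts": [
                    {"mountPath": "/data", "name": "database-volume"},
                ],
            }],
            "volumes": [{
                "name": "database-volume",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


manifest = backup_pod_manifest(tier_claim_name("finance", "idb"))
print(json.dumps(manifest, indent=2))
```

Kubernetes accepts manifests in JSON as well as YAML, so the printed output can be saved to a file and applied in the same way as the YAML version.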
Save the pod configuration to a file (backup-pod.yaml in this example) and deploy the backup pod.
kubectl apply -f backup-pod.yaml
Create a backup archive of the data folder. The data folder contains a number of symbolic links, so it is archived before copying to ensure the links are preserved.
kubectl exec backup-pod -- sh -c "tar czf /data/backup.tar.gz /data/*"
The backup can then be downloaded using the following:
kubectl cp backup-pod:/data/backup.tar.gz backup.tar.gz
The backup.tar.gz file now contains a complete backup of a single tier. Repeat this process for each tier in your configuration.
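Because the archive exists to preserve symbolic links, it can be worth confirming that they survived before relying on the backup. The sketch below uses Python's standard tarfile module to list the symbolic links recorded in an archive. The function name is hypothetical, and the demonstration builds a throwaway archive with made-up directory names rather than real database content.

```python
import os
import tarfile
import tempfile


def list_symlinks(archive_path: str) -> dict[str, str]:
    """Map each symbolic link stored in the archive to its recorded target."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return {m.name: m.linkname for m in archive.getmembers() if m.issym()}


# Demonstration on a throwaway directory (not real database content).
with tempfile.TemporaryDirectory() as work:
    data = os.path.join(work, "data")
    os.makedirs(os.path.join(data, "2023.07.23"))
    os.symlink("2023.07.23", os.path.join(data, "current"))

    archive_path = os.path.join(work, "backup.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        # tarfile stores links as links unless dereference=True is requested.
        archive.add(data, arcname="data")

    links = list_symlinks(archive_path)

print(links)
```

Running list_symlinks against a downloaded backup.tar.gz and finding no entries where links were expected would suggest the links were followed or lost along the way.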
Online backup
Backups can be taken of a running system by taking a snapshot of the database. A snapshot is a point-in-time view of the database created with hard links. Taking a snapshot is a synchronous operation that suspends EOI and EOD operations while the snapshot is being taken. This ensures that no data changes during the snapshot, which would otherwise corrupt the backup.
The snapshot API can be accessed by connecting to the Storage Manager directly and running the snapshot REST API.
Storage Manager address
In the example below, $sm is the host and port of the Storage Manager that administers the on-disk data you are backing up. This can be accessed directly using the Storage Manager service name and port. The service name is the name of your deployed assembly with -sm as a suffix, and the default port for SM is 10001. For example, if your assembly is called trades, the address is trades-sm:10001.
POST http://$sm/snapshot
This API creates a snapshot of each tier of the database and returns each tier's name and snapshot location.
[
  {
    "tier": "idb",
    "snapRoot": "/data/db/idb/snapshot/20230723160308643935166",
    "inventory": "/data/db/idb/snapshot/20230723160308643935166/inventory"
  },
  {
    "tier": "hdb",
    "snapRoot": "/data/db/hdb/snapshot/20230723160308643935166",
    "inventory": "/data/db/hdb/snapshot/20230723160308643935166/inventory"
  }
]
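The request and response above can be handled from a script. The following is a minimal sketch using only the Python standard library. All three function names are hypothetical, and the sketch assumes the endpoint needs no request body or authentication and is reachable at the default service name and port described above.

```python
import json
import urllib.request


def snapshot_url(assembly: str, port: int = 10001) -> str:
    """Build the snapshot URL from the '<assembly>-sm' service name."""
    return f"http://{assembly}-sm:{port}/snapshot"


def snapshot_roots(response_body: str) -> dict[str, str]:
    """Map each tier name in the response to its snapRoot path."""
    return {entry["tier"]: entry["snapRoot"] for entry in json.loads(response_body)}


def take_snapshot(assembly: str) -> dict[str, str]:
    """POST to the snapshot endpoint and return the tier-to-snapRoot mapping.

    Assumes no request body or authentication is required.
    """
    request = urllib.request.Request(snapshot_url(assembly), method="POST")
    with urllib.request.urlopen(request) as response:
        return snapshot_roots(response.read().decode("utf-8"))
```

Calling take_snapshot("trades") from a pod inside the cluster would then return the locations to copy, one per tier. Parsing is kept in its own function so it can be checked against a saved response without a running system.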
To complete the backup, copy the contents of each snapRoot location to a safe storage location that can be used for recovery. To restore SM from a snapshot, copy each snapshot's snapRoot to the data directory of its tier and restart SM.
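The copy step can be sketched as follows, assuming the snapshot directory and the destination are both reachable as local paths from wherever the script runs. The function name and the destination directory are hypothetical, and the demonstration uses a throwaway directory with a placeholder file rather than a real snapshot.

```python
import shutil
import tempfile
from pathlib import Path


def copy_snapshot(snap_root: str, backup_dir: str) -> Path:
    """Copy a snapshot directory into backup_dir, keeping symbolic links as links."""
    source = Path(snap_root)
    target = Path(backup_dir) / source.name
    shutil.copytree(source, target, symlinks=True)
    return target


# Demonstration on a throwaway directory (not a real snapshot).
with tempfile.TemporaryDirectory() as work:
    snap = Path(work) / "snapshot" / "20230723160308643935166"
    snap.mkdir(parents=True)
    (snap / "inventory").write_text("placeholder")

    copied = copy_snapshot(str(snap), str(Path(work) / "safe-storage"))
    copied_name = copied.name
    copied_files = sorted(p.name for p in copied.iterdir())

print(copied_name, copied_files)
```

Keeping the timestamped directory name in the destination makes it easier to match a stored copy to the snapshot response it came from.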