Delete Rows in a Table - BETA
This page describes how to delete rows of data from a table at rest.
Beta - For evaluation and trial use only
Delete rows in a table is currently in beta. Take note of the following:
- Deleting data from a database is permanent. We recommend that you take a backup of the database before performing a delete.
- Deleting rows in a table doees not apply to data in memory, such as in the RDB.
- Deleting data from the object store tier is not supported. When the date interval defined by
startTS
/endTS
overlaps with dates in the object store, the whole delete action is rejected. - Attempting to undo a delete could, in some instances, not complete successfully.
- Refer to the standard terms.
- We invite you to use this beta feature and to provide feedback using the portal.
Introduction
kdb Insights provides a mechanism to delete rows from a table where it resides at rest. This applies to data stored on IDB and HDB, but not RDB.
Also, note that this is targeting deleting rows from a table as opposed to either deleting the complete table or deleting a column from a table both of which should be handled using a schema change.
Delete is permanent
Deleting data from a database is permanent. We recommend that you take a backup of the database before performing a delete.
Before applying the delete operation, we recommend using getData
to test that the filter matches exactly the unwanted rows. Also, after delete, verify the remaining data.
Prerequisites
Entitlements are not in scope for kdb Insights SDK.
Performance
Deleting data from the historical tier can be a long running operation because it needs to read and write data that resides on disk. A few scenarios:
- Deleting all data in a table for a particular day is very fast due to the data being removed is on-disk
- Deleting a section of a table, for example, deleting
AAPL
from a quote table, is much slower since it needs to amend each column on disk whilst also filtering outAAPL
rows.
Best practice tip
Should your delete operation span multiple dates, it is advisable to delete data from one date first to ascertain how long this takes and then proceed to delete the remaining dates in groups. If other operations are in progress on your platform, for example, batch ingests and end of day operations, bear in mind that these will be queued behind the delete operation.
Similarly, since running delete is a memory intensive operation, it is advisable to batch smaller deletes together where possible and run as a single delete session.
Note that the speed with which the delete is performed depends on several variables such as RAM and CPU assigned to the SM, as well as compression settings among other factors.
Late Data
Late data is ON by default in kdb Insights Enterprise and depending on the nature or your data, there could be data held in the memory of the HDB DAP process. Since the delete API only targets data on disk then it may be necessary to first perform a manual EOD operation to make sure all data is flushed to disk before running the delete operation.
Perform a delete
A delete can be performed by REST directly on the Service Gateway.
Hostname
In the example below, $SG
is the hostname and port of the Service Gateway process on which to perform the delete operation.
Enter the following code to perform a delete:
curl -X POST "$SG/delete" \
-H 'Content-Type: application/json' \
-d '{"session":"","table":"","startTS":"","endTS":"","filter":[]}'
Example success response
HTTP/1.1 200
Content-Type: application/json
Connection: close
Content-Length: 99
{"name":"54f5cf87-dbc2-4336-d109-173ed56c0b86","updtype":"delete","status":"processing","error":[]}
Example failure response
HTTP/1.1 400
Content-Type: application/json
Connection: close
Content-Length: 48
{"code":"400","text":"missing","details":"body"}
Example audit log
The example audit log below shows the information included. Refer to audit log for more details.
2025-04-02 11:02:57.030 [Audit] INFO Executing api=batchUpdDelete auditID=example_audit_id table=trade startTS=2025.04.03D00:00:00.000000000 endTS=2025.04.03D23:59:59.999999999 filter=()
Delete row parameters
The following table describes the supported parameters when executing a delete command:
Name | Required | Type | Description |
---|---|---|---|
session |
no | string | An optional delete session name, which can be used to undo the delete. If you do not provide a session name, one is generated and returned. |
table |
yes | string | The name of the table to perform this delete operation on. |
startTS |
no | timestamp | The inclusive start boundary for performing this delete operation. |
endTS |
no | timestamp | The exclusive end boundary for performing this delete operation. Read below for how to specify a timestamp. Since deleting from the object store tier is not supported, the time window must restrict the delete to the dates in the local HDB tiers. |
filter |
yes | list | A list of triples used for selecting specific data to be deleted. If a filter is provided, any data that matches the filter is deleted. If filter is [], all data between the start and end timestamps is deleted. Read filtering below for more details. |
Warning
The whole deletion is rejected if the time window overlaps with dates in object store.
Specify timestamps
To retrieve the start and end times using bash
, use date -u
to print formatted dates:
# Define a time interval within the hour
startTS=$(date -u '+%Y.%m.%dD%H:00:00')
endTS=$(date -u '+%Y.%m.%dD%H:%M%:%S')
Times may be extended to contain a variable number of trailing digits up to nanosecond precision.
Filtering
Filtering can be used to delete specific data that matches the filter. To find out more about how filtering works, refer to detailed information in Get Data filtering.
Session status
To check the status of a delete session in progress, you can send a GET request to the SG:
curl "$SG/delete/54f5cf87-dbc2-4336-d109-173ed56c0b86"
Example response
{"name":"54f5cf87-dbc2-4336-d109-173ed56c0b86","pipeline":"","database":"Example assembly","updtype":"delete","status":"completed","details":[],"tbls":["trade"],"dates":["2024-12-08","2024-12-09"],"progress":{"cmdCurrent":"","cmdIndex":2,"cmdTotal":2,"subCurrent":"","subIndex":0,"subTotal":0},"error":"","updated":"2024-12-10T10:47:27.461244645"}
The status
field progresses through the following stages:
pending
queued
processing
completed
If the status is errored
, the delete operation was not performed and all rows remain in the database.
Rollback a delete
Performing a delete is an immediate action considered irreversible. We recommend that you create a backup of your database before performing a delete. Additionally, all RT logs since the time of issuing the delete must also be available for a successful restoration of all data.
When a delete is performed, it is sequenced in the source stream's control messages. If you try to undo a delete by restoring your database from a backup, the delete is replayed and the matching data is deleted again. To avoid this during a data recovery, the session of the delete can be ignored so that the delete is not performed and the data remains. To ignore a delete session during a recovery operation, set the following environment variable KXI_IGNORE_DELETE=<session>[,<session>,...]
.