Late Data Best Practices
When setting up your kdb Insights Database for late data, there is more to consider than how to enable it. Here are some common pitfalls and tips to keep in mind.
1) Ensure that the local DAPs have access to enough memory to store late data.
When late data is enabled, the IDB and HDB store in-purview data received from the stream in memory until the next EOX event, after which they can purge it and read it from disk when needed. To do this, they need enough RAM to keep the data in memory while still being able to service queries.
An important point when estimating the memory required is whether the system is configured with single mount DAPs or multi mount DAPs (see query configuration for more details). To size appropriately, you need to know the ingestion rate and how old the ingested data is expected to be. The RDB holds data ingested since the last EOI, the IDB holds in memory data ingested with timestamps between the last EOD and the last EOI, and the HDB holds in memory data that has a timestamp older than the last EOD time.
In cases where the ingestion rate is known but the time range of the data is unknown or varies significantly, the multi mount DAP may be easier to size since you can size the whole container that encapsulates all mounts and not worry about which particular DAP the data is in.
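The sizing reasoning above can be sketched as a rough back-of-the-envelope calculation. This is only an illustration, not a kdb Insights tool: the function name, the `TierShare` type, the `headroom` factor, and the assumption that the RDB and IDB accumulate data over one EOI interval while the HDB accumulates over one EOD interval are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class TierShare:
    """Assumed fraction of ingested records landing in each tier's purview."""
    rdb: float  # ingested since the last EOI
    idb: float  # timestamps between the last EOD and the last EOI
    hdb: float  # timestamps older than the last EOD


def estimate_late_data_memory(records_per_sec, bytes_per_record,
                              eoi_seconds, eod_seconds, share, headroom=2.0):
    """Rough per-tier memory estimate in bytes (illustrative assumptions only).

    Assumes the RDB and IDB hold data for one EOI interval and the HDB holds
    late data for one EOD interval. `headroom` is a hypothetical multiplier
    that leaves room for servicing queries.
    """
    bytes_per_sec = records_per_sec * bytes_per_record
    per_tier = {
        "rdb": bytes_per_sec * eoi_seconds * share.rdb * headroom,
        "idb": bytes_per_sec * eoi_seconds * share.idb * headroom,
        "hdb": bytes_per_sec * eod_seconds * share.hdb * headroom,
    }
    # For a multi mount DAP, size the single container that holds all mounts.
    per_tier["multi_mount_total"] = sum(per_tier.values())
    return per_tier
```

With single mount DAPs, each per-tier figure would be compared against that DAP's own memory limit. With a multi mount DAP, only the total matters, which is why it can be easier to size when the split between tiers is uncertain.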
2) Set pctMemThreshold such that RDB and IDB can react to an unexpected flood of data.
The pctMemThreshold is a number between 0 and 1, representing the fraction of its in-memory cache that the DAP should allow table records to occupy. This pctMemThreshold is converted to a record count, maxRecordIntv, that the DAP expects it can ingest before hitting that cap. When the DAP has ingested maxRecordIntv records within an interval, an emergency EOI is triggered to save the process from running out of memory.
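To make the relationship between the threshold and the record count concrete, here is a minimal sketch. The exact conversion the DAP performs is not described here, so the formula below (threshold times a memory limit, divided by an average record size) is an assumption, and both function names and the `memory_limit_bytes` and `avg_record_bytes` parameters are hypothetical.

```python
def max_records_per_interval(pct_mem_threshold, memory_limit_bytes, avg_record_bytes):
    """Illustrative conversion of a memory fraction into a record-count cap.

    Assumption: the cap is the share of memory allowed for table records
    divided by an average record size. The real DAP may compute it differently.
    """
    if not 0 < pct_mem_threshold <= 1:
        raise ValueError("pct_mem_threshold must be in the range (0, 1]")
    if avg_record_bytes <= 0:
        raise ValueError("avg_record_bytes must be positive")
    return int(pct_mem_threshold * memory_limit_bytes // avg_record_bytes)


def needs_emergency_eoi(records_this_interval, max_record_intv):
    """True once the records ingested in the current interval reach the cap."""
    return records_this_interval >= max_record_intv
```

For example, under these assumptions a threshold of 0.5 on an 8 GiB limit with 128-byte records gives a cap of 33,554,432 records per interval. A lower threshold triggers the emergency EOI sooner and leaves more memory for queries.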
3) In a situation where there is a large influx of HDB purview data, manually trigger emergency EODs.
If the amount of late data ingested during the day exceeds the available HDB memory, an early EOD writedown needs to be triggered manually, possibly more than once. If this is not done (or is done too late), the HDB enters low memory mode and will not ingest any additional data until the next reload. When in this state, queries to the HDB return an AC code of .kxi.response.ac.MEMORY, and the ai contains information about the number of records that were ignored while in this low memory state.
4) If setting up an object storage tier, ensure that data is never as late as the data in the object tier.
Currently the Storage Manager does not support the writedown of late data updates to an object storage tier, so any data ingested that is destined for the object tier will be unqueryable after the next EOD.
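One way to guard against this is to screen records upstream, before they are published to the database. The sketch below is a hypothetical pre-ingest check, not part of kdb Insights: it assumes each record is a dict with a timezone-aware `timestamp` key and that the object tier holds data older than a fixed age (`object_tier_age`), which may not match how your tiers are configured.

```python
from datetime import datetime, timedelta, timezone


def split_by_object_tier(records, now, object_tier_age):
    """Separate records that are safe to ingest from those that are too late.

    Assumption: data older than `now - object_tier_age` already lives in the
    object tier, so late records at or before that cutoff would not be
    written down and should be diverted instead of ingested.
    """
    cutoff = now - object_tier_age
    ingestible, too_late = [], []
    for rec in records:
        if rec["timestamp"] <= cutoff:
            too_late.append(rec)
        else:
            ingestible.append(rec)
    return ingestible, too_late
```

Records in the `too_late` list can then be logged or routed to a separate process rather than being ingested and later becoming unqueryable.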