A variety of qpks and SDKs can be used to publish data to RT.
Publishers use these qpks and SDKs to record messages to log files. These files are replicated to RT, which merges them into a single stream and replicates that stream to subscribers. Each publisher must be uniquely identifiable to the kdb Insights deployment it publishes to; this is done by setting a session name.
If a publisher requires an ingress point, such as a load balancer, to communicate with RT, it is classed as an external publisher. External publishers connect to kdb Insights Enterprise via an Information Service, which provides the RT external load balancer endpoints together with the SSL certificate authority, certificate, and key for a client that has enrolled via Keycloak.
The qpks and SDKs have the following characteristics:
- They include a low-level log writer.
- Because messages are written to log files, a loss of networking does not result in data loss, provided the network comes back and the publisher has enough disk space to queue messages during the dropout.
- The log files are written to a specific directory (defined by the session name) on the publisher's host.
- The log file names have the format log.X.Y, where X is the session number and Y is the roll-index.
- The session name (and session directory) is preserved between runs, but is not shared by multiple publishers.
- The log files are rolled each time they reach 1 GB.
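To make the log.X.Y naming concrete, here is a minimal Python sketch that splits a file name into its session number and roll-index and orders a set of names accordingly. The helper names (`parse_log_name`, `sort_logs`) are hypothetical and are not part of any qpk or SDK; the only thing taken from the text above is the name format.

```python
import re
from typing import NamedTuple, Optional

# Names of the form log.X.Y, as described in the list above.
LOG_NAME = re.compile(r"log\.(\d+)\.(\d+)")


class LogName(NamedTuple):
    session: int  # X: session number
    roll: int     # Y: roll-index


def parse_log_name(name: str) -> Optional[LogName]:
    """Return (session, roll) for a log file name, or None if it is not a log file."""
    m = LOG_NAME.fullmatch(name)
    if m is None:
        return None
    return LogName(int(m.group(1)), int(m.group(2)))


def sort_logs(names):
    """Order log files by session number, then roll-index, ignoring other files."""
    logs = [n for n in names if parse_log_name(n) is not None]
    return sorted(logs, key=parse_log_name)
```

Sorting on the parsed integers rather than on the raw strings keeps log.0.10 after log.0.2, which a plain string sort would get wrong.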
At startup, the SDKs do the following:
1. Run local filesystem checks on the log file directory.
2. Create a new file with the next session number for the defined session.
3. Start the replicators needed to move data from the local log file directory to each of the RT nodes, where the logs are merged.
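The second startup step can be pictured with the following Python sketch. It assumes, purely for illustration, that the next session number can be inferred from the log.X.Y files already present in the session directory; the SDKs keep their own session state, so this is a model of the observable naming behavior and not the SDKs' implementation. The function name `next_log_name` is hypothetical.

```python
import os
import re

# Names of the form log.X.Y (X = session number, Y = roll-index).
LOG_NAME = re.compile(r"log\.(\d+)\.(\d+)")


def next_log_name(session_dir: str) -> str:
    """Name the first log file of the next session in session_dir.

    Assumption: the next session number is one more than the highest
    session number found among existing log files, or 0 if there are none.
    """
    sessions = [
        int(m.group(1))
        for m in (LOG_NAME.fullmatch(n) for n in os.listdir(session_dir))
        if m is not None
    ]
    next_session = max(sessions) + 1 if sessions else 0
    # A new session always starts at roll-index 0.
    return f"log.{next_session}.0"
```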
Deduplication is available for publishers that send the same messages in the same sequence. With deduplication enabled, RT ensures that only one copy of each message is sent to the consumers, which makes it possible to tolerate the failure of a publisher. The use cases for deduplication include:
- Failure and restart of a single publisher from the last checkpoint. This ensures any messages sent between the last publisher checkpoint and a failover are not passed down the stream twice.
- Multi-node publishing of the same messages for high availability. Only one copy of each message is sent down the stream, and one of the publishers can fail without data loss or duplication.
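The effect of deduplication in both use cases can be modeled with a short Python sketch. This is a conceptual illustration only: it assumes each message carries its position in the shared sequence and that publishers of the same data share an identifier (`dedup_id` here). Neither that key nor the `Deduplicator` class reflects how RT implements deduplication internally.

```python
from collections import defaultdict


class Deduplicator:
    """Forwards each position of a shared message sequence at most once."""

    def __init__(self):
        # Highest sequence position already forwarded, per dedup id (hypothetical key).
        self._high_water = defaultdict(lambda: -1)
        # Messages passed down the stream, in order.
        self.forwarded = []

    def offer(self, dedup_id, position, message):
        """Forward the message if its position is new; return True if forwarded."""
        if position <= self._high_water[dedup_id]:
            return False  # already sent by this or another publisher
        self._high_water[dedup_id] = position
        self.forwarded.append(message)
        return True
```

Under this model, a publisher that restarts from an earlier checkpoint, or a second publisher sending the same sequence, re-offers positions that have already been forwarded, and those offers are dropped.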
Garbage collection of publisher logs
Publisher log files are automatically garbage collected when both of the following conditions are met:
- The publisher's log file has rolled to a new roll-index.
- The rolled log, for example log.0.0, has been replicated to all the RT pods and merged.
At this point log.0.0 is garbage collected on each of the RT pods. An archived flag is then propagated back to the publisher so that its local log.0.0 is also garbage collected.
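The two conditions above amount to a simple predicate, sketched below in Python. The `PublisherLog` record and the pod names used with it are hypothetical; the sketch only encodes the rule stated in the text, not RT's internal bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class PublisherLog:
    name: str
    # True once the publisher has moved on to a newer roll-index file.
    rolled: bool = False
    # RT pods that have replicated and merged this file.
    merged_on: set = field(default_factory=set)


def can_garbage_collect(log: PublisherLog, rt_pods: set) -> bool:
    """A log is collectable once it has rolled and every RT pod has merged it."""
    return log.rolled and rt_pods <= log.merged_on
```

A log that is still being written to, or one that a single RT pod has yet to merge, is kept.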
Publisher logs upon publisher shutdown and restart
Before a publisher can shut down, the RT log file that was being written to must be fully replicated from the publisher node to the RT cluster. This ensures that no messages are lost. When the publisher is first started, a number of log files are created. These log files are described in more detail on the About page.
[nobody@rt-pub-0 /]# ls -al
total 84
drwxr--r-- 2 nobody nobody  4096 Dec 13 17:44 .
drwxr-xr-x 5 nobody nobody  4096 Dec 13 17:44 ..
-rw-r--r-- 1 nobody nobody     8 Dec 13 17:44 .session
-rw-r--r-- 1 nobody nobody 67287 Dec 13 17:45 log.0.0
-rw-r--r-- 1 nobody nobody     8 Dec 13 17:44 state.0
Upon restarting, the publisher no longer writes to log.0.0. Instead, a new log is created for the publisher to write to (log.1.0 in the listing below). The sticky bit is set on the previous log, log.0.0 (shown as T in the permissions column), indicating that it has been garbage collected.
[nobody@rt-pub-0 /]# ls -al
total 60
drwxr--r-- 2 nobody nobody  4096 Dec 13 17:47 .
drwxr-xr-x 5 nobody nobody  4096 Dec 13 17:44 ..
-rw-r--r-- 1 nobody nobody     8 Dec 13 17:47 .session
-rw-r--r-T 1 nobody nobody     0 Dec 13 17:47 log.0.0
-rw-r--r-- 1 nobody nobody 44603 Dec 13 17:48 log.1.0
-rw-r--r-T 1 nobody nobody     0 Dec 13 17:47 state.0
-rw-r--r-- 1 nobody nobody    32 Dec 13 17:47 state.1
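The sticky-bit marker seen in the listing can also be checked programmatically. The Python sketch below uses only the standard `os` and `stat` modules; the function names are hypothetical, and treating the sticky bit as "garbage collected" follows the description above and is not a documented SDK interface.

```python
import os
import stat


def is_marked_collected(mode: int) -> bool:
    """True for a regular file whose sticky bit is set (the T in -rw-r--r-T)."""
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISVTX)


def collected_files(session_dir: str):
    """Return the sorted names of files in session_dir that carry the sticky bit."""
    names = []
    for name in sorted(os.listdir(session_dir)):
        mode = os.stat(os.path.join(session_dir, name)).st_mode
        if is_marked_collected(mode):
            names.append(name)
    return names
```

Run against the session directory from the second listing, `collected_files` would be expected to report log.0.0 and state.0, the two entries displayed with a T.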