Data Access Processes Introduction
kdb Insights Data Access Processes (DAPs) provide read-only access to all data stored in a database, regardless of where the data currently resides in the data lifecycle.
For example, when data is submitted to the internal messaging fabric it is considered “in the database”, and should immediately become accessible to queries (potentially barring short-lived race conditions). This component will provide that access.
As data ages, its storage location changes. For example, data may migrate to a short-term storage cache, then to longer-term storage (similar to an on-disk HDB), and perhaps into even deeper storage (such as compressed data on slower disks). The Data Access component will provide continual access to this data as well.
In short, the component is responsible for ensuring that all data is always available to support incoming queries.
All queries presented to the component are assumed to have been routed via a gateway component, either the kdb Insights Service Gateway or something custom.
The component expects to receive notifications from the component that holds responsibility for writedown of data (whether that is the kdb Insights Storage Manager or a custom equivalent), and reacts to them by adjusting the responsibilities for data access within its suite of processes.
It is similarly responsible for communicating with a type of gateway component to ensure that a current view of the data access topology is maintained to allow for intelligent routing and target selection decisions.
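The interaction described above can be illustrated with a minimal Python sketch. It is not the kdb Insights implementation: the class `AccessTopology`, the tier labels `recent` and `historical`, the method `on_writedown`, and the `notify_gateway` callback are all hypothetical names chosen for illustration. The sketch assumes a writedown notification carries a single timestamp marking how far data has been persisted.

```python
from datetime import datetime
from typing import Callable, Dict, Optional, Tuple

# A half-open time window [start, end); None means unbounded on that side.
Window = Tuple[Optional[datetime], Optional[datetime]]


class AccessTopology:
    """Hypothetical record of which tier answers for which time window."""

    def __init__(self, notify_gateway: Callable[[Dict[str, Window]], None]) -> None:
        # Boundary between persisted data and newer data; None before any writedown.
        self._boundary: Optional[datetime] = None
        self._notify_gateway = notify_gateway

    def view(self) -> Dict[str, Window]:
        """Return the current tier-to-window mapping."""
        if self._boundary is None:
            # Nothing has been written down yet, so one tier serves everything.
            return {"recent": (None, None)}
        return {
            "historical": (None, self._boundary),
            "recent": (self._boundary, None),
        }

    def on_writedown(self, written_up_to: datetime) -> None:
        """React to a (hypothetical) writedown notification."""
        # Ignore stale or duplicate notifications.
        if self._boundary is not None and written_up_to <= self._boundary:
            return
        self._boundary = written_up_to
        # Publish the new view so the gateway can keep routing correctly.
        self._notify_gateway(self.view())


published = []
topology = AccessTopology(published.append)
topology.on_writedown(datetime(2024, 1, 2))
```

In this sketch the gateway is just a callback that collects each published view; a real deployment would send the view over whatever channel the gateway exposes.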
It can be thought of as replacing the RDB, IDB, HDB processes of a “traditional” kdb+ architecture. However, this component differs from those traditional processes in some crucial ways:
- All processes share a common code base, regardless of where the data resides in the underlying database. To the extent that a particular instance may be thought of as functioning as an RDB or HDB, the distinction is one established via configuration and start-up context.
- All processes present a common view of the data. In general, any query that can run on one process can run on any other.
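The two points above can be sketched in Python as a single class whose role is set only by its start-up configuration. This is an illustration under stated assumptions, not the actual process code: `ProcessConfig`, `AccessProcess`, the `tier` labels, and the `select` method are hypothetical, and the tables are plain in-memory lists.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class ProcessConfig:
    """Start-up context for one process (hypothetical fields)."""
    name: str
    tier: str  # illustrative label such as "recent" or "historical"


class AccessProcess:
    """One code base for every tier; only the config and the data differ."""

    def __init__(self, config: ProcessConfig, tables: Dict[str, List[Row]]) -> None:
        self.config = config
        self._tables = tables

    def select(self, table: str, column: str, value: Any) -> List[Row]:
        """Return the rows of `table` whose `column` equals `value`."""
        return [row for row in self._tables.get(table, []) if row.get(column) == value]


# Two instances of the same class, distinguished only by configuration.
recent = AccessProcess(
    ProcessConfig("dap-1", "recent"),
    {"trade": [{"sym": "A", "price": 10.0}]},
)
historical = AccessProcess(
    ProcessConfig("dap-2", "historical"),
    {"trade": [{"sym": "A", "price": 9.5}, {"sym": "B", "price": 20.0}]},
)

# The same query runs unchanged on either instance.
results = {p.config.tier: p.select("trade", "sym", "A") for p in (recent, historical)}
```

Because both instances expose the same `select` interface, a caller does not need to know which tier it is addressing.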
In addition to the processes responsible for providing data access, this component will also be responsible for defining and implementing one or more general, external query APIs (or services). An example is a general API whose arguments describe the table from which to retrieve data, the IDs or other attributes used to select the data together with the column names to which they apply, and any additional aggregation steps to apply to the selected data. Currently, the component supports a simple generic retrieval API with basic aggregation. This will be extended in future releases, along with the addition of other APIs.
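To make the shape of such a generic retrieval call concrete, here is a minimal Python sketch. It does not reproduce the real API: `RetrievalRequest`, `retrieve`, the field names, and the set of aggregation names are assumptions made for illustration, and the database is a plain dictionary of in-memory rows.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Union

Row = Dict[str, Any]

# Hypothetical set of supported aggregations, keyed by name.
AGGREGATIONS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": lambda values: sum(values) / len(values),
}


@dataclass
class RetrievalRequest:
    """Arguments of a generic retrieval call (illustrative only)."""
    table: str
    # Column name -> accepted values, e.g. {"sym": ["A"]}.
    filters: Dict[str, List[Any]] = field(default_factory=dict)
    # Column name -> aggregation name, e.g. {"price": "avg"}.
    aggregations: Dict[str, str] = field(default_factory=dict)


def retrieve(database: Dict[str, List[Row]], request: RetrievalRequest) -> Union[List[Row], Row]:
    """Select rows matching every filter, then optionally aggregate them."""
    selected = [
        row
        for row in database[request.table]
        if all(row[column] in accepted for column, accepted in request.filters.items())
    ]
    if not request.aggregations:
        return selected
    if not selected:
        # Nothing to aggregate; report each requested column as missing.
        return {column: None for column in request.aggregations}
    return {
        column: AGGREGATIONS[name]([row[column] for row in selected])
        for column, name in request.aggregations.items()
    }


database = {
    "trade": [
        {"sym": "A", "price": 10.0, "size": 1},
        {"sym": "B", "price": 20.0, "size": 2},
        {"sym": "A", "price": 30.0, "size": 3},
    ]
}
request = RetrievalRequest("trade", {"sym": ["A"]}, {"price": "avg", "size": "sum"})
summary = retrieve(database, request)
```

Keeping the request as plain data (table, filters, aggregations) means the same call can be forwarded to any process that holds the relevant data, which matches the common-view property described earlier.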
The component will support a mechanism for the introduction of more specific queries (e.g. application-specific queries that depend on a particular data model), but currently does not.