Metrics Reference

This page lists the metrics generated by the Platform. See Metrics for details on how metrics are captured, stored and presented. Each component generates a range of metrics. With metrics enabled on a component, kdb info, memory and CPU metrics are generated.

kxi-controller

Each call to a kxi-controller API endpoint generates metrics.

schema

Relates to the management of schemas from the UI. Available metrics are:

metric name type description
kxi_kxic_schema_list_histogram_seconds histogram Count and time taken to list schemas
kxi_kxic_schema_list_failure_total counter Number of failed calls to list schemas
kxi_kxic_schema_create_histogram_seconds histogram Count and time taken to create schemas
kxi_kxic_schema_create_failure_total counter Number of failed calls to create schemas
kxi_kxic_schema_get_histogram_seconds histogram Count and time taken to get a schema by ID
kxi_kxic_schema_get_failure_total counter Number of failed calls to get a schema by ID
kxi_kxic_schema_update_histogram_seconds histogram Count and time taken to update schemas
kxi_kxic_schema_update_failure_total counter Number of failed calls to update schemas
kxi_kxic_schema_delete_histogram_seconds histogram Count and time taken to delete schemas
kxi_kxic_schema_delete_failure_total counter Number of failed calls to delete schemas
kxi_kxic_schema_count gauge Number of existing schemas
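
Each operation above pairs a histogram with a failure counter, so a failure ratio and a mean call time can be derived from them. The sketch below is illustrative only. It assumes the histograms are exposed in the Prometheus style, with `_count` and `_sum` series alongside the buckets, and that the histogram count includes failed calls; neither point is stated on this page. The `samples` dictionary and both function names are hypothetical.

```python
from typing import Mapping, Optional


def failure_ratio(samples: Mapping[str, float], base: str) -> Optional[float]:
    """Failed calls divided by all calls for one operation.

    `base` is the metric prefix, for example "kxi_kxic_schema_list".
    Assumes a Prometheus-style `<histogram>_count` series exists.
    """
    calls = samples.get(f"{base}_histogram_seconds_count", 0.0)
    failures = samples.get(f"{base}_failure_total", 0.0)
    if calls == 0:
        return None  # no calls recorded, so the ratio is undefined
    return failures / calls


def mean_call_seconds(samples: Mapping[str, float], base: str) -> Optional[float]:
    """Average call duration, from the assumed `_sum` and `_count` series."""
    calls = samples.get(f"{base}_histogram_seconds_count", 0.0)
    total = samples.get(f"{base}_histogram_seconds_sum", 0.0)
    if calls == 0:
        return None
    return total / calls


# Made-up values, for illustration only.
example = {
    "kxi_kxic_schema_list_histogram_seconds_count": 40.0,
    "kxi_kxic_schema_list_histogram_seconds_sum": 2.0,
    "kxi_kxic_schema_list_failure_total": 4.0,
}
print(failure_ratio(example, "kxi_kxic_schema_list"))
print(mean_call_seconds(example, "kxi_kxic_schema_list"))
```

The same helpers apply to any histogram and counter pair on this page that shares a prefix, such as `kxi_sd_register`.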

database

Relates to the management of databases from the UI. Available metrics are:

metric name type description
kxi_kxic_db_list_histogram_seconds histogram Count and time taken to list databases
kxi_kxic_db_list_failure_total counter Number of failed calls to list databases
kxi_kxic_db_create_histogram_seconds histogram Count and time taken to create databases
kxi_kxic_db_create_failure_total counter Number of failed calls to create databases
kxi_kxic_db_get_histogram_seconds histogram Count and time taken to get databases by ID
kxi_kxic_db_get_failure_total counter Number of failed calls to get databases by ID
kxi_kxic_db_update_histogram_seconds histogram Count and time taken to update databases
kxi_kxic_db_update_failure_total counter Number of failed calls to update databases
kxi_kxic_db_delete_histogram_seconds histogram Count and time taken to delete databases
kxi_kxic_db_delete_failure_total counter Number of failed calls to delete databases
kxi_kxic_db_count gauge Number of existing databases

pipeline

Relates to the management of pipelines from the UI. Available metrics are:

metric name type description
kxi_kxic_pipeline_list_histogram_seconds histogram Count and time taken to list pipelines
kxi_kxic_pipeline_list_failure_total counter Number of failed calls to list pipelines
kxi_kxic_pipeline_create_histogram_seconds histogram Count and time taken to create pipelines
kxi_kxic_pipeline_create_failure_total counter Number of failed calls to create pipelines
kxi_kxic_pipeline_get_histogram_seconds histogram Count and time taken to get pipelines by ID
kxi_kxic_pipeline_get_failure_total counter Number of failed calls to get pipelines by ID
kxi_kxic_pipeline_update_histogram_seconds histogram Count and time taken to update pipelines
kxi_kxic_pipeline_update_failure_total counter Number of failed calls to update pipelines
kxi_kxic_pipeline_delete_histogram_seconds histogram Count and time taken to delete pipelines
kxi_kxic_pipeline_delete_failure_total counter Number of failed calls to delete pipelines
kxi_kxic_pipeline_count gauge Number of existing pipelines

stream

Relates to the management of streams from the UI. Available metrics are:

metric name type description
kxi_kxic_stream_list_histogram_seconds histogram Count and time taken to list streams
kxi_kxic_stream_list_failure_total counter Number of failed calls to list streams
kxi_kxic_stream_create_histogram_seconds histogram Count and time taken to create streams
kxi_kxic_stream_create_failure_total counter Number of failed calls to create streams
kxi_kxic_stream_get_histogram_seconds histogram Count and time taken to get streams by ID
kxi_kxic_stream_get_failure_total counter Number of failed calls to get streams by ID
kxi_kxic_stream_update_histogram_seconds histogram Count and time taken to update streams
kxi_kxic_stream_update_failure_total counter Number of failed calls to update streams
kxi_kxic_stream_delete_histogram_seconds histogram Count and time taken to delete streams
kxi_kxic_stream_delete_failure_total counter Number of failed calls to delete streams
kxi_kxic_streams_count gauge Number of existing streams

assembly

Relates to the management of assemblies from the UI. Available metrics are:

metric name type description
kxi_kxic_assembly_list_histogram_seconds histogram Count and time taken to list assemblies
kxi_kxic_assembly_list_failure_total counter Number of failed calls to list assemblies
kxi_kxic_assembly_create_histogram_seconds histogram Count and time taken to create assemblies
kxi_kxic_assembly_create_failure_total counter Number of failed calls to create assemblies
kxi_kxic_assembly_get_histogram_seconds histogram Count and time taken to get assemblies
kxi_kxic_assembly_get_failure_total counter Number of failed calls to get assemblies
kxi_kxic_assembly_update_histogram_seconds histogram Count and time taken to update assemblies
kxi_kxic_assembly_update_failure_total counter Number of failed calls to update assemblies
kxi_kxic_assembly_delete_histogram_seconds histogram Count and time taken to delete assemblies
kxi_kxic_assembly_delete_failure_total counter Number of failed calls to delete assemblies
kxi_kxic_assembly_deploy_histogram_seconds histogram Count and time taken to deploy assemblies
kxi_kxic_assembly_deploy_failure_total counter Number of failed calls to deploy assemblies
kxi_kxic_assembly_export_histogram_seconds histogram Count and time taken to export assemblies
kxi_kxic_assembly_export_failure_total counter Number of failed calls to export assemblies
kxi_kxic_assembly_teardown_histogram_seconds histogram Count and time taken to tear down assemblies
kxi_kxic_assembly_teardown_failure_total counter Number of failed calls to tear down assemblies
kxi_kxic_assembly_count gauge Number of existing assemblies
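
The kxi-controller metric names in the tables above follow a regular pattern: `kxi_kxic_<resource>_<operation>_histogram_seconds`, `kxi_kxic_<resource>_<operation>_failure_total`, and one count gauge per resource. As a convenience for building dashboards or alert rules, the names can be generated rather than typed out. This is a minimal sketch; the function and constant names are hypothetical, and the lists simply mirror the tables on this page.

```python
CRUD = ["list", "create", "get", "update", "delete"]

# Operations per resource, as listed in the tables above.
OPERATIONS = {
    "schema": CRUD,
    "db": CRUD,
    "pipeline": CRUD,
    "stream": CRUD,
    "assembly": CRUD + ["deploy", "export", "teardown"],
}

# The stream gauge is documented with a plural prefix.
COUNT_GAUGES = {"stream": "kxi_kxic_streams_count"}


def controller_metric_names(resource: str) -> list[str]:
    """Return the documented metric names for one kxi-controller resource."""
    names = []
    for op in OPERATIONS[resource]:
        names.append(f"kxi_kxic_{resource}_{op}_histogram_seconds")
        names.append(f"kxi_kxic_{resource}_{op}_failure_total")
    names.append(COUNT_GAUGES.get(resource, f"kxi_kxic_{resource}_count"))
    return names


for name in controller_metric_names("assembly"):
    print(name)
```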

discovery-proxy

Each call to a discovery-proxy API endpoint generates metrics.

metric name type description
kxi_sd_register_histogram_seconds histogram Count and time taken to register services with Discovery
kxi_sd_register_failure_total counter Number of failed calls to register services with Discovery
kxi_sd_updateDetails_histogram_seconds histogram Count and time taken to update service details with the registry
kxi_sd_updateDetails_failure_total counter Number of failed calls to update service details with the registry
kxi_sd_getServices_histogram_seconds histogram Count and time taken to get the latest services from the registry
kxi_sd_getServices_failure_total counter Number of failed calls to get the latest services from the registry
kxi_sd_heartbeat_histogram_seconds histogram Count and time taken to heartbeat with the registry
kxi_sd_heartbeat_failure_total counter Number of failed calls to heartbeat with the registry
kxi_sd_updateStatus_histogram_seconds histogram Count and time taken to update status with the registry
kxi_sd_updateStatus_failure_total counter Number of failed calls to update status with the registry
kxi_sd_deregister_histogram_seconds histogram Count and time taken to deregister from the registry
kxi_sd_deregister_failure_total counter Number of failed calls to deregister from the registry
kxi_sd_alive_histogram_seconds histogram Count and time taken to check the proxy process is responsive
kxi_sd_alive_failure_total counter Number of failed calls to check the proxy process is responsive
kxi_sd_ready_histogram_seconds histogram Count and time taken to check the proxy process is ready
kxi_sd_ready_failure_total counter Number of failed calls to check the proxy process is ready

kdb Insights Operator

kdb Insights Operator generates metrics on each attempt to interact with the SP Coordinator or Keycloak instance of each namespace.

metric name type description
kxi_operator_keycloak_errors_total counter Number of failed requests to Keycloak
kxi_operator_keycloak_request_seconds histogram Count and time taken to make requests to Keycloak
kxi_operator_pipeline_errors_total counter Number of failed calls to the SP Coordinator
kxi_operator_pipeline_request_seconds histogram Count and time taken to make requests to the SP Coordinator

information service

Each call to an information service API endpoint generates metrics.

metric name type description
kxi_info_details_histogram_seconds histogram Count and time taken to get details for a specific client ID
kxi_info_details_failure_total counter Number of failed calls to get details for a specific client ID

client-controller

Each call to a client-controller API endpoint generates metrics.

metric name type description
kxi_com_kx_cc_enrol_histogram_seconds histogram Count and time taken to enroll clients
kxi_com_kx_cc_enrol_failure_total counter Number of failed calls to enroll clients
kxi_com_kx_cc_leave_histogram_seconds histogram Count and time taken to remove clients
kxi_com_kx_cc_leave_failure_total counter Number of failed calls to remove clients

reliable transport

The Reliable Transport, also known as a stream, publishes status and performance metrics by default. These can be disabled by setting the environment variable RT_EXPORT_METRICS="0".

metric name type description component node
kxi_rt_seq_leader gauge Leadership status of node sequencer all
kxi_rt_in_bytes_total counter Count of input bytes sequenced sequencer leader
kxi_rt_in_messages_total counter Count of input messages sequenced sequencer leader
kxi_rt_in_bytes counter Count of input bytes sequenced per directory sequencer leader
kxi_rt_in_messages counter Count of input messages sequenced per directory sequencer leader
kxi_rt_out_bytes_total counter Count of bytes merged merger all
kxi_rt_out_messages_total counter Count of messages merged merger all
kxi_out_bytes counter Count of bytes merged per directory merger all
kxi_out_messages counter Count of messages merged per directory merger all
kxi_rt_merge_queue_size gauge Merge instructions waiting in queue merger all

Note

  • As well as exporting the total number of messages and bytes transferred (*_total metrics), RT exports the number of messages and bytes transferred per publisher, using a directory label set to topicname.hostname.

  • The sequencer leader metric is set on restart or leader change, so it is always up to date. The other sequencer metrics are updated every second, but only on the leader node; ignore them on all other nodes.

  • The merger metrics are all updated every second for all nodes.
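
The notes above imply that sequencer input metrics should only be trusted when they come from the current leader. A consumer could apply that rule as in the sketch below. It assumes that `kxi_rt_seq_leader` has the value 1 on the leader and 0 elsewhere, which this page does not state, and it uses a hypothetical sample shape (a dictionary with `name`, `node` and `value` keys).

```python
# Sequencer metrics that the table marks as valid on the leader node only.
LEADER_ONLY = {
    "kxi_rt_in_bytes_total",
    "kxi_rt_in_messages_total",
    "kxi_rt_in_bytes",
    "kxi_rt_in_messages",
}


def usable_samples(samples: list[dict]) -> list[dict]:
    """Drop leader-only sequencer samples reported by non-leader nodes."""
    # Assumption: a gauge value of 1 marks the current leader.
    leaders = {
        s["node"]
        for s in samples
        if s["name"] == "kxi_rt_seq_leader" and s["value"] == 1
    }
    return [
        s for s in samples
        if s["name"] not in LEADER_ONLY or s["node"] in leaders
    ]


# Made-up samples: node "0" is the leader, node "1" is a follower.
scraped = [
    {"name": "kxi_rt_seq_leader", "node": "0", "value": 1},
    {"name": "kxi_rt_seq_leader", "node": "1", "value": 0},
    {"name": "kxi_rt_in_bytes_total", "node": "0", "value": 500},
    {"name": "kxi_rt_in_bytes_total", "node": "1", "value": 20},
    {"name": "kxi_rt_out_bytes_total", "node": "1", "value": 480},
]
for sample in usable_samples(scraped):
    print(sample["name"], sample["node"], sample["value"])
```

Merger metrics pass through unchanged because they are updated on all nodes.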

Labels

All metrics include the following labels:

label description example
ha_type HA configuration 3-node
raft_node_index The node index from the hostname 0

Metrics defined for each individual publisher also include the following labels:

label description example
directory Name of the directory topicname.hostname
dedup_stream Name of the topic, if the input stream is being deduplicated topicname
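
Because the `directory` label combines the topic name and hostname as `topicname.hostname`, a consumer may want to split it back into its parts. The following is a minimal sketch with a hypothetical function name. It assumes the topic name itself contains no dots, so the first dot is the separator; if topic names can contain dots, this split is not reliable.

```python
def split_directory(directory: str) -> tuple[str, str]:
    """Split a `directory` label value into (topic name, hostname).

    Assumes the topic name has no dots; the hostname may contain them.
    """
    topic, sep, host = directory.partition(".")
    if not sep or not topic or not host:
        raise ValueError(f"expected 'topicname.hostname', got {directory!r}")
    return topic, host


# Made-up label value, for illustration only.
print(split_directory("trades.publisher-0"))
```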