Sizing Guidance - User Node Pool

The 'User Node Pool' on your Azure AKS cluster is the powerhouse for your data capture, processing, and querying. The Reference Lookup below provides a quick guideline on initial sizing until a system's exact usage profile is established.

Use-cases

The following are some specific use cases. For variations see Reference Lookup.

| persona | description | suggested 'user node pool' |
| --- | --- | --- |
| Data Scientist | Expects to work with datasets of up to 10 million records per day (4 GiB / day) using queries of Moderate complexity. | 3 x Standard_D8s_v5 |
| Data Engineer | Expects to connect real-time financial datasets of up to 4 billion records per day (600 GiB / day). Streaming logic of Medium Memory Usage will complement Complex queries. | 4 x Standard_D64ds_v5 |

Reference Lookup

With reference to the definitions of Query Complexity and Streaming Logic below, the following table provides guidance on User Node Pool sizes for data volumes up to the GiB / day figure listed in each column header. A small lookup sketch follows the table.

| query complexity | streaming logic | 10 GiB / day | 30 GiB / day | 750 GiB / day | 2000 GiB / day | 3000 GiB / day | 4000 GiB / day |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Simple | Low Memory Usage | 4 x 16 | 3 x 32 | 3 x 128 | 3 x 256 | 3 x 384 | 3 x 512 |
| Simple | Medium Memory Usage | 4 x 16 | 3 x 32 | 3 x 128 | 3 x 256 | 3 x 384 | 3 x 512 |
| Simple | High Memory Usage | 5 x 16 | 4 x 32 | 4 x 128 | 4 x 256 | 4 x 384 | 4 x 512 |
| Moderate | Low Memory Usage | 4 x 16 | 3 x 32 | 3 x 128 | 3 x 256 | 3 x 384 | 3 x 512 |
| Moderate | Medium Memory Usage | 4 x 16 | 4 x 32 | 4 x 128 | 4 x 256 | 4 x 384 | 4 x 512 |
| Moderate | High Memory Usage | 5 x 16 | 4 x 32 | 4 x 128 | 4 x 256 | 4 x 384 | 4 x 512 |
| Complex | Low Memory Usage | 4 x 32 | 3 x 64 | 4 x 256 | 4 x 384 | 4 x 512 | 4 x 672 |
| Complex | Medium Memory Usage | 4 x 32 | 4 x 64 | 4 x 256 | 4 x 384 | 4 x 512 | 4 x 672 |
| Complex | High Memory Usage | 4 x 32 | 4 x 64 | 4 x 256 | 4 x 384 | 4 x 512 | 4 x 672 |
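
The table can also be applied programmatically. The following is a minimal sketch, assuming the 'N x M' entries denote node count x GiB of memory per node (consistent with the node sizes in the FAQ table below); only a few rows are transcribed for illustration, and the function name is illustrative rather than part of the product.

```python
# Minimal sketch: look up a user node pool size from the Reference Lookup table.
# Assumption: "N x M" entries mean N nodes with M GiB of memory each.
# Only a subset of rows is transcribed here - fill in the rest from the table above.

DAILY_TIERS_GIB = [10, 30, 750, 2000, 3000, 4000]   # column headers (GiB / day)

SIZING = {
    ("Simple",   "Low Memory Usage"):    [(4, 16), (3, 32), (3, 128), (3, 256), (3, 384), (3, 512)],
    ("Moderate", "Medium Memory Usage"): [(4, 16), (4, 32), (4, 128), (4, 256), (4, 384), (4, 512)],
    ("Complex",  "High Memory Usage"):   [(4, 32), (4, 64), (4, 256), (4, 384), (4, 512), (4, 672)],
}

def user_node_pool(complexity: str, streaming: str, gib_per_day: float) -> str:
    """Return the smallest recommended pool whose daily-volume tier covers gib_per_day."""
    for tier, (nodes, mem_gib) in zip(DAILY_TIERS_GIB, SIZING[(complexity, streaming)]):
        if gib_per_day <= tier:
            return f"{nodes} x {mem_gib} GiB nodes"
    raise ValueError("volume exceeds the guidance table; size the system individually")

print(user_node_pool("Moderate", "Medium Memory Usage", 600))   # -> 4 x 128 GiB nodes
```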

Note: A number of Data Access points are deployed by default. To service additional concurrent queries, these may need to be scaled further.

Query Complexity

| query complexity | description |
| --- | --- |
| Simple | Short time windows (e.g. small result sets); non-complex query logic; quick execution (< 10 ms) |
| Moderate | Large time windows with aggregations (e.g. small result sets); execution time < 1 sec (although < 500 ms should cover most) |
| Complex | Large time windows and/or large datasets; complex query logic; execution time > 1 sec |

Streaming Logic

| streaming logic | description |
| --- | --- |
| Low Memory Usage | In-flight calculations; storage only; decoding of file format for ingestion and storage |
| Medium Memory Usage | Transformations: simple aggregations and time bucketing |
| High Memory Usage | Complex data joins over significant time periods; in-flight actions (ML, AI); OR multiple medium-memory pipelines |

FAQ

How much data do I have?

For the majority of use-cases the amount of data being captured is the biggest factor driving the infrastructure sizing.

The table below provides guidance on data volumes, assuming a 50-column table.

| range | rows / day (realtime) | node size for data capture (GiB) | SKU (excluding local storage) | SKU (including local SSD storage for rook-ceph) |
| --- | --- | --- | --- | --- |
| < 30 GiB / day | 90,000,000 | 32 | Standard_D8s_v5 | rook-ceph not recommended given the additional resource requirement |
| < 75 GiB / day | 200,000,000 | 64 | Standard_D16s_v5 | Standard_D16ds_v5 |
| 75 => 1000 GiB / day | 3,000,000,000 | 128 | Standard_D32s_v5 | Standard_D32ds_v5 |
| 1000 => 2500 GiB / day | 7,000,000,000 | 256 | Standard_E32s_v5 / Standard_D64s_v5 | Standard_E32ds_v5 / Standard_D64ds_v5 |
| 2500 => 3500 GiB / day | 10,000,000,000 | 384 | Standard_E48s_v5 / Standard_D96s_v5 | Standard_E48ds_v5 / Standard_D96ds_v5 |
| 3500 => 5000 GiB / day | 14,000,000,000 | 512 | Standard_E64s_v5 | Standard_E64ds_v5 |

Notes:

  • For sizing purposes the concept of fields is used. Field entries are based on multiplying rows by columns, e.g. 15 fields could be 5 rows x 3 columns or vice versa. For estimation a field size of 8 bytes is used (for variations see https://code.kx.com/q/basics/datatypes/). A short worked example follows these notes.
  • SKUs are for guidance only and may not suit all use-cases, depending on performance, cost, quota, or configuration preferences.
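
As a worked example of the field-based estimate in the note above, the following sketch (the function name is illustrative) converts a row and column count into an approximate daily volume using the 8-byte field size.

```python
# Minimal sketch: estimate daily capture volume as rows x columns x 8 bytes per field.

GIB = 1024 ** 3

def daily_volume_gib(rows_per_day: int, columns: int, bytes_per_field: int = 8) -> float:
    """Approximate captured data volume in GiB per day."""
    return rows_per_day * columns * bytes_per_field / GIB

# 200 million rows/day across a 50-column table, as in the second row of the table above
print(round(daily_volume_gib(200_000_000, 50), 1))   # ~74.5 -> the "< 75 GiB / day" band
```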

What if my requirements change?

Sizing requirements can be adjusted via configuration changes, often with little interruption to your system. Right-sizing and cost optimization are easiest with a predictable usage profile.

What else impacts infrastructure sizing?

Late Data

If your use case involves a considerable amount of late data, this will impact your sizing needs.

vCPU

A node sized for the memory required to capture data usually provides ample vCPU for the associated processing and query workloads; for example, a 128 GiB server will typically include 32 vCPU. A short sketch after the exceptions below illustrates this ratio.

Exceptions to this rule would be:

  1. complex data pipelines - for example, pipelines leveraging multiple workers may need additional vCPU to maximize throughput.
  2. additional shards - where data is split to reduce the maximum memory requirement, this also distributes, and slightly increases, the vCPU burden.
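
As an illustration of the memory-to-vCPU point above, the following sketch estimates the vCPU that typically accompanies a given node memory size. The 4 GiB-per-vCPU (Dsv5) and 8 GiB-per-vCPU (Esv5) ratios reflect the Azure SKU families referenced in this guide; treat them as indicative assumptions rather than a sizing rule.

```python
# Minimal sketch: vCPU implied by a node's memory size, assuming the typical
# memory-to-vCPU ratios of the SKU families referenced in this guide.
GIB_PER_VCPU = {"Dsv5": 4, "Esv5": 8}   # e.g. D32s_v5 = 32 vCPU / 128 GiB, E32s_v5 = 32 vCPU / 256 GiB

def implied_vcpu(node_memory_gib: int, family: str = "Dsv5") -> int:
    """Approximate vCPU count that comes with a node of the given memory size."""
    return node_memory_gib // GIB_PER_VCPU[family]

print(implied_vcpu(128))            # 32 vCPU on a 128 GiB D-series node (the example above)
print(implied_vcpu(256, "Esv5"))    # 32 vCPU on a 256 GiB E-series node
```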

Why do I need 3 nodes?

The resilience model utilized requires at least 3 nodes in this pool (see docs on RT).