Create pipelines for cluster deployments

Cluster deployment system sample

system:
  layout:
    -
      name: primary-server
      nodes:
        -
          host: aaa.host.com
    -
      name: secondary-server
      nodes:
        -
          host: bbb.host.com

  default-cpu-taskset: 0-256

  data-hierarchy:
    - region
    - data-source
    - data-class
    - sub-class

  delta-messaging-server: DS_MESSAGING_SERVER:refinery_a

  timezone: UTC

  time-sort: false

Cluster deployment pipeline sample

  • A pipeline can be configured to split instances of a process across servers.
  • The instances are distinguished by a numbered suffix, e.g. rdb.0 or rdb.1.
  • The YAML below creates two identical pipelines, one on each server.

pipeline:
  name: "DemoPipeline"
  type: "realtime"

  expose-to-gw: true

  proc-layout:
  # Example of how process instances can be split across servers
  #  -
  #    tp.0: primary-server
  #    tp.1: secondary-server
  #    hdb.0: primary-server
  #    hdb.1: secondary-server
    -
      all: primary-server
    -
      all: secondary-server

  taxonomy:
    region: test
    data-source: demo

  processes:
    tp:
      port: 34231
      pub-mode: timer
      pub-freq-ms: 100
      log-to-journal: true
      rollover-mode: daily-at-time
      rollover-time: "00:00:00.001"
      enable-analyst: true
      subscribe-from-delta-messaging: true
    rdb:
      port: 34232
      timeout: 30
      enable-analyst: true
    hdb:
      port: 34233
      timeout: 30
      enable-analyst: true

Cluster identification

Clusters are identified by the indentation and the use of - (dash) within the proc-layout section, as seen in the proc-layout of the example above:

proc-layout:
  -
    all: primary-server
  -
    all: secondary-server

The two - (dash) entries mean that two clusters (pipelines) are created. Each entry assigns processes to servers, either all at once with the all key or instance by instance (for example tp.0 and tp.1). The indentation and the - (dash) usage are therefore key to configuring the system to use multiple clusters.
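
As a sketch of the alternative, the commented-out lines in the pipeline sample can be turned into a single proc-layout entry that splits process instances across both servers. Because there is only one - (dash) entry, this would describe one cluster rather than two. The rdb lines are an assumption that follows the same suffix pattern; they do not appear in the original comment.

```yaml
proc-layout:
  # A single - entry: one cluster whose instances are split across servers
  -
    tp.0: primary-server
    tp.1: secondary-server
    # rdb.0 / rdb.1 are assumed here, following the same numbered-suffix pattern
    rdb.0: primary-server
    rdb.1: secondary-server
    hdb.0: primary-server
    hdb.1: secondary-server
```

The server names on the right-hand side must match the name values defined under layout in the system configuration (primary-server and secondary-server in the sample).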

See creating environment based pipelines or process-recovery-guide for more information on using clusters in proc-layouts.