Quick start guide

After installing Refinery 'core', the following configuration is required:

  • The DC server (localhost) available to run pipelines
  • 1 entrypoint pipeline:
    • DefaultEntrypoint - containing 1 Gateway and 1 UDF processor
  • 2 data capture pipelines

To add data capture pipelines, define additional configuration as part of a domain and/or client package layered on top of Refinery 'core'. See Creating Pipelines for more information.

Note

From Refinery 5.0, the Refinery CLI is required to start the application correctly.

Starting all processes

The application start order is:

  • Start Delta Control
    • (Optionally start Delta Control Daemon)
  • Start the core workflows
    • REFINERY_CORE_A and REFINERY_ENTRYPOINT_0_a
    • (Optionally start REFINERY_CORE_B and REFINERY_ENTRYPOINT_0_b for an HA installation)
  • Start the Refinery Process Manager
  • Start the entrypoint pipelines
  • Start the data capture pipelines
# Start Delta Control
refinery application --start-control

# (Optional) Start Delta Control Daemon
refinery application --start-daemon

# Start core workflows
refinery workflow --start-core --environment A

# Start Process Manager
refinery process-manager --start --wait

# Start entrypoint pipeline
refinery pipeline --start DefaultEntrypoint

# Start Gateway client
refinery service-class --start-template refinery-gw-client

# Start capture pipelines
refinery pipeline --start *pipeline-1*,*pipeline-2*
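The start sequence above can be sketched as a single script. This is a sketch only: the refinery subcommands are copied verbatim from the steps above, and by default each command is printed rather than executed (a dry run), so the order can be reviewed without a live system; setting REFINERY_RUN=1 is a hypothetical switch added here to run the commands for real.

```shell
#!/bin/sh
# Sketch of the start order above as one script. By default each command
# is only printed (dry run); set REFINERY_RUN=1 to execute for real.
set -eu

run() {
  echo "+ $*"
  # Execute the command only when explicitly enabled.
  [ "${REFINERY_RUN:-0}" = "1" ] && "$@" || :
}

run refinery application --start-control
run refinery application --start-daemon          # optional daemon
run refinery workflow --start-core --environment A
run refinery process-manager --start --wait
run refinery pipeline --start DefaultEntrypoint
run refinery service-class --start-template refinery-gw-client
```

Because of `set -e`, a real run stops at the first failing step, which keeps later processes from starting against a broken foundation.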

Process Manager

The Process Manager is responsible for running all Refinery processes defined by YAML configuration.

Process names

All processes managed by the Process Manager have a fixed process name to make them easy to identify:

*pipeline*.*pipeline-instance*.*process-type*.*process-instance*
  • Pipeline: The name of the pipeline the process belongs to
  • Pipeline Instance: The instance of the pipeline
    • Used for redundancy
  • Process Type: The type of process (e.g. rdb or hdb)
  • Process Instance: The instance of the specific process type
    • Used for sharding or clustering of processes
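The four-part naming scheme can be split back into its components with standard shell field splitting; the process name below is a hypothetical example, not one defined by Refinery.

```shell
#!/bin/sh
# Split a Process Manager process name of the form
# pipeline.pipeline-instance.process-type.process-instance
# into its four components. The example name is hypothetical.
name="capture-fx.1.rdb.2"

# IFS=. makes read treat dots as field separators for this one command.
IFS=. read -r pipeline pipeline_instance process_type process_instance <<EOF
$name
EOF

echo "pipeline:          $pipeline"
echo "pipeline instance: $pipeline_instance"
echo "process type:      $process_type"
echo "process instance:  $process_instance"
```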

Cluster deployment

Delta Control

On a cluster deployment, Delta Control must be started on each host individually, beginning with the primary host, followed by the subsequent hosts. The DeltaControl.log on the primary host will show the Control process attempting to open a connection to the Control processes on the subsequent hosts.

<->2022.01.12D03:31:25.385 ### aaa.host ### normal ### (6859): Attempting to open connection to DeltaControl on `host`port!(`bbb.host.com;15001i) ###

Once this connection has been established, it will be confirmed in DeltaControl.log on subsequent hosts.

<->2022.01.12D03:31:52.955 ### bbb.host ### normal ### (31509): New list of active peers received from leader on aaa.host.com:15001 at uid 24 ### `aaa.host.com:15001`bbb.host.com:15001
host                     port  handle uid lead dr retry leadconn
----------------------------------------------------------------
aaa.host.com             15001 7      24  1    0        1
bbb.host.com             15001 0      0   0    0        0

You can view DeltaControl.log with the following command:

refinery logs --view --process DeltaControl

Note

The refinery logs command can only find logs for local processes.
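When checking that the cluster peers have connected, the log output can be filtered for the connection messages shown above. In practice the input would come from `refinery logs --view --process DeltaControl`; the sketch below feeds in a sample line from the earlier log excerpt so the filter can be demonstrated offline.

```shell
#!/bin/sh
# Sketch: filter Delta Control log output for peer-connection events.
# The grep pattern matches the log lines shown in the excerpt above.
peer_events() {
  grep 'Attempting to open connection to DeltaControl'
}

# Sample input taken from the log excerpt earlier in this section.
printf '%s\n' \
  '<->2022.01.12D03:31:25.385 ### aaa.host ### normal ### (6859): Attempting to open connection to DeltaControl on `host`port!(`bbb.host.com;15001i) ###' \
  '<->2022.01.12D03:31:52.955 ### bbb.host ### normal ### (31509): New list of active peers received from leader on aaa.host.com:15001 at uid 24 ###' \
| peer_events
```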

Delta Control daemon

The Delta Control daemon must be started on all hosts individually, beginning with the primary host, followed by subsequent hosts. DeltaControl.log on the primary host will confirm logon of a daemon from all hosts.

<->2022.01.12D03:32:10.967 ### aaa.host ### normal ### (6859): Daemon logon from aaa.host.com:0 ### `ip`OS`handle!(`171.25.0.4;`linux;8i)
<->2022.01.12D03:32:16.873 ### aaa.host ### normal ### (6859): Daemon logon from bbb.host.com:0 ### `ip`OS`handle!(`171.25.0.2;`linux;10i)

Process Manager

The Process Manager is started from the primary host only and is shared across hosts. Only one Process Manager is needed for the cluster.

Workflows

All workflows are started from the primary host only.

# Start Core Workflow A
refinery workflow --start-core --environment A

# Start Core Workflow B
refinery workflow --start-core --environment B

The daemon launches the core workflows.

Pipelines

Pipelines can be started from any host. They are launched by the Process Manager.

Full sequence

# Start Delta Control (Run on primary host, then secondary host)
refinery application --start-control

# Start Delta Control Daemon (run on primary host, then secondary host)
refinery application --start-daemon

# Start Process Manager (run on primary host)
refinery process-manager --start --wait

# Start Core Workflow A (run on primary host)
refinery workflow --start-core --environment A

# Start Core Workflow B (run on primary host)
refinery workflow --start-core --environment B

# Start pipelines (run on primary host)
refinery pipeline --start *pipeline-1*,*pipeline-2*

Shutting down all processes

You can shut down the system with the following command:

refinery application --stop-all

Clustered deployment

On a clustered deployment, you must run the shutdown command first on the secondary host, then on the primary host.
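The clustered shutdown order can be sketched as a loop over hosts, secondary first and primary last. The host names reuse the examples from the log excerpts above; running the command remotely over ssh and the REFINERY_RUN=1 switch are assumptions of this sketch, and by default each command is only printed.

```shell
#!/bin/sh
# Sketch: clustered shutdown in the required order, secondary host
# first, primary host last. Dry run by default; set REFINERY_RUN=1
# to execute over ssh on a live cluster.
set -eu

order=""
for host in bbb.host.com aaa.host.com; do   # secondary first, primary last
  order="$order $host"
  echo "+ ssh $host refinery application --stop-all"
  [ "${REFINERY_RUN:-0}" = "1" ] && ssh "$host" refinery application --stop-all || :
done
```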