Refinery data replication features

Refinery has a built-in capability to copy data from one side of a fault-tolerant install to the other. Its purpose is to keep data synchronized between the PROD and DR (a and b) sides of a hot-hot system.

Manual data copy

The Refinery CLI can be used to manage a data copy. To copy the data, the daas_dataMergeDaemon must be running on both the source host (to initiate the copy) and the target hosts (to integrate the data into the HDB).

The CLI command will look up the configured locations of associated pipelines and distribute the data to the relevant merge daemon.

Consider the CLI command below:

refinery datacopy --copy --pipeline myPipeline --instance 0 --date 2000.01.01

This will trigger a copy of myPipeline.0.hdb.x to the hosts of myPipeline.X.hdb.X.

When the datacopy command is called, the relevant data is compressed with tar and copied with rsync to the other hosts. From that point, the copy relies on each target's daas_dataMergeDaemon to integrate the data into the HDB; you can track progress in each target's logs.
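
As a rough illustration of the compress-and-copy step described above, the sketch below packs one date partition with tar and pushes it with rsync. It is not Refinery's implementation: the function names, the archive naming scheme, the HDB path, and the target location are all assumptions made for the example, and it presumes a date-partitioned HDB whose partition directory is named after the date.

```shell
# Illustrative sketch only. Refinery performs the real copy internally;
# every name, path, and host below is a hypothetical placeholder.

# Build an archive name for one date partition (naming scheme is assumed).
archive_name() {
  local pipeline="$1" instance="$2" date="$3"
  printf '%s.%s.%s.tar.gz\n' "$pipeline" "$instance" "$date"
}

# Compress a single date partition directory under an HDB root with tar.
pack_partition() {
  local hdb_root="$1" date="$2" out="$3"
  tar -czf "$out" -C "$hdb_root" "$date"
}

# Copy the archive to a target given as host:/path using rsync over SSH.
# Integration into the target HDB is left to the merge daemon on that host.
ship_archive() {
  local archive="$1" target="$2"
  rsync -a --partial "$archive" "$target"
}
```

A call might look like `pack_partition /data/myPipeline/hdb 2000.01.01 /tmp/myPipeline.0.2000.01.01.tar.gz` followed by `ship_archive /tmp/myPipeline.0.2000.01.01.tar.gz drhost:/data/staging/`, where the paths and the host name are placeholders.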

Note - Out of the box there are only daas_dataMergeDaemon_a and daas_dataMergeDaemon_b, so datacopy is only possible between instances 0 and 1. To add multi-host copy, the merge daemon must be duplicated onto each host.

Note - For a successful data copy, SSH keys must be set up between the hosts and added to each host's authorized keys file.

Automatic data copy

Automatic data copy is not available out of the box due to the complexity of determining which instance has the "golden copy" of the data. However, there is functional support for an automatic copy.

.dataCopy.runAuto[] will copy the previous day's data. An engineer can hook this into the system with .event.addListener[`hdb.reload.complete;`CUSTOM_FUNCTION_NAME], placing logic in the custom function to determine the correct source HDB.

Pre-version 5.x

The datacopy feature behaves slightly differently in legacy versions. Whether the copy runs automatically or manually is configurable. Automatic datacopy is off by default, although this is configured on install. The config .daas.cfg.autoHDBCopy controls this. It is viewable through the data copy dashboard, but is not editable there.

To trigger the HDB copy to the secondary instance after the EOD merge, an additional step is required when performing the task manually:

  • .daas.trth.runhdbCopyToSecondary must be set to true in .daas.cfg.trth.frameworkSettings for this operation to work manually. EOD jobs that run overnight do not require this setting.

Automatic data copy


Automatic data copy is intended to assist recovery and historical data repair after a failover event. Configuring it as a daily job is not recommended because, by default, the copy process permanently overwrites captured data on the destination side.

If automatic synchronization is turned on, it behaves as listed below. The copy direction stays in this configuration until either an administrator forces a data master change or a mirrored failover event occurs.

Scenario                    Copy    Reason
Normal operation            A to B  -
Feed failure on A           A to B  Data outages on A
Tickerplant failure on A    A to B  Potential data outages on A
RDB failure on A            A to B  Data is populated from Tickerplant logs, no data loss
HDB/RTE/other failure on A  A to B  No effect on data capture

Manual data copy


The first tab of the data copy dashboard is the manual data copy control. On the left-hand side are dropdowns to select which copy to perform and a button to execute it. The right-hand side is a status log of the most recent data copy.

Note

Most likely reason for failure: if the data copy is not successful, it could be due to improper setup of scp between the two machines. SSH keys must be set up to allow non-prompted scp between the boxes, and the hostnames of the machines must resolve correctly. The log prints any scp commands being called to allow easy debugging.
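
The two prerequisites in the note above (hostname resolution and non-prompted key-based access) can be checked from the source box before retrying a copy. The helper below is a minimal sketch and not part of Refinery: the function name and messages are hypothetical, and it assumes `getent` and an OpenSSH client are installed.

```shell
# Hypothetical preflight helper, not shipped with Refinery.
# Returns 0 only if the host name resolves and SSH works without a prompt.
check_copy_host() {
  local host="$1"
  # Step 1: the host name must resolve on this machine.
  if ! getent hosts "$host" >/dev/null 2>&1; then
    echo "cannot resolve $host"
    return 1
  fi
  # Step 2: BatchMode=yes makes ssh fail rather than ask for a password,
  # so a success here means key-based login is in place.
  if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "non-interactive ssh to $host failed"
    return 2
  fi
  echo "ok $host"
}
```

Running `check_copy_host <other-host>` on each side, as the user that owns the Refinery processes, should print `ok` for the peer before a copy is expected to succeed.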

The second tab is an overall history of data copy within the system. It shows the start and end times of any manual copies, as well as a broken-down log of automatic data copies.

The penultimate tab shows the configuration of auto copy.

The final tab allows you to specify which direction the automatic copy will go if enabled. Data is copied from the side flagged as isMaster. You can change this by double-clicking the fields in the isMaster column and saving.