Refinitiv data ingestion

Historical

Below are the Thomson Reuters and Refinitiv data ingestion pipelines currently available in the FSI accelerator fsi-data-assembly package. All the Refinitiv pipelines are configurable. For more details on how to configure them, see Pipeline Configuration.

  • Refinitiv Real-Time: Reads and ingests real-time trade and quote data from the Refinitiv RTO market data feed.
  • TRTH Trades: Reads and ingests historical trade data from an exported CSV of Thomson Reuters Tick History (TRTH).
  • TRTH Quotes: Reads and ingests historical quote data from an exported CSV of Thomson Reuters Tick History.
  • TRTH Cancels & Corrections: Reads and ingests historical trade cancellations and corrections from an exported CSV of Thomson Reuters Tick History.

Refinitiv feed handler

For detailed information about the Refinitiv feed handler, please refer to the documentation here.

Pre-requisites

Before using the Refinitiv feed handler, ensure that the rt-refinitiv-pub Helm chart has been pulled down in a local session and that the Kubernetes secret used to log in to Refinitiv has been created.

Once the helm chart is pulled down, use the refinitiv-feed.yaml in the feed folder of the fsi-data-assembly package to deploy the feed:

>helm install rt-refinitiv-pub rt-refinitiv-pub -n YOUR_NAMESPACE -f feed/refinitiv-feed.yaml

Note

Before running this helm install command, make sure to update the sinkName in refinitiv-feed.yaml from rt-fsi-data-assembly-fsi-north to rt-client-pkg-fsi-north.
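The sinkName change described in this note can also be scripted. The following is a minimal sketch, assuming the old sink name appears verbatim in refinitiv-feed.yaml; the helper names update_sink_name and update_feed_file are hypothetical and are not part of the fsi-data-assembly package.

```python
from pathlib import Path

# Sink names taken from the note above.
OLD_SINK = "rt-fsi-data-assembly-fsi-north"
NEW_SINK = "rt-client-pkg-fsi-north"


def update_sink_name(yaml_text: str) -> str:
    """Return the YAML text with the old sink name replaced by the new one."""
    if OLD_SINK not in yaml_text:
        # Fail loudly rather than silently deploying with an unexpected sink.
        raise ValueError(f"{OLD_SINK!r} not found; the file may already be updated")
    return yaml_text.replace(OLD_SINK, NEW_SINK)


def update_feed_file(path: str = "feed/refinitiv-feed.yaml") -> None:
    """Rewrite the feed file in place (hypothetical helper, plain text replace)."""
    feed_file = Path(path)
    feed_file.write_text(update_sink_name(feed_file.read_text()))
```

Calling update_feed_file() from the root of the package before running helm install applies the change. A plain text replacement is used so that the rest of the file, including comments and ordering, is left untouched.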

Real-Time Data

The FSI accelerator's real-time Refinitiv pipeline, called refinitivrealtime, subscribes to a live feed of trade and quote data coming from the Refinitiv feed handler. When deploying the Refinitiv feed handler, it is recommended that the YAML file refinitiv-feed.yaml is used, as its sinkName is pre-configured to publish to the refinitivrealtime pipeline. This YAML file can be found in the feed directory of the fsi-data-assembly package.

The Refinitiv real-time pipeline differs from the other pipelines in the following ways:

  • Unlike other FSI accelerator pipelines, it is not advised to change the reader of the refinitivrealtime pipeline, since it relies on the reader .qsp.read.fromStream to receive a live stream of data from the upstream feed handler.
  • The refinitivrealtime pipeline has two configurable target tables rather than one: .fsi.targetTradeTable and .fsi.targetQuoteTable.
  • The refinitivrealtime pipeline has two sets of column and value mappings to be configured, one set for the target trade table and one for the target quote table.
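To illustrate what "two sets of column and value mappings" means in practice, here is a conceptual sketch in Python. It is not the accelerator's implementation (the pipelines themselves are configured in the package, not in Python). The TableConfig class, the map_record function, and all source field names and KX column names shown are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TableConfig:
    """One target table plus its own column and value mappings (illustrative)."""
    target_table: str                # e.g. the value of .fsi.targetTradeTable
    column_mapping: dict[str, str]   # source field name -> KXColumnName
    # KXColumnName -> {source value: mapped value}
    value_mapping: dict[str, dict] = field(default_factory=dict)


def map_record(record: dict, cfg: TableConfig) -> dict:
    """Rename mapped source fields and translate values; drop unmapped fields."""
    out = {}
    for src_name, kx_name in cfg.column_mapping.items():
        if src_name not in record:
            continue
        value = record[src_name]
        # Fall back to the original value when no value mapping applies.
        out[kx_name] = cfg.value_mapping.get(kx_name, {}).get(value, value)
    return out


# Placeholder configuration: one independent set of mappings per target table.
trade_cfg = TableConfig("Trade", {"LAST_PRICE": "price", "SIDE": "side"},
                        {"side": {"B": "buy", "S": "sell"}})
quote_cfg = TableConfig("Quote", {"BID_PRICE": "bid", "ASK_PRICE": "ask"})
```

The point of the sketch is that the trade and quote mappings are independent: changing a mapping for one target table has no effect on the other.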

TRTH Market Data

The FSI accelerator package has three pipelines for ingesting TRTH (Thomson Reuters Tick History) market data. Each pipeline defaults to reading the source TRTH data from an S3 bucket in CSV format.

  • trthquotes: Ingests historical quote data from TRTH (Thomson Reuters Tick History) via a CSV export. By default, this data is written to the Quote table. The following should be noted about this pipeline (particularly when configuring it):

    • The following column must be present in the source data: Type.
    • The following columns must be present as a KXColumnName in the pipeline's column mapping and as a column in the pipeline's schema: eventTimestamp and gmtOffset.
    • The target table of the pipeline must have an eventTimestamp column.
    • Only rows where Type=Quote in the source data are ingested.
    • The eventTimestamp column is updated based on the gmtOffset column to ensure all timestamps are ingested in GMT.
  • trthtrades: Ingests historical trade data from TRTH via a CSV export. By default, this data is written to the Trade table. The following should be noted about this pipeline (particularly when configuring it):

    • The following columns must be present in the source data: Price and Type.
    • The following columns must be present as a KXColumnName in the pipeline's column mapping and as a column in the pipeline's schema: eventTimestamp and gmtOffset.
    • The target table of the pipeline must have an eventTimestamp column.
    • If the Price column is null (empty) in the source data for a particular row, that row is excluded from ingestion.
    • Only rows where Type=Trade in the source data are ingested.
    • The eventTimestamp column is updated based on the gmtOffset column to ensure all timestamps are ingested in GMT.
  • trthcancor: Ingests real-time and historical data related to trade cancellations and corrections from TRTH using a CSV export. By default, this data is written to the CanCor table. The following should be noted about this pipeline (particularly when configuring it):

    • The following column must be present in the source data: Type.
    • The following columns must be present as a KXColumnName in the pipeline's column mapping and as a column in the pipeline's schema: eventTimestamp, gmtOffset, exchTime and origDate.
    • The target table of the pipeline must have eventTimestamp and exchTime columns.
    • Only rows where Type=Correction in the source data are ingested.
    • The eventTimestamp column is updated based on the gmtOffset column to ensure all timestamps are ingested in GMT.
    • The origDate column is prepended to the exchTime column to form a timestamp. This is done as the Exch Time column in the source data has no date. Once prepended, the exchTime is updated based on the gmtOffset column to ensure all timestamps are ingested in GMT.
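The filtering and timestamp rules listed for the three TRTH pipelines can be summarised in a short Python sketch. It is illustrative only and is not the package's code. It assumes that gmtOffset is a number of hours ahead of GMT and that the source timestamps are in local time, so GMT is obtained by subtracting the offset; verify both assumptions against your TRTH export. For brevity, each row is a dictionary that carries both the source columns (Type, Price) and the mapped columns (eventTimestamp, gmtOffset, exchTime, origDate).

```python
from datetime import date, datetime, time, timedelta

# Row type each pipeline keeps, per the rules listed above.
ROW_TYPE = {"trthquotes": "Quote", "trthtrades": "Trade", "trthcancor": "Correction"}


def to_gmt(local_ts: datetime, gmt_offset_hours: float) -> datetime:
    """ASSUMPTION: local_ts is local time and the offset is hours ahead of GMT."""
    return local_ts - timedelta(hours=gmt_offset_hours)


def combine_exch_time(orig_date: date, exch_time: time) -> datetime:
    """Prepend origDate to exchTime, since the source Exch Time has no date."""
    return datetime.combine(orig_date, exch_time)


def transform(pipeline: str, rows: list[dict]) -> list[dict]:
    """Apply the documented filters and GMT adjustment for one TRTH pipeline."""
    kept = []
    for row in rows:
        # Only rows of the pipeline's own Type are ingested.
        if row.get("Type") != ROW_TYPE[pipeline]:
            continue
        # trthtrades additionally excludes rows with a null (empty) Price.
        if pipeline == "trthtrades" and row.get("Price") in (None, ""):
            continue
        new_row = dict(row)
        offset = float(row["gmtOffset"])
        new_row["eventTimestamp"] = to_gmt(row["eventTimestamp"], offset)
        if pipeline == "trthcancor":
            # Build a full timestamp first, then shift it to GMT.
            full_ts = combine_exch_time(row["origDate"], row["exchTime"])
            new_row["exchTime"] = to_gmt(full_ts, offset)
        kept.append(new_row)
    return kept
```

For example, under these assumptions a Trade row with eventTimestamp 09:30 and gmtOffset -5 is kept by trthtrades with eventTimestamp 14:30, while the same row with an empty Price is dropped.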

Reference Data

The FSI accelerator package has three pipelines for ingesting reference data from Refinitiv DataScope. As with the TRTH market data, each pipeline defaults to reading the source data from an S3 bucket in CSV format.

  • refinstrument: Ingests reference data for financial instruments, including details such as symbol, description, security type, etc. By default, this data is written to the Instrument table.
  • refcorpactions: Ingests corporate actions data, providing information about events such as stock splits, mergers, and acquisitions. By default, this data is written to the CorpActions table.
  • refdividendrecords: Ingests dividend records data, including dividend payments and associated details. By default, this data is written to the DividendRecords table.