Ingest Historic OneTick US Comp Data

This page describes how the otdataloader pipeline ingests historic OneTick US Comp data from an AWS S3 bucket into kdb Insights Enterprise.

Prerequisites

The otdataloader pipeline ingests OneTick data from an AWS S3 bucket managed by KX and OneTick. Confirm that your environment can access this bucket before running the pipeline.

The bucket has the following structure:

$ aws s3 ls s3://${BUCKET}/${PREFIX}/
                           PRE hdb/
                           PRE status/
2026-06-11 17:23:03    1314676 sym

The bucket contains the following directories and files:

  • status/ directory: contains text files that signal when a day of data is ready for ingestion by the otdataloader pipeline.

    • Files follow the naming convention finished_YYYY_MM_DD.txt, where YYYY_MM_DD represents the year, month, and day of the data that is ready to ingest.
    • The otdataloader pipeline monitors this directory and checks for new data every 20 minutes.
  • hdb/ directory: contains OneTick US Comp data in kdb date-partitioned format. The otdataloader ingests this data when triggered by files in the status/ directory.

  • sym file: contains all sym enumerations for the data in the hdb/ directory. The otdataloader copies this file alongside the date directory when triggered by files in the status/ directory.
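The naming convention for status files can be sketched as a pair of small shell helpers. These are hypothetical and not part of the otdataloader pipeline; they assume the date is supplied in YYYY.MM.DD form (the format used by START_DATE) and that the base URI ends with a trailing slash.

```shell
# Hypothetical helpers for illustration only; not part of the otdataloader pipeline.

# Convert a date in YYYY.MM.DD form to the status file name finished_YYYY_MM_DD.txt.
status_file_name() {
    printf 'finished_%s.txt\n' "$(printf '%s' "$1" | tr '.' '_')"
}

# Build the full URI of a status file.
# $1: base URI in the form s3://<BUCKET_NAME>/<PREFIX>/ (trailing slash assumed)
# $2: date in YYYY.MM.DD form
status_file_uri() {
    printf '%sstatus/%s\n' "$1" "$(status_file_name "$2")"
}
```

For example, running aws s3 ls "$(status_file_uri "s3://${BUCKET}/${PREFIX}/" 2026.06.10)" should list the status file if that day of data has been marked as ready.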

Pipeline options

Several environment variables control the behavior of the otdataloader pipeline. Set these at runtime using the CLI. For details, see Inject environment variables.

| Environment Variable | Required | Purpose | Notes |
|---|---|---|---|
| OT_DATA_LOADER_S3_URI | yes | Path to the bucket, managed by KX and OneTick, that contains the OneTick data. | Format: s3://<BUCKET_NAME>/<PREFIX>/. Use the full URI as described in the Prerequisites section. |
| OT_DATA_LOADER_REGION | yes | AWS region of the bucket in OT_DATA_LOADER_S3_URI. | |
| AWS_ACCESS_KEY_ID | yes | Access Key ID that allows the otdataloader to access OT_DATA_LOADER_S3_URI. | |
| AWS_SECRET_ACCESS_KEY | yes | Secret Access Key that allows the otdataloader to access OT_DATA_LOADER_S3_URI. | |
| AWS_SESSION_TOKEN | no | Session token that allows the otdataloader to access OT_DATA_LOADER_S3_URI. | May not be required when using a long-lived AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. |
| START_DATE | no | The date from which the pipeline begins ingesting partitions from OT_DATA_LOADER_S3_URI. | Format: YYYY.MM.DD. |
| DATE_LIMIT | no | Limits how many dates can be ingested in parallel. | Defaults to 1. |
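Because the URI and date formats above are easy to mistype, it may help to check them before deploying. The following is a minimal sketch using hypothetical helper functions; they only check the shape of each value and are not how the pipeline itself validates its input.

```shell
# Hypothetical pre-flight checks; these only test the shape of each value.

# Return 0 if the argument looks like YYYY.MM.DD (digits only, no calendar check).
is_start_date_format() {
    case "$1" in
        [0-9][0-9][0-9][0-9].[0-9][0-9].[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if the argument looks like s3://<BUCKET_NAME>/<PREFIX>/,
# including the trailing slash.
is_s3_uri_format() {
    case "$1" in
        s3://?*/?*/) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_start_date_format "$START_DATE" || echo "START_DATE must be YYYY.MM.DD" reports a value such as 2026-06-11 before it reaches the pipeline.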

Example

The following commands set the environment variables and then deploy the otdataloader pipeline with all options set:

# Set the environment variables used by the pipeline
OT_DATA_LOADER_S3_URI='s3://<BUCKET_NAME>/<PREFIX>/'
OT_DATA_LOADER_REGION=<INSERT_REGION>
START_DATE=<INSERT_START_DATE_FOR_SP_INGESTION_FROM_S3>
DATE_LIMIT=2
AWS_ACCESS_KEY_ID=<INSERT_AWS_ACCESS_KEY_ID>
AWS_SECRET_ACCESS_KEY=<INSERT_AWS_SECRET_ACCESS_KEY>
AWS_SESSION_TOKEN=<INSERT_AWS_SESSION_TOKEN>
# Note: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_SESSION_TOKEN` can be set in one step with `eval "$(aws configure export-credentials --format env)"`

# Command to deploy the SP pipeline
kxi pm deploy fsi-app-ot-uscomp --pipeline otdataloader \
  --env otdataloader:OT_DATA_LOADER_S3_URI="$OT_DATA_LOADER_S3_URI" \
  --env otdataloader:OT_DATA_LOADER_REGION="$OT_DATA_LOADER_REGION" \
  --env otdataloader:AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  --env otdataloader:AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  --env otdataloader:AWS_SESSION_TOKEN="$AWS_SESSION_TOKEN" \
  --env otdataloader:START_DATE="$START_DATE" \
  --env otdataloader:DATE_LIMIT="$DATE_LIMIT"

Omit any --env flags for variables you do not need.
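One way to omit unused flags automatically is to build the argument list in a loop and skip any variable that is unset or empty. The wrapper below is a hypothetical sketch, not part of the kxi CLI. It passes only the flags shown in the example above, and it accepts an optional command prefix such as echo so the resulting command can be inspected without deploying. Note that it also silently skips required variables that are unset, so check those separately.

```shell
# Hypothetical wrapper around the deploy command shown above.
# Any arguments are used as a command prefix, e.g. `echo` for a dry run.
deploy_otdataloader() {
    set -- "$@" kxi pm deploy fsi-app-ot-uscomp --pipeline otdataloader
    for name in OT_DATA_LOADER_S3_URI OT_DATA_LOADER_REGION \
                AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN \
                START_DATE DATE_LIMIT; do
        # Look up the variable called $name; treat unset as empty.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            set -- "$@" --env "otdataloader:${name}=${value}"
        fi
    done
    # Run the assembled command (or print it when the prefix is `echo`).
    "$@"
}
```

Calling deploy_otdataloader echo prints the command, including the credential values, so run the dry run only in a trusted terminal. Calling deploy_otdataloader with no arguments runs the deployment.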
