Build and backtest scalable trading strategies using real-time and historical tick data

This page provides an example of how kdb Insights Enterprise can be used to work with both live and historical data.

kdb Insights Enterprise can be used to stream live tick data alongside historical data to develop interactive analytics, powering faster, better in-the-moment decision-making. This provides the following benefits.

  • Streaming live tick data alongside historical data provides the big picture view of the market which enables you to maximize opportunities.
  • Real-time data enables you to react to market events as they happen to increase market share through efficient execution.
  • Aggregating multiple liquidity sources provides for a better understanding of market depth and trading opportunities.

Financial dataset

The example in this page provides a real-time price feed alongside historical trade data. The real-time price feed calculates two moving averages; a cross between these two moving averages establishes a position (or trade exit) in the market. The following sections guide you through the steps to:

  • Build a database
  • Ingest live data from Apache Kafka
  • Ingest historical data from Amazon S3
  • Query the data
  • Create analytics
  • Create a View

Build a database

  1. Click Create new under Databases in the Quick Actions panel on the Overview page.


  2. In the Create Database dialog:

    • Enter equities as the Database Name.
    • Click Select a Package and Create new package.
    • Enter equities as the Package Name.
    • Click Create.
  3. On the Schema Settings tab click Code View to open the Schema Code View. You can use this to add large schema tables, in JSON format.

  4. Replace the existing code with the following JSON (the equities schema), pasting it into the code editor:

    [
        {
            "columns": [
            {
                "attrDisk": "sorted",
                "attrOrd": "sorted",
                "description": "KXI Assembly Schema Column",
                "name": "timestamp",
                "type": "timestamp",
                "compound": false,
                "attrMem": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "price",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "volume",
                "type": "int",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            }
            ],
            "description": "KXI Assembly Schema",
            "name": "trade",
            "primaryKeys": [],
            "prtnCol": "timestamp",
            "sortColsDisk": [
                "timestamp"
            ],
            "sortColsMem": [],
            "sortColsOrd": [
                "timestamp"
            ],
            "type": "partitioned"
        },
        {
            "columns": [
            {
                "description": "KXI Assembly Schema Column",
                "name": "Date",
                "type": "date",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "attrDisk": "sorted",
                "attrOrd": "sorted",
                "description": "KXI Assembly Schema Column",
                "name": "Time",
                "type": "timestamp",
                "compound": false,
                "attrMem": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "Open",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "High",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "Low",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "Close",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "AdjClose",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "Volume",
                "type": "long",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "AssetCode",
                "type": "symbol",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            }
            ],
            "description": "KXI Assembly Schema",
            "name": "close",
            "primaryKeys": [],
            "prtnCol": "Time",
            "sortColsDisk": [
                "Time"
            ],
            "sortColsMem": [],
            "sortColsOrd": [
                "Time"
            ],
            "type": "splayed"
        },
        {
            "columns": [
            {
                "attrDisk": "sorted",
                "attrOrd": "sorted",
                "description": "KXI Assembly Schema Column",
                "name": "timestamp",
                "type": "timestamp",
                "compound": false,
                "attrMem": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "vwap",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "twap",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "open",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "high",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "low",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            },
            {
                "description": "KXI Assembly Schema Column",
                "name": "close",
                "type": "float",
                "compound": false,
                "attrMem": "",
                "attrOrd": "",
                "attrDisk": "",
                "foreign": ""
            }
            ],
            "description": "KXI Assembly Schema",
            "name": "analytics",
            "primaryKeys": [],
            "prtnCol": "timestamp",
            "sortColsDisk": [
                "timestamp"
            ],
            "sortColsMem": [],
            "sortColsOrd": [
                "timestamp"
            ],
            "type": "partitioned"
        }
    ]
    
  5. Click Apply.

  6. Click Save.
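
For reference, the tables defined by this schema correspond to empty q tables along the lines of the following sketch; kdb Insights Enterprise generates the actual tables from the JSON:

    / sketch: the trade and analytics schemas expressed as empty q tables
    trade:([] timestamp:`timestamp$(); price:`float$(); volume:`int$())
    analytics:([] timestamp:`timestamp$(); vwap:`float$(); twap:`float$();
        open:`float$(); high:`float$(); low:`float$(); close:`float$())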

Ingest live data

The live data feed uses Apache Kafka, an event streaming platform whose data is easily consumed and published by kdb Insights Enterprise.

  1. On the Overview page, choose Import Data under Databases in the Quick Actions panel.


  2. In the Import your data screen select Kafka.


  3. In the Configure Kafka screen set the following:

    Setting Value
    Broker* kafka.trykdb.kx.com:443
    Topic* spx

    Keep the default values for the following settings.

    Setting Value
    Offset* End
    Use TLS Unchecked
    Use Schema Registry Unchecked
  4. Expand the Advanced parameters section and tick Advanced Broker Options.

  5. Click +, under Add an Advanced Configuration, to add the following key-value pairs:

    key value
    sasl.username demo
    sasl.password demo
    sasl.mechanism SCRAM-SHA-512
    security.protocol SASL_SSL
  6. Click Next.

  7. In the Select a decoder screen click JSON.

  8. In the Configure JSON screen click Next, leaving Decode each unchecked.
  9. In the Configure Schema screen:

    • Leave the value of Data Format set to Any

    • Click the Load Schema icon and select the following values

      • Database = equities
      • Table = trade
    • Click Load and then click Next.

  10. In the Configure Writer screen set the following values:

    • Database = equities
    • Table = trade

    Keep the default values for the following settings.

    Setting Value
    Write Direct to HDB Unchecked
    Deduplicate Stream Checked
    Set Timeout Value Unchecked
  11. Click Open Pipeline to display the Create Pipeline dialog.

    • Enter equities-1 as the Pipeline Name
    • Click Select a Package and select equities
    • Click Create
  12. In the pipeline template, click and drag the Map function node into the workspace.

    • Right-click on the join between the Decoder and Transform nodes and click Remove Edge.
    • Insert the Map function between the Decoder and Transform nodes by dragging and connecting the edge points of the Decoder node to the Map node, and the Map node to the Transform node, as shown below.

      Kafka pipeline with the inserted Map function node between the JSON Decoder node and the schema Transform node.

    • Click on the Map function node. In the Q tab of the Configure Map Node panel replace the code with the following (the note after these steps explains why enlist is needed).

      {[data]
          enlist data
      }
      
    • Click Apply to save the details to the node.

  13. Click Save.
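
Why the enlist? The JSON decoder emits each Kafka message as a q dictionary; enlist converts that dictionary into a one-row table that the downstream schema node can process. A minimal sketch (field names follow the trade schema; the values are hypothetical):

    / a decoded message arrives as a dictionary
    msg:`timestamp`price`volume!(2024.01.02D09:30:00.000000000;4770.5;100i)
    enlist msg   / a one-row table with columns timestamp, price, volume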

Ingest historical data

Historical data is kept in object storage, on Amazon S3.

  1. On the Overview page, choose Import Data under Databases in the Quick Actions panel.


  2. In the Import your data screen select Amazon S3.


  3. In the Configure Amazon S3 screen:

    • Complete the properties:

      Setting Value
      S3 URI* s3://kxs-prd-cxt-twg-roinsightsdemo/close_spx.csv
      Region* eu-west-1
      Tenant kxinsights

      Keep the default values for the following settings.

      Setting Value
      File Mode* Binary
      Offset* 0
      Chunking* Auto
      Chunk Size* 1MB
      Use Watching Unchecked
      Use Authentication Unchecked
    • Click Next

  4. In the Select a decoder screen click CSV and keep the default settings.

  5. In the Configure CSV screen keep all the default values and click Next.
  6. In the Configure Schema screen:

    • Leave Data Format unchanged as Any
    • Click the Load Schema icon

      • Select equities as the Database
      • Select close as the Table
    • Click Load to apply the schema

    • Click Next
  7. In the Configure Writer screen:

    • Define the following settings:

      Setting Value
      Database equities
      Table close

      Keep the default values for the following settings.

      Setting Value
      Write Direct to HDB Unchecked
      Deduplicate Stream Checked
      Set Timeout Value Unchecked
    • Click Open Pipeline.

  8. In the Create Pipeline screen:

    • Enter close-1 as the Pipeline Name
    • Click Select a Package and select equities
    • Click Create
  9. Click Save.

  10. Click on Packages in the left-hand menu to open the Packages index.

  11. Click on the three dots to the right of the equities package and click Deploy.

    The Deploy Package dialog opens. Click Deploy.

    Deploy equities package

    Deploying the package deploys the database and its associated pipelines. These read the data from its source, transform it into a kdb+ compatible format, and write it to the database.

  12. Click on equities under Databases in the left-hand menu to open the database screen. Hover over the status beside the Name field to view the deployment progress of the database and its pipelines.

    Status of package and pipelines

    The data is loaded when:

    • The equities-1 pipeline reaches a Status of Running
    • The close-1 pipeline reaches a Status of Finished

Query the data

  1. Click + on the ribbon menu and click Query.

  2. Click the SQL tab and enter the following command to retrieve a count of streaming events in the trade table:

    SELECT COUNT(*) FROM trade
    
    Define the Output Variable as t.

  3. Click Run Query. Re-run the query to get an updated value. The results are displayed in the lower part of the Query screen, as shown below.

    Results

  4. Right-click in the Console and click Clear to clear the results.

  5. In the SQL tab, replace the existing code with:

    SELECT * FROM close
    
  6. Define the Output Variable as t.

  7. Click Run Query. The results are displayed in the Console, as illustrated in the example below, which shows end-of-day S&P historic prices.

    A SQL query reporting end-of-day S&P historic prices.
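
Once an output variable t is defined, you can also inspect it with basic q in the Scratchpad, for example:

    count t   / number of rows in the result
    meta t    / column names, types, and attributes
    -5#t      / the last five rows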

Create powerful analytics

Using the data, a trade signal is created to generate a position in the market (S&P).

  1. Replace the text in the code editor with the following query:

    SELECT * FROM trade
    
  2. Set the Output Variable to t.

  3. Click Run Query.

  4. In the output section, toggle between Console, Table or Visual for different views of the data. For example, a chart of prices over time.

    • Click Run Query after changing tabs. The following screenshot shows a SQL query against streaming trade data, viewed in the Table tab.

    A SQL query against streaming trade data, viewed in the Table tab, as a table or a time series chart.

Long and short position simple moving averages

This strategy calculates two simple moving averages: a fast moving average over a short window, and a slow moving average over a longer window.

When the fast line crosses above the slow line, we get a buy trade signal, with a sell signal on the reverse cross.

This strategy is also always in the market: when a signal occurs, it not only exits the previous trade but also opens a new trade in the opposite direction, going long (buy low, sell high) when the fast line crosses above the slow line, or short (sell high, buy back low) when it crosses below.

  1. In the Q tab of the Scratchpad panel, enter the following code to create a variable analytics that calculates the fast and slow moving averages.

    // taking the close price we calculate two moving averages:
    // 1) shortMavg on a window of 10 sec and
    // 2) longMavg on a window of 60 sec
    analytics : select timestamp,
                    price,
                    shortMavg:mavg[10;price],
                    longMavg:mavg[60;price]
                    from t
    
  2. Click Run All.

This strategy uses a 10-period moving average for the fast signal, and a 60-period moving average for the slow signal. Streamed updates for the moving average calculations are added to the chart, as illustrated below.

A "fast" and "slow" moving average zoomed in the visual chart view.
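
mavg computes a simple moving average over a sliding window, averaging whatever is available until the window fills. A quick illustration you can run in the Scratchpad:

    q)mavg[3;1 2 3 4 5f]
    1 1.5 2 3 4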

Trade position

A crossover of the fast and slow moving averages creates a trade. These trades can be tracked with a new variable called positions.

  1. In the Scratchpad, append the following:

    // when shortMavg and longMavg cross each other we create the position
    // 1) +1 to indicate the signal to buy the asset or
    // 2) -1 to indicate the signal to sell the asset
    positions : select timestamp,
                    position:?[shortMavg<longMavg;-1;1],
                    return:log[price % prev price]
                    from analytics
    
  2. Click Run All. A position analytic to track crossovers between the fast and slow moving averages is displayed in the query visual chart, as shown in the following screenshot.

q/kdb+ functions explained

If any of the above code is new to you, don't worry; we have detailed the q functions used above:

Function Description
?[x;y;z] Returns y where x is true, otherwise returns z
x<y Returns true when x is less than y, otherwise false
log Returns the natural logarithm
x%y Divides x by y
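
A quick illustration of the vector conditional and of log returns, using hypothetical prices:

    q)?[101 99 102f<100;-1;1]   / -1 where the price is below 100, else 1
    1 -1 1
    q)log 102%100               / one-period log return
    0.01980263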

A position analytic to track crossovers between the "fast" and "slow" moving averages; signals displayed in the query visual chart.

Active versus passive strategy

The current, active strategy is compared to a passive strategy; that is, a strategy tracking a major index like the S&P 500, or a basket of stocks such as an ETF. We want to know whether our strategy performs better.

  • Read more about active and passive strategies

  • In the Scratchpad, append the following code to create a new variable called performance, which generates a benchmark against which to compare strategy performance.

    performance : select timestamp,
                        benchmark: exp sums return,
                        strategy: exp sums return * prev position
                    from positions
    
  • Click Run All.

q/kdb+ functions explained

If any of the above code is new to you, don't worry; we have detailed the q functions used above:

Function Description
exp Raises e to a power, where e is the base of natural logarithms
sums Calculates the cumulative sum
* Multiplies
prev Returns the previous item in a list
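
Because q evaluates right to left, strategy: exp sums return * prev position compounds each period's return scaled by the position held during it. A toy illustration with hypothetical log returns:

    r:0.01 -0.02 0.015   / hypothetical per-period log returns
    exp sums r           / cumulative compounded value, as used for the benchmark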

The results are plotted on a chart, as shown below, which shows that the active strategy outperforms the passive benchmark.

The active moving average crossover strategy is compared to a passive benchmark; the strategy outperforms the benchmark as illustrated in the chart.

Create a View

So far in this tutorial, we have conducted ad hoc comparative analysis in the Scratchpad, with the results displayed in the Results tabs. To perform more complex analysis and visualizations we can create a View.

Saving analytics to a database

Before creating a View, we must save the analytics to a database. The quickest way to do this is to modify the data pipeline used by the analytic.

  1. Open the equities-1 pipeline template by selecting it from the left-hand pipeline menu.

  2. If the pipeline is still running, click Teardown and select Clean up resources after teardown.

  3. From the pipeline template view, add a Split node from the list of Function nodes.

    • Right-click the link between the Map and Apply Schema nodes, and click Remove Edge.
    • Connect the Map node to the Split node.
    • Connect the Split node to the Apply Schema node.

    The Function Split node positioned between the Function Map node and the Transform schema node; drag and connect the edge points of a node to reconnect pipeline nodes.

    Read more about the Function Split node here.

  4. Add a Timer Window node to the pipeline workspace. The Timer Window aggregates incoming data by time.

    • Connect the Timer Window to the second edge point of the Split node, as illustrated in the following screenshot.

    • Click on the Timer Window node and, in the Configure Timer Window Node panel, add the value 0D00:00:05 to Period. This ensures the timer runs every 5 seconds.

    Keep the remainder of the settings as default and click Apply.

    Setting Value
    Skip Empty Windows Unchecked
    Use Count Trigger Unchecked
    Accept Dictionaries Checked

    The Window Timer Window node is configured to fire every 5 seconds. Connect it to the Function Split node.

    Read more about the Window Timer Window node here.

  5. Next we define the business logic for the analytic in a Map node.

    • Add a Map node from the Functions list and connect it to the Timer Window node.
    • Click on the Map node and, in the Configure Map Node panel, replace the code displayed with the following code.
    // the following code is calculating the fields:
    // 1) vwap = weighted average price, adjusted by volume, over a given time
    // 2) twap = weighted average price over a given time
    // 3) open = price at which a security first trades when an exchange opens for the day
    // 4) high = highest price security is traded at
    // 5) low = lowest price security is traded at
    // 6) close = price at which a security last trades when an exchange closes for the day
    // 7) volume = volume of trades placed
    {[data]
        select last timestamp,
           vwap:|[1;volume] wavg price,
           twap:last[price]^(next[timestamp]-timestamp) wavg price,
           open:first price,
           high:max price,
           low:min price,
           close: last price
        from update "P"$timestamp,
             volume:|[1;volume]
        from data
        }
    
    kdb+/q functions explained

    If any of the above code is new to you, don't worry; we have detailed the q functions used above:

    • first/last to get first/last record
    • | to return the greater number between 1 and volume
    • ^ to replace any nulls with the previous price
    • wavg to get weighted average
    • - to subtract
    • max to get maximum value
    • min to get minimum value
    • "P"$ to cast to timestamp datatype
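
    For example, wavg weights each price by the corresponding volume (hypothetical values):

        q)2 1 3 wavg 10 20 30f   / (2*10 + 1*20 + 3*30) % 2+1+3
        21.66667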

    The Function Map node contains the business logic for the analytic. This is connected to the Timer Window node in the pipeline.

  6. Click Apply.

  7. The data generated by the analytic must be transformed and written to the database.

    • Add an Apply Schema node to the pipeline workspace and connect it to the Map-1 function node.
    • Click on the Apply Schema-1 node and, in the Configure Apply Schema Node panel, click the Load Schema icon.
    • Select equities as the database and analytics as the table and click Load.
    • Click Apply

    Apply Schema

    • Add a kdb Insights Database writer node to the pipeline workspace and connect it to the Apply Schema-1 node.
    • Click on the kdb Insights Database-1 node and, in its configuration panel, click the Load Schema icon.
    • Select equities as the database and analytics as the table and click Load.
    • Click Apply

    Connect a Transform Apply Schema and a Writer kdb Insights Database node to the Function Map node used by the analytic. Ensure the analytics table is used for both the Transform and Writer nodes.

  8. Click Save & Deploy. A successful deployment shows a Status of Running under Running Pipelines on the Overview page.

Load and review the ready-made view

To create a View:

  1. Click + on the ribbon menu and choose View to open the Create View screen.

    • Set the View Name to equities-temp (Note that this is not the name of the final view).
    • Click Create new package on the Select a package drop-down, and enter equities-view as the Package Name.
    • Click Create
  2. Upload the ready-made View:

    1. Click here to display a JSON representation of the ready-made Equities View in a new browser window.
    2. Select all the text in the new browser window and copy it into an empty text file using the editor of your choice.
    3. Save the file on your machine. The file must be saved with a .json extension.
    4. Drag the file over the View, towards the top left-hand corner. When the workspace is highlighted in blue and Drop Dashboards to Import is displayed in the center of the workspace, drop the file.

    5. Set the View Name to equities

    6. Click Select a package and choose equities-view.
    7. Click Create

    The View is displayed.

  3. The Chart view uses both live and historic data to give the big picture of the market.

    The chart at the top of the View, S&P Intraday, plots a live feed of the price data and the calculated VWAP analytic generated by the equities-1 pipeline.

    The chart in the lower part of the View, Daily Price Asset, is a time series chart of historic index data ingested by the Amazon S3 object storage pipeline.

    The following screenshot shows the new View.
    Access the Equities view from the left-hand menu of the Overview page. The top chart is a live feed of price and the calculated VWAP analytic. The lower chart is a time series chart of historic index data.

  4. The Price Table tracks positions in the market.

    In the top chart, a green highlight indicates a Long trade (a buyer) and a red highlight a Short trade (a seller).

    A similar highlight is used in the table on the left and in the volume bars of the lower chart.

The following screenshot illustrates the Chart view showing the strategy position in the market; long trades are marked in green and short trades in red.

Next Steps

  • Learn how to build and run a machine learning model to create stock predictions in real time on this financial dataset in the next tutorial.

Further Reading