Skip to content

Streaming subway data to a Data Grid in Views

Beta - For evaluation and trial use only

Streaming to Views is currently in beta.

The purpose of this walkthrough is to showcase the ability of kdb Insights Enterprise to visualize streaming data in real-time in a View, by streaming the latest co-ordinates of a set of NY City subway trains to a data grid.

We have provided a Kafka subway feed for use in this walkthrough. This feed generates live alerts for NYC subway trains, tracking which trains are stopped at stations. For each train it provides arrival time, station location coordinates, direction and route details.

In this walkthrough you will:

Create a pipeline

Begin by opening the Pipeline editor by clicking [+] on the ribbon menu and clicking Pipeline, as shown below, or by clicking [+] beside Pipelines on the left-hand menu.

Click pipeline from the ribbon dropdown menu to open the pipeline editor.

The pipeline editor opens, as shown below.

pipeline editor.

The pipeline requires the following:

Once these are configured you can save and deploy the pipeline. You can then:

Add a Kafka node

You begin your pipeline with a Kafka node to read the data from the subway feed we have provided.

  1. Click-and-drag a Kafka node, from the Readers, into the workspace.

  2. Select the Kafka node and add the following connection details to the Configure Kafka Node panel:

    setting value
    Broker kafka.trykdb.kx.com:443
    Topic subway
    Offset End
    Use TLS Unchecked
    Use Schema Registry Unchecked
  3. Expand the Advanced parameters section and tick Use Advanced Kafka Options. Click the [+] to add an advanced configuration and enter the following key value pairs:

    key value
    sasl.username demo
    sasl.password demo
    sasl.mechanism SCRAM-SHA-512
    security.protocol SASL_SSL
  4. Click Apply to apply these changes to the Kafka node.

Add a Decoder node

Kafka event data is in JSON and has to be decoded to a kdb+ friendly format (a kdb+ dictionary).

  1. Click-and-drag a JSON decoder node, from the Decoders, into the workspace, and connect it to the Reader node.

  2. Keep the default JSON decoder settings.

Add a Map node

The decoded data needs to be converted to a kdb+ table. The Map node converts an incoming dictionary to a table using an enlist.

  1. Click-and-drag a Map node, from the Functions, into the workspace, and connect it to the Decoder node.

  2. Select the Map node and replace the code in the right-hand Configure Map Node property panel with the following code:

    {[data]
        enlist data
        }
    
  3. Click Apply to apply these changes to the Map node.

Add a Transform Node

The next step is to add a Transform node which transforms the subway data fields to kdb+ compatible types and drops the fields that are not going to be displayed in the View.

Converting strings to symbols for use as keyed columns in the Subscriber node

The data from Kafka that is used in this example includes strings. The Subscriber node requires any columns defined as keyed columns to be symbols. This step is required to convert the strings to symbols.

If you are using a Subscriber node with your own data and the data in your keyed columns are already symbols this step is not necessary. See the next section for details on keyed columns.

  1. Click-and-drag the Apply Schema node from the list of Transform nodes and connect it to the Decoder node.

  2. In the Schema section click the [+] icon under the "add schema columns below" text. Then add the following columns, leaving Parse Strings set to Auto for all columns:

    column name colomn type
    arrival_time Timestamp
    stop_name Symbol
    stop_lat Float
    stop_lon Float
    direction_id Symbol
    route_long_name Symbol
    route_color Symbol
  3. Click Apply to apply changes to the node.

Add a Subscriber node

A Subscriber node emits a summarized view of the input data, with one record per unique combination of values in the list of keyed columns. The remaining columns are set to their latest values.

Once deployed, the emitted stream of data is available as a data source in a View. This is described in the Create a View section later.

In this example the node is configured so that it emits updates to the co-ordinates of trains per route and direction of travel (i.e. inbound or outbound).

  1. Click-and-drag the Subscriber node, from the list of Writers into the central workspace, and connect it to the Apply Schema node.

  2. Select the Subscriber node and set the Table to subway.

  3. Add the following Keyed Column values:

    value
    route_long_name
    direction_id

    Defining these keyed columns ensures that:

    • The output of the Subscriber node has one record per train route and direction.

    • You can filter these columns in the View, as you can only filter on Keyed Columns.

  4. Click Apply to apply these settings.

Configure the pipeline settings

An environment variable needs to be added to the pipeline to ensures the subscriber node updates the stream every 10th of a second.

  1. Click on the Settings tab for the pipeline.

  2. Scroll down to the Environment Variables and click Add Variable. Add the following variable details.

    variable value details
    KXI_SP_SUBSCRIBER_PUBLISH_INTERVAL 100 This determines how regularly the subscriber updates the web-socket

Save and deploy the pipeline

The pipeline is now configured and ready to be saved and deployed.

  1. Enter subway-streaming in the Name field in the top left of the workspace.

  2. Click Save & Deploy in the top panel.

  3. Check the progress of the pipeline under the Running Pipelines panel of the Overview tab. The data is ready to visualize when the Status is set to Running. Note it may take a few minutes for the pipeline to start running.

Create a View

Now that the data is being streamed to a pipeline containing a Subscriber node, you can create a View with a Data Grid to display the locations of the subway trains that are stopped at stations.

  1. Select Visualize on the Overview page, or View from the ribbon menu, as shown below.

    Select View from the menu to build a visualization.

  2. Click-and-drag a Data Grid, which is the first component on the list, from the icon menu into the central workspace.

    Click-and-drag a datagrid into the workspace

  3. In the Data Grid component, Click to populate Data Source to open the Data editor.

  4. Click New to create a new data source and name it streaming.

  5. Click on the Streaming radio button and set the following properties:

value setting
pipeline This is automatically populated with any running pipelines with a Subscriber node. Select subway-streaming.
table This is automatically populated with tables associated with the pipeline selected. Choose subway.
filters leave empty
  1. Click Execute to check the results of the new query. The results are displayed in the Results panel at the bottom of the screen, as shown below. Verify that the streaming position values, with updates visible as data is streamed, are displayed.

Streaming properties

  1. Click Apply and then Select Item.

  2. With the Data Grid selected, set the Sort Column, in the Basics properties, to route_long_name.

  3. Save the View by clicking the Save Dashboard icon at the top of the workspace.

The Data Grid now displays a record per train showing the most recent stop along the NY City subway.

Streaming to a data grid

Filtering the Data Grid

Now that you have a View that is being updated with streamed data you can add filters, based on any of the Keyed Columns, to show a subset of the data.

You can only filter on the Keyed Columns, filtering on any other column causes a timeout.

  1. With the Data Grid selected in the new View, open the Data Source and choose the streaming Data Source from the list on the left-hand side.

  2. Type in one of the filters below, which are examples of how filtering can be applied:

    filter filter description
    {"direction_id":"inbound"} All inbound trains
    {"route_long_name":"6 Avenue Express"} All trains on the 6 Avenue Express route

    Current filter restrictions

    • Filters must be defined using the following JSON format: {"keyed_column":"value"}.
    • Only one keyed column can be filtered on.
    • Only one value can be applied to the keyed column.
  3. Click Execute to check that the results, displayed in the Results panel at the bottom of the screen, have applied the filter to the data.

  4. Click Apply and then Select Item.

The Grid now only lists records that match the filter being applied. The following example shows the grid being updated with data for all inbound trains only. Streaming data inbound trains

Advanced Filtering using View States

If you are familiar with View States you can pair the value in this filter with another component, for example a dropdown, so that you can choose the filter value in the View.

Add a map

If you have a access to an API Key for Google Maps Platform you can add a Map component to your View to display the subway data.

This step is optional. If you want to proceed you must create an API Key for Google Maps Platform as described here.

To add a map to your View:

  1. In the design mode for your View, search for the Map component in the list of components, on the left, and drag it into the workspace.

    Find the map component in the list of dashboard components.

  2. Click on the Map component and configure it as follows:

    1. Click "set Google Maps JavaScript API Key in Map Key Property". The Map Key is saved as a view state inside your view.

      The Map Key is saved as a view state inside your view.

    2. Click New and name the node mapkey.

    3. In the Properties section for mapkey, leave Type set to Symbol. Enter your API Key for Google Maps into the Default and Value properties and click Select Item.

      The Map Key is saved as a view state inside your view.

    4. Centralize the map on New York by setting the values, in the following table, in the Map properties. These are illustrated in the screenshot below the table.

      setting value
      CenterX 40.75
      CenterY -73.98

      Define Map properties.

  3. Define the Data Source from which the component will get data:

  4. Click on Data Source in Points properties.

  5. Create a new data source and name it streaming-map.

!!!note "If you are want to share the filtering options that you setup for the Data Grid, select the existing streaming data source here."

  1. Click on the Streaming radio button and set the following properties:

    value     | setting
    --------- | -------
    pipeline  | subway-streaming
    table     | subway
    filters   | leave empty
    
  2. Click Execute to check the results are displayed in the Results panel at the bottom of the screen, as shown below. This displays the streaming position values with updates visible as data is streamed.

    Streaming properties

  3. Click Apply and then Select Item.

  4. To ensure the trains are named, positioned and coloured on the map based on the streaming data set the Points properties as follows:

    properties value
    Selected Attr route_long_name
    Latitude Data stop_lat
    Longitude Data stop_long
    Shape train
    Shape Colour (From DB) route_colour
    Cluster Unchecked
  5. Save the View by clicking the Save Dashboard icon at the top of the workspace.

Your View now contains a data grid and a map updating for all inbound and outbound trains, as illustrated below.

Streaming View

Further Reading