How to perform a Non-Transformed TSS search

This page details how to execute a Non-Transformed Temporal Similarity Search (Non-Transformed TSS) search in KDB.AI.

Tip: For the best experience, we recommend reading about KDB.AI Non-Transformed TSS first.

Before we dive in, go to the Understanding Non-Transformed TSS search page to learn about this method.

To use the Non-Transformed TSS search, you don't need to extract vectors from the time series. The algorithm performs the following actions:

Takes simple time series (numerical sequence stored in a kdb+ column) as input.
Scans the time series with a sliding window (of same size as the query vector; size can change between two queries).
Computes the list of distances between the query vector and each occurrence of the sliding window.
Returns the k-nearest neighbors.

Setup

Before you start, make sure you have:

An active KDB.AI Cloud or Server license
Installed the latest version of KDB.AI Cloud or Server
A valid API key if you're using KDB.AI Cloud
Python Client

To store and search temporal data using the Non-Transformed TSS method, follow these steps:

Import dependencies
Create schema
Insert data
Perform searches

1. Import dependencies

Start by importing the following dependencies:

Python

import sys
import kdbai_client as kdbai
from pprint import pprint # for pretty printing
import pandas as pd
import numpy as np

2. Create schema

Open a KDB.AI session to create a schema:

PythonRESTq

session = kdbai.Session()
session.database('default').tables # check what tables are already created

schema = [
            {"name": "realTime", "type": "datetime64[ns]"},
            {"name": "sym", "type": "str"},
            {"name": "price", "type": "float64"},
            {"name": "size", "type": "int32"},
            ]

table = session.database('default').create_table('trade', schema)

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables \
--header 'Content-Type: application/json' \
--data '{  
"table":"trade",
"schema": [
    {
    "name": "realTime",
    "type": "timestamp"
    },
    {
    "name": "sym",
    "type": "symbol"
    },
    {
    "name": "price",
    "type": "float"
    },
    {
    "name": "size",
    "type": "int"
    }
]
    }' | jq .

`gw set hopen 8082;

dims:10;
mySchema:flip `name`type!(`realTime`sym`price`size;`p`s`f`j);

// create
p:`database`table`schema!(`default;`trade;mySchema);
gw(`createTable;p);

or alternatively, if you have an existing kdb+ table on disk and would like to create from it, run below:-

PythonRESTq

trade = db.create_table(table="trade",external_data_references=[{"path":b'/db', "provider" :"kx"}])

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables \
--header 'Content-Type: application/json' \
--data '{  
"table":"trade",
"externalDataReferences": [{
    "path": "/db",
    "provider": "kx"
    }]
    }' | jq .

`gw set hopen 8082;

ref:enlist `path`provider!("/db";`kx);
p:`database`table`externalDataReferences!(`default;`trade;ref);
gw(`createTable;p);

3. Insert data

Create the data df that contains the time series column price:

Pythonq

numRows = 40

df = pd.DataFrame()
df['realTime'] = sorted(np.random.randint(sys.maxsize, size=numRows).astype('datetime64[ns]'))
df['sym'] = np.random.choice(['aaa', 'bbb'], size=numRows).astype('str')
df['price'] = [x.astype('float64') for x in np.random.rand(numRows)]
df['size'] = np.random.randint(100, size=numRows).astype('int32')

N:100; 
t:([] realTime:asc N?0p;sym:N?`3; price:N?1f; size:til N);

Insert df into the table:

PythonRESTq

table.insert(df)

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/insert \
--header 'Content-Type: application/json' \
--data '{  
    "payload": [
            {
            "realTime": "2001.01.01D00:48:57.051633652",
            "sym": "aaa",
            "price": 1.1,
            "size": 42
            },
            {
            "realTime": "2002.01.01D00:48:57.051633652",
            "sym": "bbb",
            "price": 2.2,
            "size": 36
            },
            {
            "realTime": "2003.01.01D00:48:57.051633652",
            "sym": "ccc",
            "price": 3.4,
            "size": 24
            },
            {
            "realTime": "2004.01.01D00:48:57.051633652",
            "sym": "ddd",
            "price": 4.7,
            "size": 11
            }
        ]
    }' | jq .

r:gw(`insertData;`database`table`payload!(`default;`trade;t));

Run a query to check the contents of the table:

PythonRESTq

table.query()

curl -s -X POST localhost:8082/api/v2/databases/default/tables/trade/query | jq .

(gw(`query;`database`table!(`default;`trade)))[`result];

4. Perform searches

Now you can conduct a similarity search (searching along either the dense column) as below:

PythonRESTq

# single query search
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=3, type="tss")[0]

# multiple queries search
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4],[7,1,2,3,4,7,1,2,3,4]]}, n=3, type="tss")

# single query search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": 1,
    "type": "tss"
    }' | jq .

# multiple queries search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.3,2.3,3.3], [1.4,2.4,3.4]]},
    "n": 1,
    "type": "tss"
    }' | jq .

tqry1:enlist[`price]!enlist enlist 1.1 1.2 1.3; // single query search
tqry2:enlist[`price]!enlist (1.1 1.2 1.3;2.1 2.2 2.3); // multiple queries search
gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry1;10;`tss));
gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry2;10;`tss));

If it is a partitioned table and you would like to force search even thought the query is longer than the number of rows in some partitions, add force to options:-

PythonRESTq

trade.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=5, type="tss", options={'force':True})[0]

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": 1,
    "type": "tss",
    "options":{"force" : true}
    }' | jq .

first (gw(`search;`database`table`vectors`n`type`options!(`default;`trade;tqry1;3;`tss;(enlist `force)!(enlist 1b))))[`result];

You can also perform an outlier search along the dense column using a negative n:

PythonRESTq

table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=3, type="tss")[0] # similarity search
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=-3, type="tss")[0] # outlier search

# similarity search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": 1,
    "type": "tss"
    }' | jq .

# outlier search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": -1,
    "type": "tss"
    }' | jq .

tqry1:enlist[`price]!enlist enlist 1.1 1.2 1.3;
gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry1;3;`tss));  // similarity search
gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry2;-3;`tss)); // outlier search

Summary

By putting the above snippets of create/insert/search together, we obtain the below example snippet for the Non-transformed TSS method. If you're already familiar with the basic usage of KDB.AI, we attached a snippet with a Non-transformed TSS case so you can compare the two. Feel free to switch between the two tabs to spot the differences.

Example: Non-Transformed TSS search

PythonRESTq

import sys
import kdbai_client as kdbai
from pprint import pprint # for pretty printing
import pandas as pd
import numpy as np

session = kdbai.Session()
session.database('default').tables # check what tables are already created

schema = [
            {"name": "realTime", "type": "datetime64[ns]"},
            {"name": "sym", "type": "str"},
            {"name": "price", "type": "float64"},
            {"name": "size", "type": "int32"},
            ]

table = session.database('default').create_table('trade', schema)

numRows = 40

df = pd.DataFrame()
df['realTime'] = sorted(np.random.randint(sys.maxsize, size=numRows).astype('datetime64[ns]'))
df['sym'] = np.random.choice(['aaa', 'bbb'], size=numRows).astype('str')
df['price'] = [x.astype('float64') for x in np.random.rand(numRows)]
df['size'] = np.random.randint(100, size=numRows).astype('int32')

table.insert(df)
table.query()

table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=5, type="tss")[0]
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4],[7,1,2,3,4,7,1,2,3,4]]}, n=5, type="tss")
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=3, type="tss")[0] # similarity search
table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=-3, type="tss")[0] # outlier search

table.search(vectors={'price': [[0,1,2,3,4,0,1,2,3,4]]}, n=3, type="tss", options={"returnMatches":True})[0]  # return original values

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables \
--header 'Content-Type: application/json' \
--data '{  
"table":"trade",
"schema": [
    {
    "name": "realTime",
    "type": "timestamp"
    },
    {
    "name": "sym",
    "type": "symbol"
    },
    {
    "name": "price",
    "type": "float"
    },
    {
    "name": "size",
    "type": "int"
    }
]
    }' | jq .

curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/insert \
--header 'Content-Type: application/json' \
--data '{  
    "payload": [
            {
            "realTime": "2001.01.01D00:48:57.051633652",
            "sym": "aaa",
            "price": 1.1,
            "size": 42
            },
            {
            "realTime": "2002.01.01D00:48:57.051633652",
            "sym": "bbb",
            "price": 2.2,
            "size": 36
            },
            {
            "realTime": "2003.01.01D00:48:57.051633652",
            "sym": "ccc",
            "price": 3.4,
            "size": 24
            },
            {
            "realTime": "2004.01.01D00:48:57.051633652",
            "sym": "ddd",
            "price": 4.7,
            "size": 11
            }
        ]
    }' | jq .

curl -s -X POST localhost:8082/api/v2/databases/default/tables/trade/query | jq .

# single query search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": 1,
    "type": "tss"
    }' | jq .

# multiple queries search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.3,2.3,3.3], [1.4,2.4,3.4]]},
    "n": 1,
    "type": "tss"
    }' | jq .

# outlier search
curl -s -X POST http://localhost:8082/api/v2/databases/default/tables/trade/search \
--header 'Content-Type: application/json' \
--data '{  
    "vectors":{"price" : [[1.2,2.2,3.2]]},
    "n": -1,
    "type": "tss"
    }' | jq .

`gw set hopen 8082;

dims:10;
mySchema:flip `name`type!(`realTime`sym`price`size;`p`s`f`j);

// create
p:`database`table`schema!(`default;`trade;mySchema);
gw(`createTable;p);

// insert
N:100; 
t:([] realTime:asc N?0p;sym:N?`3; price:N?1f; size:til N);
r:gw(`insertData;`database`table`payload!(`default;`trade;t));

(gw(`query;`database`table!(`default;`trade)))[`result];

// search
tqry1:enlist[`price]!enlist enlist 1.1 1.2 1.3; // single query search
tqry2:enlist[`price]!enlist (1.1 1.2 1.3;2.1 2.2 2.3); // multiple queries search
first (gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry1;10;`tss)))[`result];
(gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry2;10;`tss)))[`result];
first (gw(`search;`database`table`vectors`n`type!(`default;`trade;tqry1;-10;`tss)))[`result]; // outlier search

first (gw(`search;`database`table`vectors`n`type`options!(`default;`trade;tqry1;10;`tss;enlist[`returnMatches]!enlist 1b)))[`result]; // return original pattern

As you can see in the above comparison, the main grammatical differences between running the Non-Transformed TSS search vs. other cases are:

	Non-Transformed TSS	Transformed TSS or Non-TSS
`type`	`tss`	`flat`, `hnsw` etc.
`dims`	Not required	Required
Entries in the search column	Scalars	Vectors
`pytype` of the search column	`float64`	`float32`
Outlier search	Available	N/A

Next steps

Now that you're familiar with a Non-Transformed TSS search, try the following:

Explore best practices and use cases on the KDB.AI Learning hub.
Discover our GitHub repo, open the sample or run the notebook directly in Google Colab.
Run the pattern matching notebook in Google Colab.
Download the pattern matching Jupyter notebook and accompanying files from GitHub.
Watch this YouTube video about Temporal Similarity Search for vector databases.