Skip to content

Encoders

.qsp.encode.arrow

(Beta Feature) Serialize data into Arrow format

Beta Features

To enable beta features, set the environment variable KXI_SP_BETA_FEATURES to true.

.qsp.encode.arrow[]
.qsp.encode.arrow[.qsp.use (!) . flip (
    (`schema      ; schema);
    (`matchCols   ; matchCols);
    (`payloadType ; payloadType))]

options:

name type description default
schema table or dict If a schema is provided, then it is used when serializing for every batch of data, giving improved performance. If not provided, schema is inferred at runtime for each batch of data. ::
matchCols bool This option is only used if a schema is provided and table data is passed in. If set to true, tables can be passed in with any column ordering, and the columns are reordered to match the provided schema. If false, no reordering is done, improving performance. 1b
payloadType symbol Indicates the message payload that will be encoded (one of auto, table, or arrays). `auto

For all common arguments, refer to configuring operators

This operator encodes kdb+ table or array data into an Arrow stream.

Only data made up of inferred list types is supported. These types can be found at https://code.kx.com/q/interfaces/arrow/arrow-types . Importantly, there is support for data containing only vector columns/arrays. However for general list type column/arrays, support is only provided for string or byte list columns/arrays.

Encode a kdb+ table to an Arrow stream, where the schema and payload type are inferred:

input:([]c1: til 5; c2: 10.1 10.2 10.3 10.4 10.5; c3: (enlist "a";"bbb";"ccc";"ddd";enlist "e"))

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.arrow[]
    .qsp.write.toConsole[]

publish input
0xffffffffd80000001000000000000a000c0006000500..

Encode data to an Arrow stream, where the schema is specified and payload type is arrays :

schema: ([] c1: "j"$(); c2: "f"$(); c3: "h"$())
payLoadType: `arrays
input: (1 2 3j; 10.1 10.2 10.3f; 10 20 30h)

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.arrow[.qsp.use (!) . flip (
        (`schema; schema);
        (`payloadType; payLoadType)
        )]
    .qsp.write.toConsole[]

publish input
0xffffffffe00000001000000000000a000c0006000500..

.qsp.encode.csv

Convert data into CSV format

.qsp.encode.csv[]
.qsp.encode.csv[delimiter]
.qsp.encode.csv[delimiter; .qsp.use enlist[`header]!enlist header]

Parameters:

name type description default
delimiter character Field separator for the records in the encoded data. ","

options:

name type description default
header symbol Whether encoded data starts with a header row, either none, first or always. first

For all common arguments, refer to configuring operators

This operator encodes tables into CSV data - strings with delimiter-separated values

Encode to CSV from a table:

// Generate a random table of data.
n: 10000
t: ([]
  date: n?.z.d + neg til 10;
  price: (n?1000)div 100;
  item: n?`3;
  description: {rand[100]?.Q.an}each til n;
  quantity: n?10000)

.qsp.run
  .qsp.read.fromCallback[`publish]
  .qsp.encode.csv["|"]
  .qsp.write.toConsole[]

publish t
2022.11.02D09:08:38.553535126 | "date|price|item|description|quantity"
2022.11.02D09:08:38.553535126 | "2022-11-01|2|mkp|F5e|9406"
2022.11.02D09:08:38.553535126 | "2022-10-31|1|igb|1lYkM405Ugfdk_VZLS6XW9zgtkqyNzgoB603WpmS9xV6Hx3Mtjd|4648"
2022.11.02D09:08:38.553535126 | "2022-10-28|9|kij|41EYGoQFLXlPVw50EWPvnkjSyB9HaB7qYtRcu69n|2261"
2022.11.02D09:08:38.553535126 | "2022-10-30|5|dih|DDBVC3uUbhpwdgEQsIt|8552"
2022.11.02D09:08:38.553535126 | "2022-11-02|4|dpb|zXYM60LQRdCQliV1HTtkZmgLrSNLqcKqj0teyswl61vmFkiC1MyatnwgGiAhtXGmX5QBVrOjFHQUt9s|8941"
2022.11.02D09:08:38.553535126 | "2022-10-26|1|hfa|HZ6fSzhC8rt_Zqd18|316"
..

We also allow Encoding to CSV from a dictionary where all values have the same count:

show d: flip 3?t
date       | 2022.10.30  2022.10.30  2022.10.29
price      | 4           4           1
item       | gih         gih         gdi
description| "tFO1lLCuD" "tFO1lLCuD" "E3XKfjSRYr4MJW3RYtbinQTB3vjVIWlXbvRhUllaoIbviWXBYEb4OH2CsOvWaJd6Cqiyndg"
quantity   | 3815        3815        4770

publish d
2022.11.03D11:26:45.226049104 | "2022-10-30|4|gih|tFO1lLCuD|3815"
2022.11.03D11:26:45.226049104 | "2022-10-30|4|gih|tFO1lLCuD|3815"
2022.11.03D11:26:45.226049104 | "2022-10-29|1|gdi|E3XKfjSRYr4MJW3RYtbinQTB3vjVIWlXbvRhUllaoIbviWXBYEb4OH2CsOvWaJd6Cqiyndg|4770"

.qsp.encode.json

Serialize data into JSON format

.qsp.encode.json[]
.qsp.encode.json[.qsp.use enlist[`split]!enlist split]

options:

name type description default
split boolean By default, batches are encoded as single JSON objects. Split encodes each value in a given batch separately. When the input is a table, this encodes each row as its own JSON object. 0b

For all common arguments, refer to configuring operators

This operator encodes a table as a JSON array of dictionaries.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.json[]
    .qsp.write.toVariable[`output]

publish ([] date: .z.d; time: .z.t; price: 3?100f; quantity: 3?100f)

// Format the output to help with viewing
-1 "},\n {" sv "},{" vs output;
[{"date":"2022-02-10","time":"17:46:36.545","price":48.29742,"quantity":73.15636},
 {"date":"2022-02-10","time":"17:46:36.545","price":51.24386,"quantity":3.691923},
 {"date":"2022-02-10","time":"17:46:36.545","price":51.5731,"quantity":97.58372}]

Encodes a table row by row in JSON

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.json[.qsp.use``split!11b]
    .qsp.write.toVariable[`output]

publish ([] date: .z.d; time: .z.t; price: 3?100f; quantity: 3?100f)

// View the output
-1 output;
{"date":"2022-02-10","time":"17:46:36.545","price":48.29742,"quantity":73.15636}
{"date":"2022-02-10","time":"17:46:36.545","price":51.24386,"quantity":3.691923}
{"date":"2022-02-10","time":"17:46:36.545","price":51.5731,"quantity":97.58372}

.qsp.encode.protobuf

Serialize data into Protocol Buffers

.qsp.encode.protobuf[message; file]
.qsp.encode.protobuf[message; file; .qsp.use (!) . flip (
    (`format     ; format);
    (`payloadType; payloadType))]

Parameters:

name type description default
message string or symbol The name of the Protocol Buffer message type to encode. Required
file symbol The path to a .proto file containing the message type definition. Required

options:

name type description default
format string A string definition of the Protocol Buffer message format to encode. ""
payloadType symbol Indicates the message payload that will be encoded (one of auto, table, dict, array, or arrays). auto

For all common arguments, refer to configuring operators

This operator encodes Protocol Buffer encoded messages from dictionaries or arrays of values.

Import paths

To import your .proto file, the folder containing the .proto file is added as an import path. This means the folder will be scanned when importing future .proto files, so it is important that you avoid having .proto files with the same filename present in import paths you use.

Encode Protobuf messages using a Person.proto file:

// Person.proto
syntax="proto3";
message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
}
.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.protobuf[`Person;"Person.proto"]
    .qsp.write.toConsole[];

publish `name`id`email!("name"; 101i; "email@email.com")
2021.11.10D12:01:49.089788900 | "\n\004name\020e\032\017email@email.com"

Encode Protobuf messages using format from arrays:

format: "syntax=\"proto3\";
 message Person {
    string name = 1;
    int32 id = 2;
    string email = 3;
 }";

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.encode.protobuf[.qsp.use `message`format!(`Person;format)]
    .qsp.write.toConsole[];

publish ("name"; 101i; "email@email.com")
2021.11.10D12:01:49.089788900 | "\n\004name\020e\032\017email@email.com"