kxi.sp.decode

Stream Processor decoders.

CSVHeader Objects

class CSVHeader(AutoNameEnum)

Enum for CSV header options.

These enum values can be provided as enum member objects (e.g. CSVHeader.always), or as strings matching the names of the members (e.g. 'always').

none

Encoded data never starts with a header row.

always

Encoded data always starts with a header row.
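The string-or-member behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the library's source: the `AutoNameEnum` base shown here is a hypothetical stand-in that assumes each member's value is simply its own name, and `normalize_header` is a helper invented for the example.

```python
from enum import Enum, auto


class AutoNameEnum(Enum):
    """Hypothetical stand-in: auto() gives each member its own name as value."""

    @staticmethod
    def _generate_next_value_(name, start, count, last_values):
        return name


class CSVHeader(AutoNameEnum):
    none = auto()    # encoded data never starts with a header row
    always = auto()  # encoded data always starts with a header row


def normalize_header(option):
    """Accept either a CSVHeader member or the string naming one."""
    # Enum lookup by value returns the member; passing a member returns it as-is.
    return CSVHeader(option)
```

Because each value equals the member name, `normalize_header('always')` and `normalize_header(CSVHeader.always)` resolve to the same member, and an unknown string raises `ValueError`.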

arrow

@Decoder
def arrow(*, as_list: Union[bool, kx.BooleanAtom] = False) -> Decoder

Decodes Arrow streams.

Arguments:

as_list - If true, the decoded result is a list of arrays corresponding only to the Arrow stream data. If false, the decoded result is a table corresponding to both the schema and the data in the Arrow stream.

Returns:

An arrow decoder, which can be joined to other operators or pipelines.

csv

@Decoder
def csv(schema: Union[str, kx.Table, dict[str, Union[str, bytes,
                                                     kx.CharAtom]]],
        delimiter: Union[str, bytes, kx.CharAtom] = kx.q('","'),
        *,
        header: CSVHeader = CSVHeader.always,
        exclude: Union[List[str], List[int], kx.SymbolVector,
                       kx.IntegralVector] = kx.q('`long$()'),
        schema_type: Union[str, bytes, kx.CharVector] = 'literal') -> Decoder

Parses CSV data to a table.

Arguments:

schema - The expected column types: a dictionary mapping column names to the desired output types, a dictionary mapping column names to type characters, or a list of type characters.
delimiter - The field separator for records in the encoded data. Defaults to a comma.
header - Whether encoded data starts with a header row.
exclude - Columns to exclude from the output, specified by name or by index.
schema_type - Either 'literal' or 'schema' indicating how to interpret the schema parameter.

Returns:

A csv decoder, which can be joined to other operators or pipelines.
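To make the roles of schema, delimiter, header, and exclude concrete, here is a rough standard-library analogue. It is a sketch under stated assumptions, not how the decoder is implemented: `decode_csv` and the `CASTS` table are hypothetical, only three type characters are mapped, column names are taken from the schema rather than from the header row, and the result is a plain dict of columns instead of a q table.

```python
import csv
import io

# Hypothetical mapping from a few type characters to Python converters.
CASTS = {"J": int, "F": float, "S": str}


def decode_csv(text, schema, delimiter=",", header=True, exclude=()):
    """Parse CSV text into a dict of typed columns (a rough table analogue).

    schema maps column name -> type character; its key order is the column
    order. exclude lists column names or zero-based column indices to drop.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if header and rows:
        rows = rows[1:]  # skip the header row; names come from the schema here
    table = {}
    for index, (name, type_char) in enumerate(schema.items()):
        if name in exclude or index in exclude:
            continue  # excluded columns never reach the output
        cast = CASTS[type_char]
        table[name] = [cast(row[index]) for row in rows]
    return table
```

For example, `decode_csv("id,price,sym\n1,2.5,a\n", {"id": "J", "price": "F", "sym": "S"}, exclude=["price"])` keeps only the `id` and `sym` columns, cast to integers and strings respectively.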

json

@Decoder
def json(*, decode_each: bool = False) -> Decoder

Parses JSON data.

JSON data should be a char vector or byte vector, and will be parsed using the .j.k function from q.

Arguments:

decode_each - By default, each message passed to the decoder is treated as a single JSON object. Setting decode_each to true parses each value of a message separately. This is useful when decoding data whose objects are separated by newlines, because the pipeline can process partial sets of the JSON file without holding the entire block in memory.

Returns:

A json decoder, which can be joined to other operators or pipelines.
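The difference between the two modes can be illustrated with Python's own `json` module. This is only an analogue: `decode_json` is a hypothetical helper, and the real decoder parses with q's .j.k, so the resulting types differ (for instance, .j.k returns JSON numbers as floats).

```python
import json


def decode_json(message, decode_each=False):
    """Pure-Python analogue of the two decoding modes."""
    if decode_each:
        # message is a sequence of JSON strings, e.g. a file split on newlines;
        # each value is parsed on its own.
        return [json.loads(value) for value in message]
    # message is one complete JSON document.
    return json.loads(message)
```

With newline-delimited input, the caller can split a chunk into lines and pass them with `decode_each=True`, so no single parse needs the whole file at once.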

protobuf

@Decoder
def protobuf(message: Union[str, bytes, kx.SymbolAtom],
             *,
             path: Optional[Union[Path, str, kx.SymbolAtom]] = None,
             format: Optional[CharString] = None,
             as_list: Union[bool, kx.BooleanAtom] = False) -> Decoder

Parses Protocol Buffer messages.

Arguments:

message - The name of the Protocol Buffer message type to decode.
path - A path to a .proto file containing the message type definition.
format - The Protocol Buffer message format to decode.
as_list - Whether the output should be a list of values instead of a dict.

Returns:

A protobuf decoder, which can be joined to other operators or pipelines.