Stats

These features are in beta, and must be enabled by setting the $KXI_SP_BETA_FEATURES environment variable to "yes".

.qsp.stats
describe	calculate specific statistics
ema	calculate an exponential moving average
sma	calculate a simple moving average
twa	calculate a time weighted average

`.qsp.stats.describe`

.qsp.stats.describe[fields; stats]

Parameters:

name	type	description	default
fields	symbol or symbol[]	A list of column names to compute statistics on	Required
stats	symbol, symbol[], or list of tuples and symbols	A list of statistics which should be computed	Required

Statistic Options

name	type	description
minimum	symbol	Computes the minimum of each provided column
maximum	symbol	Computes the maximum of each provided column
range	symbol	Computes the range of each provided column
length	symbol	Counts the length of the batch provided
total	symbol	Computes the total sum of each provided column
average	symbol	Computes the average of each provided column
numDistinct	symbol	Counts the number of distinct elements in each provided column
numNull	symbol	Counts the number of null elements in each provided column
numInfinity	symbol	Counts the number of infinite elements in each provided column
median	symbol	Computes the median of each provided column
quartiles	symbol	Computes the quartiles of each provided column
frequency	symbol	Creates a frequency dictionary for each provided column
mode	symbol	Computes all modes of each provided column
sampleVar	symbol	Computes the sample variance of each provided column
sampleStd	symbol	Computes the sample standard deviation of each provided column
populationVar	symbol	Computes the population variance of each provided column
populationStd	symbol	Computes the population standard deviation of each provided column
standardError	symbol	Computes the standard error of each provided column
skew	symbol	Computes the Fisher-Pearson coefficient of skewness of each provided column
percentiles	tuple	Computes the specified percentiles on each provided column

Note: some statistics do not support categorical data and return a generic null for such data.

For all common arguments, refer to configuring operators

This operator computes the requested descriptive statistics on the provided columns.

This example computes the minimum, maximum, and average on a batch of data.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.stats.describe[`y; `minimum`maximum`average]
    .qsp.write.toVariable[`output];

publish ([] x: til 5; y: 10 13 1 9 8)
output
Expected output: ([] minimum_y: enlist 1; maximum_y: enlist 13; average_y: enlist 8.2)
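
As a rough cross-check, the same three statistics can be reproduced in plain q, outside of any pipeline, with a qSQL aggregation. This is only a sketch of what the statistics mean for this batch, not a description of how the operator is implemented; the table name t is illustrative.

```q
/ the batch published in the example above
t: ([] x: til 5; y: 10 13 1 9 8)

/ one-row table holding the minimum, maximum, and average of column y
select minimum_y: min y, maximum_y: max y, average_y: avg y from t
```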

This example demonstrates how to use the percentiles option. The operator below computes the mode and skew along with the 90th, 95th, and 99th percentiles.

Enlist for percentiles

If only percentiles are to be computed, the tuple must be enlisted.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.stats.describe[`x; (`mode; `skew; (`percentiles; 0.9 0.95 0.99))]
    .qsp.write.toVariable[`output];

publish ([] x: til 100)
output
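
The enlist requirement for a percentiles-only call is easier to see in plain q: a bare percentiles tuple is itself a two-item list, which is presumably why it has to be wrapped before it can be read as a list containing one statistic. A minimal illustration follows; the variable names are illustrative, and the final comment shows the enlisted form as it would be passed to the operator according to the note above.

```q
/ a single percentiles tuple: a general list with two items
tuple: (`percentiles; 0.9 0.95 0.99)
count tuple            / 2

/ enlisting wraps it into a one-item list of statistics
stats: enlist tuple
count stats            / 1

/ passed to the operator as:
/ .qsp.stats.describe[`x; enlist (`percentiles; 0.9 0.95 0.99)]
```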

`.qsp.stats.ema`

.qsp.stats.ema[X; alpha; y]

Parameters:

name	type	description	default
X	symbol or symbol[]	A list of column names on which to compute the average	Required
alpha	float	The decay rate	Required
y	symbol or symbol[]	The columns to write to. These can overwrite existing columns	The same as X

For all common arguments, refer to configuring operators

This calculates the exponential moving average for each data point.

This example replaces the columns x and y with their exponential moving averages.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.stats.ema[`x`y; .33]
    .qsp.write.toConsole[];

publish ([] x: til 10; y: 0 1 4 2 5 3 6 7 9 8)
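
For intuition, q has a built-in ema keyword that applies the usual recurrence: the first result is the first input, and each later result is alpha times the current value plus (1 - alpha) times the previous result. The sketch below applies it to the same batch in plain q. It assumes the operator follows the same recurrence, which this page does not state, and it says nothing about how state is carried between batches.

```q
t: ([] x: til 10; y: 0 1 4 2 5 3 6 7 9 8)

/ replace x and y with their exponential moving averages (alpha = 0.33)
update x: ema[.33; x], y: ema[.33; y] from t
```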

`.qsp.stats.sma`

.qsp.stats.sma[X; n; y]

Parameters:

name	type	description	default
X	symbol or symbol[]	A list of column names on which to compute the average	Required
n	long	The number of records to include in the average	Required
y	symbol or symbol[]	The columns to write to. These can overwrite existing columns	The same as X

For all common arguments, refer to configuring operators

This calculates, for each data point, the arithmetic mean of a moving window including that point and the n-1 prior data points.

This example replaces each value in y with the simple moving average of that value and the nine prior values.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.stats.sma[`y; 10]
    .qsp.write.toConsole[];

publish ([] x: til 10; y: 0 1 4 2 5 3 6 7 9 8)
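
Within a single batch, the same windowed mean can be sketched in plain q with the built-in mavg keyword. Note that mavg averages over however many values are available when fewer than n precede a point. Whether the operator treats the start of the stream the same way, or carries values across batches, is not stated on this page and is an assumption of this sketch.

```q
t: ([] x: til 10; y: 0 1 4 2 5 3 6 7 9 8)

/ replace y with its 10-point simple moving average
update y: 10 mavg y from t

/ smaller illustration: a 3-point window over 1 2 3 4
3 mavg 1 2 3 4         / 1 1.5 2 3
```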

`.qsp.stats.twa`

.qsp.stats.twa[X; times; range; y]

Parameters:

name	type	description	default
X	symbol or symbol[]	A list of column names on which to compute the average	Required
times	symbol	The name of the column containing the time data	Required
range	long, int or short	The number of records to include in the average	Required
y	symbol or symbol[]	The columns to write to. These can overwrite existing columns	The same as X

For all common arguments, refer to configuring operators

This calculates, for each data point, the weighted mean of a moving window including that point and the range-1 prior data points, with weights derived from the time deltas found in the times column.

Data must be sorted

The incoming data must be sorted by time, because the average is calculated using the deltas between consecutive timestamps. Out-of-order data would produce a negative delta and so apply a negative weight in the calculation.
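
To see where a negative weight would come from, consider the second offsets used in the first example below (0 5 6 17 14 21 ...), in which 14 arrives after 17. The following plain-q sketch, with illustrative variable names, compares the gaps between consecutive records before and after sorting.

```q
/ arrival-order offsets in seconds; 14 arrives after 17
secs: 0 5 6 17 14 21 57 58 71

/ gaps between consecutive records, dropping the first item of deltas
1 _ deltas secs        / 5 1 11 -3 7 36 1 13

/ after sorting, every gap is non-negative
1 _ deltas asc secs    / 5 1 8 3 4 36 1 13
```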

This example replaces each value in y with the time weighted average of that value and the nine prior values using weights derived from the time column.

.qsp.run
    .qsp.read.fromCallback[`publish]
    // The windowing is to ensure that records are sorted by timestamp
    .qsp.window.tumbling[00:01:00; `time; .qsp.use `sort`lateness!(1b; 00:00:10)]
    .qsp.stats.twa[`data; `time; 10]
    .qsp.write.toConsole[];

publish ([] time: 0p + 00:00:01 * 0 5 6 17 14 21 57 58 71;
            data: 10 20 10 9 11 8 21 10 9)

This example writes to columns c and d the time weighted averages of columns a and b respectively, each computed over the current value and the four prior values, with weights derived from the time column.

.qsp.run
    .qsp.read.fromCallback[`publish]
    .qsp.window.tumbling[00:00:01; `time; .qsp.use `sort`lateness!(1b; 00:00:01)]
    .qsp.stats.twa[`a`b; `time; 5; `c`d]
    .qsp.write.toConsole[];

publish ([] time: 0p + 00:00:00.1 * 0 8 13 17 19 21; a: 1 7 8 7 7 8; b: til 6);