How to perform aggregations and filtering in queries

This page shows you how to aggregate and filter query results and tables in KDB-X.

Overview

In-built qSQL aggregations
Grouping with by
Filtering with fby
Bucketing

qSQL aggregations

sum is one of many in-built qSQL aggregations. Other in-built aggregations include, but are not limited to:

avg - average (mean)
med - median
min - minimum value
max - maximum value
count - number of values

Refer to the Mathematics and statistics reference for more built-in aggregation functions.

Sample aggregation queries:

q)show t:([]b:`s`g`a`s`a;c:30 10 43 13 24;g:til 5)
b c  g
------
s 30 0
g 10 1
a 43 2
s 13 3
a 24 4

q)select sum c, sum g from t
c   g
------
120 10

q)select maxC:max c, minG:min g from t
maxC minG
---------
43   0

Grouping with by

Unlike SQL, qSQL lets you group and aggregate separately. The easiest way to group similar values together is to use the by clause.

When used without an aggregation, by returns a list of values from the selected column.

To extract useful insights from data, grouping is best used with aggregation functions as shown below:

q)show t:([]b:`s`g`a`s`a;c:30 10 43 13 24;g:til 5)
b c  g
------
s 30 0
g 10 1
a 43 2
s 13 3
a 24 4

q)select c by b from t
b| c
-| -----
a| 43 24
g| ,10
s| 30 13

q)select c, g by b from t
b| c     g
-| ---------
a| 43 24 2 4
g| ,10   ,1
s| 30 13 0 3

q)select sum c, sum g by b from t
b| c  g
-| ----
a| 67 6
g| 10 1
s| 43 3

q)select avgC:avg c, maxG:max g by b from t
b| avgC maxG
-| ---------
a| 33.5 4
g| 10   1
s| 21.5 3

Filtering with fby

fby is a q function short for filter-by.

It is commonly used within a qSQL where clause, to apply an aggregation function to values from one column based on groups defined in another column(s).

Think of fby when you find yourself trying to apply a filter to the aggregated column of a table produced by:

select … by …

That is, use fby when trying to apply a filter on groups.

The syntax of fby is (aggregation;data) fby group.

If you are grouping on multiple columns, the syntax is (aggregation;data) fby ([]col1;...;coln).

Examples:

q)show trade:([]time:asc 09:00+01:00*1+til 12;sym:12#`IBM`AAPL`MSFT;exchange:(3#`CME),9#`N;price:10*1+til 12;size:100*10+til 12)
time  sym  exchange price size
------------------------------
10:00 IBM  CME      10    1000
11:00 AAPL CME      20    1100
12:00 MSFT CME      30    1200
13:00 IBM  N        40    1300
14:00 AAPL N        50    1400
15:00 MSFT N        60    1500
16:00 IBM  N        70    1600
17:00 AAPL N        80    1700
18:00 MSFT N        90    1800
19:00 IBM  N        100   1900
20:00 AAPL N        110   2000
21:00 MSFT N        120   2100

q)/ filter last row by sym from table
q)select from trade where i=(last;i) fby sym
time  sym  exchange price size
------------------------------
19:00 IBM  N        100   1900
20:00 AAPL N        110   2000
21:00 MSFT N        120   2100

q)/ filter rows where price >= avg price per sym
q)select from trade where price>=(avg;price) fby sym
time  sym  exchange price size
------------------------------
16:00 IBM  N        70    1600
17:00 AAPL N        80    1700
18:00 MSFT N        90    1800
19:00 IBM  N        100   1900
20:00 AAPL N        110   2000
21:00 MSFT N        120   2100

q)/ filter rows where price >= avg price per sym/exchange combination
q)select from trade where price>=(avg;price) fby ([]sym;exchange)
time  sym  exchange price size
------------------------------
10:00 IBM  CME      10    1000
11:00 AAPL CME      20    1100
12:00 MSFT CME      30    1200
16:00 IBM  N        70    1600
17:00 AAPL N        80    1700
18:00 MSFT N        90    1800
19:00 IBM  N        100   1900
20:00 AAPL N        110   2000
21:00 MSFT N        120   2100

Bucketing

You can bucket or group data based on time intervals in q using the xbar function.

If we wanted to get the last price and total volume traded in 10 minute buckets, we would run:

q)trade:([]time:asc 2025.10.20D09:00+0D00:01*100?100;sym:100?`IBM`AAPL`MSFT;price:100?100f;size:100?1000)

q)10#select from trade where sym=`IBM
time                          sym price    size
-----------------------------------------------
2025.10.20D09:00:00.000000000 IBM 68.55156 971
2025.10.20D09:13:00.000000000 IBM 13.91127 288
2025.10.20D09:15:00.000000000 IBM 56.47709 184
2025.10.20D09:17:00.000000000 IBM 53.65595 918
2025.10.20D09:18:00.000000000 IBM 2.800259 610
2025.10.20D09:25:00.000000000 IBM 8.102522 25
2025.10.20D09:26:00.000000000 IBM 65.50997 948
2025.10.20D09:27:00.000000000 IBM 38.55824 981
2025.10.20D09:30:00.000000000 IBM 98.05118 157
2025.10.20D09:31:00.000000000 IBM 26.22009 579

q)3#select last price, sum size by 10 xbar time.minute from trade where sym=`IBM
minute| price    size
------| -------------
09:00 | 68.55156 971                    / 09:00 - 09:09:59.99 bucket
09:10 | 2.800259 2000                   / 09:10 - 09:19:59.99 bucket
09:20 | 38.55824 1954                   / 09:20 - 09:29:59.99 bucket

If we wanted to get the volume weighted average price and last price traded in 15 minute buckets, we would run:

q)select lastPx:last price, vwapPx:size wavg price by 15 xbar time.minute from trade where sym=`IBM
minute| lastPx   vwapPx
------| -----------------
09:00 | 13.91127 56.05244
09:15 | 38.55824 44.05012
09:30 | 1.555512 24.05598
09:45 | 70.10042 49.45901
10:00 | 41.3496  54.44861
10:15 | 99.16535 78.97235
10:30 | 47.14092 67.92183

If we wanted to compute the total volume traded every hour, we would run:

q)select sum size by 60 xbar time.minute from trade where sym=`IBM
minute| size
------| ----
09:00 | 7891
10:00 | 8931

Summary

In this guide, you:

Learned how to use in-built qSQL aggregation functions to aggregate data
Learned how to group and aggregate data using the by clause
Learned how to filter data with fby
Learned how to bucket data in time intervals using xbar

You now have the essential skills to perform aggregations and filtering in queries in KDB-X.