# C client for kdb+

There are three cases in which to use the C API for kdb+:

1. Dynamically-loaded library called by q, e.g. OS, math, analytics (see Using C functions).
2. Dynamically-loaded library doing callbacks into q, e.g. feedhandlers (Bloomberg client)
3. C/C++ clients talking to kdb+ servers (standalone applications), e.g. feedhandlers and clients. Links with c.o/c.dll.

## Two sets of files

To minimize dependencies for existing projects, there are now two sets of files available.

The e set of files, those with SSL/TLS support, contain all the functionality of the c files.

Do not link with both c and e files; just choose one set.

### Linux

| capability | dependencies | 32-bit | 64-bit |
|---|---|---|---|
| no SSL/TLS | | l32/c.o | l64/c.o |
| SSL/TLS | OpenSSL | l32/e.o | l64/e.o |

### macOS

| capability | dependencies | 32-bit | 64-bit |
|---|---|---|---|
| no SSL/TLS | | m32/c.o | m64/c.o |
| SSL/TLS | OpenSSL | m32/e.o | m64/e.o |

### Windows

c.lib is a stub library which loads c.dll and resolves the functions dynamically; e.lib does the same for e.dll.

We no longer ship c.obj or cst.obj; they have been replaced by c_static.lib and cst_static.lib, and are complemented by e_static.lib and est_static.lib – these static libraries have no dependency on the aforementioned DLLs.

cst continues to represent ‘single-threaded’ apps, those which on Windows have issues due to the LoadLibrary API.

## Overview

The best way to understand the underpinnings of q, and to interact with it from C, is to start with the header file available from KxSystems/kdb/c/c/k.h.

This is the file you will need to include in your C or C++ code to interact with q from a low level.

Let’s explore the basic types and their synonyms that you will commonly encounter when programming at this level. First though, it is worth noting the size of data types in 32- versus 64-bit operating systems to avoid a common mistake.

To provide succinct composable names, the q header defines synonyms for the common types as in the following table:

| type | synonym |
|---|---|
| 16-bit int | H |
| 32-bit int | I |
| 64-bit int | J |
| char* | S |
| unsigned char | G |
| char | C |
| 32-bit float | E |
| 64-bit double | F |
| void | V |
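As a rough illustration, the synonyms can be pictured as plain typedefs. The sketch below is a local stand-in written from the table, not a copy of k.h; it assumes a platform where `short`, `int` and `long long` are 16, 32 and 64 bits wide, and the two helper functions are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the synonyms in the table above.
   Assumption: these mirror the table, not the exact text of k.h. */
typedef short H;          /* 16-bit int    */
typedef int I;            /* 32-bit int    */
typedef long long J;      /* 64-bit int    */
typedef char *S;          /* char*         */
typedef unsigned char G;  /* unsigned char */
typedef char C;           /* char          */
typedef float E;          /* 32-bit float  */
typedef double F;         /* 64-bit double */
typedef void V;           /* void          */

/* Hypothetical helper: returns 1 when the fixed-width synonyms have
   the sizes the table promises on this platform, 0 otherwise. */
int synonym_sizes_match_table(void) {
    return sizeof(H) == 2 && sizeof(I) == 4 && sizeof(J) == 8 &&
           sizeof(E) == 4 && sizeof(F) == 8 &&
           sizeof(G) == 1 && sizeof(C) == 1;
}

/* Hypothetical helper: S is the one synonym whose width follows the
   pointer size, so it is 4 bytes on 32-bit and 8 bytes on 64-bit builds. */
size_t symbol_pointer_size(void) {
    assert(sizeof(S) == sizeof(V *));
    return sizeof(S);
}
```

Calling `synonym_sizes_match_table()` once at start-up is a cheap way to catch a toolchain whose integer widths differ from the table.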

With this basic knowledge, we can now tackle the types available in q and their matching C types and accessor functions provided in the C interface. We will see shortly how the accessor functions are used in practice.

| q type name | q type number | encoded type name | C type | size in bytes | interface list accessor function |
|---|---|---|---|---|---|
| mixed list | 0 | - | K | - | kK |
| boolean | 1 | KB | char | 1 | kG |
| guid | 2 | UU | U | 16 | kU |
| byte | 4 | KG | char | 1 | kG |
| short | 5 | KH | short | 2 | kH |
| int | 6 | KI | int | 4 | kI |
| long | 7 | KJ | int64_t | 8 | kJ |
| real | 8 | KE | float | 4 | kE |
| float | 9 | KF | double | 8 | kF |
| char | 10 | KC | char | 1 | kC |
| symbol | 11 | KS | char* | 4 or 8 | kS |
| timestamp | 12 | KP | int64_t | 8 | kJ |
| month | 13 | KM | int | 4 | kI |
| date | 14 | KD | int | 4 | kI (days from 2000.01.01) |
| datetime | 15 | KZ | double | 8 | kF (days from 2000.01.01) |
| timespan | 16 | KN | int64_t | 8 | kJ (nanoseconds) |
| minute | 17 | KU | int | 4 | kI |
| second | 18 | KV | int | 4 | kI |
| time | 19 | KT | int | 4 | kI (milliseconds) |
| table/flip | 98 | XT | - | - | x->k |
| dict/table with primary keys | 99 | XD | - | - | kK(x)[0] for keys and kK(x)[1] for values |
| error | -128 | - | char* | 4 or 8 | x->s |

Note that the type numbers given are for vectors of that type. For example, 9 for vectors of the q type float. By convention, the negative value is an atom: -9 is the type of an atom float value.

## The K object structure

The q types are all encapsulated at the C level as K objects. (Recall that k is the low-level language underlying the q language.) K objects are all instances of the following structure (note this is technically defining K objects as pointers to the k0 structure but we’ll conflate the terms and refer to K objects as the actual instance).

• for V3.0 and later
typedef struct k0{
  signed char m,a;   // m,a are for internal use.
  signed char t;     // The object's type
  C u;               // The object's attribute flags
  I r;               // The object's reference count
  union{
    // The atoms are held in the following members:
    G g;H h;I i;J j;E e;F f;S s;
    // The following members are used for more complex data.
    struct k0*k;
    struct{
      J n;           // number of elements in vector
      G G0[1];
    };
  };
}*K;

• prior to V3.0 it is defined as
typedef struct k0 {
  I r;                   // The object's reference count
  H t, u;                // The object's type and attribute flags
  union {                // The data payload is contained within this union.
    // The atoms are held in the following members:
    G g;H h;I i;J j;E e;F f;S s;
    // The following members are used for more complex data.
    struct k0*k;
    struct {
      I n;               // number of elements in vector
      G G0[1];
    };
  };
}*K;


As an exercise, it is instructive to count the minimum and the maximum number of bytes a K object can use on your system, taking into account any padding or alignment constraints.
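One possible way to approach the exercise is to let the compiler report the layout. The sketch below declares a local copy of the V3.0+ structure under the hypothetical name `demo_k0` (so it does not depend on, or clash with, k.h) and measures it with `offsetof`. It assumes a C11 compiler, since the layout relies on anonymous unions and structs, and it counts only the struct layout, not whatever the q memory manager reserves around an object.

```c
#include <assert.h>
#include <stddef.h>

typedef char *S;
typedef char C;
typedef unsigned char G;
typedef short H;
typedef int I;
typedef long long J;
typedef float E;
typedef double F;

/* Local copy of the V3.0+ layout shown above, renamed for this sketch. */
typedef struct demo_k0 {
    signed char m, a;
    signed char t;
    C u;
    I r;
    union {
        G g; H h; I i; J j; E e; F f; S s;
        struct demo_k0 *k;
        struct {
            J n;        /* number of elements in vector */
            G G0[1];    /* first byte of the vector data */
        };
    };
} demo_k0;

/* Bytes before the payload union: m, a, t, u, r plus any padding. */
size_t header_bytes(void) { return offsetof(demo_k0, j); }

/* Bytes an atom occupies in this layout: header plus an 8-byte member. */
size_t atom_bytes(void) { return offsetof(demo_k0, j) + sizeof(J); }

/* Bytes a vector of n items occupies: everything up to G0, then the items. */
size_t vector_bytes(size_t n, size_t item_width) {
    assert(item_width > 0);
    return offsetof(demo_k0, G0) + n * item_width;
}
```

On a typical x86-64 build this reports an 8-byte header, 16 bytes for an atom, and 16 bytes plus the item data for a vector; other ABIs may pad differently, which is the point of the exercise.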

Given a K object x, we can use the accessors noted in the table above to access elements of the object. For example, given a K object containing a vector of floats, kF(x)[42] accesses the item at index 42; indices are zero-based, so this is the 43rd item of the vector. For accessing atoms, use the following accessors:

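| type | accessor | additional types |
|---|---|---|
| byte | x->g | boolean, char |
| short | x->h | |
| int | x->i | month, date, minute, second, time |
| long | x->j | timestamp, timespan |
| real | x->e | |
| float | x->f | datetime |
| symbol | x->s | error |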

Changes in V3.0

The k struct changed with the release of V3.0, and if you are compiling using the C library (c.o/c.dll) stamped on or after 2012.06.25 you should ensure you use the correct k struct by defining KXVER accordingly, e.g.

gcc -D KXVER=3 …

If you need to link against earlier releases of the C library, you can obtain those files from the earlier version of 2011.04.20.

## Examining K objects

Whether you know beforehand the type of the K objects, or you are writing a function to work with different types, it is useful to dispatch based on the type flag x->t for a given K object x.

Where x->t is:

• negative, the object is an atom, and we should use the atom accessors noted above.
• greater than zero, we use the vector accessors as all the elements are of the same type (e.g. x->t == KF for a vector of q floats).
• exactly zero, the K object contains a mixed list of other K objects. Each item in the list is a pointer to another K object. To access each item of x we use the kK object accessor. For example: kK(x)[42] accesses the item at index 42 of the mixed list (indices are zero-based).
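The dispatch described above can be sketched without k.h by classifying the type number alone. Everything here is illustrative: `q_kind` and `classify_type` are hypothetical names, the numeric values come from the type table earlier in this document, and type numbers not listed in that table are simply reported as `KIND_OTHER`.

```c
#include <assert.h>

/* Categories a caller might dispatch on (hypothetical names). */
typedef enum {
    KIND_ERROR, KIND_ATOM, KIND_MIXED_LIST,
    KIND_VECTOR, KIND_TABLE, KIND_DICT, KIND_OTHER
} q_kind;

/* Classify a type number as found in x->t.
   Values follow the type table above: 1..19 are vectors (3 is not
   listed), their negatives are atoms, 0 is a mixed list, 98 a table,
   99 a dict, and -128 an error. */
q_kind classify_type(int t) {
    assert(t >= -128 && t <= 127);       /* t is stored as a signed char */
    if (t == -128) return KIND_ERROR;
    if (t == 0)    return KIND_MIXED_LIST;
    if (t == 98)   return KIND_TABLE;
    if (t == 99)   return KIND_DICT;
    if (t == 3 || t == -3) return KIND_OTHER;
    if (t >= -19 && t <= -1) return KIND_ATOM;
    if (t >= 1 && t <= 19)   return KIND_VECTOR;
    return KIND_OTHER;                   /* anything the table does not list */
}
```

In real code the result would select between the atom accessors (`x->i`, `x->f`, ...) and the list accessors (`kI(x)`, `kF(x)`, `kK(x)`); checking for the error type first avoids treating `-128` as an ordinary atom.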

## Nulls and infinities

The next table provides the null and infinite immediate values for the q types. These are constants defined in k.h.

| type | null | infinity |
|---|---|---|
| short | 0xFFFF8000 (nh) | 0x7FFF (wh) |
| int | 0x80000000 (ni) | 0x7FFFFFFF (wi) |
| long | 0x8000000000000000 (nj) | 0x7FFFFFFFFFFFFFFF (wj) |
| float | log(-1.0) on Windows or (0/0.0) on Linux (nf) | -log(0.0) on Windows or (1/0.0) on Linux (wf) |
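To make the table concrete, here is a small sketch of how code might test for these values. It does not use k.h: the `DEMO_` constants and the helper functions are hypothetical stand-ins whose values are taken from the table (the integer nulls are the most negative value of each width, the float null is a NaN, and the float infinity is IEEE positive infinity).

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Stand-ins for nh/ni/nj and wh/wi/wj, written with <stdint.h> limits.
   0xFFFF8000 in the table is -32768 sign-extended to 32 bits. */
#define DEMO_NH INT16_MIN
#define DEMO_WH INT16_MAX
#define DEMO_NI INT32_MIN
#define DEMO_WI INT32_MAX
#define DEMO_NJ INT64_MIN
#define DEMO_WJ INT64_MAX

/* Stand-ins for nf and wf, computed at run time as 0/0.0 and 1/0.0. */
double demo_nf(void) { volatile double zero = 0.0; return zero / zero; }
double demo_wf(void) { volatile double zero = 0.0; return 1.0 / zero; }

int is_null_short(int16_t h) { return h == DEMO_NH; }
int is_null_int(int32_t i)   { return i == DEMO_NI; }
int is_null_long(int64_t j)  { return j == DEMO_NJ; }

/* A NaN is the only value that compares unequal to itself. */
int is_null_float(double f)  { return f != f; }

/* Positive infinity is the only value greater than DBL_MAX. */
int is_infinite_float(double f) { return f > DBL_MAX; }

/* Small self-check tying the helpers back to the table. */
void check_null_helpers(void) {
    assert(is_null_short(DEMO_NH) && is_null_int(DEMO_NI) && is_null_long(DEMO_NJ));
    assert(is_null_float(demo_nf()));
    assert(is_infinite_float(demo_wf()));
    assert(!is_null_float(0.0) && !is_null_int(0));
}
```

Note that a float null cannot be detected with `==`, which is why the sketch compares the value with itself instead.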

Null objects can be created using ks(""),kh(nh),ki(ni),kj(nj),kc(" "), etc. A null guid can be created with U g={0};ku(g);

## Managing memory and reference counting

Although memory in q is managed for the programmer implicitly, when interfacing from C or C++ we must (as is usual in those languages) manage memory explicitly. The following functions are provided to interface with the q memory manager.

| purpose | function |
|---|---|
| Increment the object’s reference count | r1(K) |
| Decrement the object’s reference count | r0(K) |
| Free up memory allocated for the thread’s pool | m9() |
| Set whether interning symbols uses a lock | setm(I) |

A reference count indicates the usage of an object, allowing the same object to be used by more than one piece of code.

If you create a K object through one of the ‘generator’ functions (ki, kj, knk, etc), you automatically have a reference to that object. Once you have finished using that object, you should call r0.

r0(ki(5));


creates and immediately destroys an integer object.

Initialise the kdb+ memory system

Before calling any 'generator' functions in a standalone application, you must initialise the kdb+ internal memory system. (It is done automatically when you open a connection to other kdb+ processes.) Without making a connection, use khp("",-1);

In the case of a function being called from q

K myfunc(K x)
{
return ki(5);
}


the object is returned to q, and q will eventually decrement the reference count.

If the argument x passed in from q is itself to be returned to q, its reference count must be incremented with r1.

K myfunc(K x)
{
return r1(x);
}


It is vital to increment and decrement when adding or removing references to values that should be managed by the q runtime, to avoid memory leaks or access faults due to double frees.

Note that K objects must be freed from the thread they are allocated within, and m9() should be called when the thread is about to complete, freeing up memory allocated for that thread's pool. Furthermore, to allow symbols to be created in other threads, setm(1) should be called from the main thread before any other threads are started.

When a K object is created, it usually has a reference count of 0 – exceptions are common constants such as (::) which may vary in their current reference count, as they may be used by other areas of the C API library or q. If r0 happens to be passed a K object with a reference count of 0, that object’s memory is freed (returned to an internal pool). Be aware that if a reference count is >0, you should very likely not change the data stored in that object as it is being referenced by another piece of code which may not expect the change. In this case, create a new copy of the object, and change that.

If in doubt, the current reference count can be seen in C with

printf("Reference count for x is %d\n",x->r);


and in q with

-16!x
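The counting rules described above can be modelled with a toy object. This is a sketch of the bookkeeping only, under the assumption that a fresh object starts at a count of 0 and is freed when a decrement arrives at that count; `toy`, `toy_new`, `toy_r1`, `toy_r0` and `toy_live_count` are hypothetical names and have nothing to do with the real K struct or the q memory manager.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy object: NOT the kdb+ K struct, just a counter and a payload. */
typedef struct toy { int r; int value; } toy;

static int toy_live = 0;   /* number of toy objects currently allocated */

/* Like a generator function: the caller owns the new object. */
toy *toy_new(int value) {
    toy *x = malloc(sizeof *x);
    assert(x != NULL);
    x->r = 0;              /* a fresh object starts with a count of 0 */
    x->value = value;
    toy_live++;
    return x;
}

/* Like r1: record one more user and hand the same object back. */
toy *toy_r1(toy *x) { x->r++; return x; }

/* Like r0: free the object when the count is already 0,
   otherwise just drop one reference. */
void toy_r0(toy *x) {
    if (x->r == 0) { free(x); toy_live--; }
    else x->r--;
}

int toy_live_count(void) { return toy_live; }
```

With this model, `toy_r0(toy_new(5))` allocates and immediately frees an object, while an object that has been through `toy_r1` once survives the first `toy_r0` and is freed by the second.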


The function k, as in

K r=k(handle,"functionname",params,(K)0);


requires a little more explanation.

If the handle is

• ≥0, it is a generator function, and can return 0 (indicating a network error) or a pointer to a K object. If that object has type -128, it indicates an error, accessible as a null-terminated string in r->s. When you have finished using this object, it should be freed by calling r0(r).
• <0, this is for async messaging, and the return value can be either 0 (network error) or non-zero (success). This result should not be passed to r0.

K objects passed as parameters to the k function call have their reference counts decremented automatically on the return from that call. (To continue to use the object later in that C function, after the k call, increment the reference count before the call.)

K r=k(handle,"functionname",r1(param),(K)0);


## Creating atom values

To create atom values the following functions are available. Function ka creates an atom of the given type, and the rest create an atom with the given value:

| purpose | call |
|---|---|
| Create an atom of type | K ka(I); |
| Create a boolean | K kb(I); |
| Create a guid | K ku(U); |
| Create a byte | K kg(I); |
| Create a short | K kh(I); |
| Create an int | K ki(I); |
| Create a long | K kj(J); |
| Create a real | K ke(F); |
| Create a float | K kf(F); |
| Create a char | K kc(I); |
| Create a symbol | K ks(S); |
| Create a timestamp | K ktj(-KP,J); |
| Create a time | K kt(I); |
| Create a date | K kd(I); |
| Create a timespan | K ktj(-KN,J); |
| Create a datetime | K kz(F); |

An example of creating an atom:

K z = ka(-KI);
z->i = 42;


Equivalently:

K z = ki(42);


## Creating lists

To create

• a simple list K ktn(I type,J length);
• a mixed list K knk(I n,...);

where length is a non-negative, non-null integer.

Limit of length

Before V3.0, length had to be in the range 0…2147483647, and was type I. See KXVER sections in k.h.

For example, to create an integer list of 5 we say ktn(KI,5). A mixed list of 5 items can be created with ktn(0,5) but note that each element must be initialized before further usage. A convenient shortcut to creating a mixed list when all items already exist at the creation of the list is to use knk, e.g. knk(2,kf(2.3),ktn(KI,10)). As we've noted, the type of a mixed list is 0, and the elements are pointers to other K objects – hence it is mandatory to initialize those n elements either via knk params, or explicitly setting each item when created with ktn(0,n).

To join

• an atom to a list: K ja(K*,V*);
• a string to a list: K js(K*,S);
• another K object to a list: K jk(K*,K);
• another K list to the first: K jv(K*,K);

The join functions assume there are no other references to the list, as the list may need to be reallocated during the call. If it is reallocated, the K* pointer passed in is updated to refer to the new K object, which is also returned from the function.

K x=ki(42);
K list=ktn(0,0);
jk(&list,x); // append a k object to a list

K vector=ktn(KI,0);
int i=2;
ja(&vector,&i); // append a primitive int to an int vector

K syms=ktn(KS,0);
S sym=ss("IBM");
js(&syms,sym); // append an interned symbol to a symbol vector

K more=ktn(KS,2);
kS(more)[0]=ss("INTC");
kS(more)[1]=ss("GOOG");
jv(&syms,more); // append a vector with two symbols to syms


## Strings and datetimes

Strings and datetimes are special cases and extra utility functions are provided:

| purpose | function |
|---|---|
| Create a char array from string | K kp(string); |
| Create a char array from string of length n | K kpn(string, n); |
| Intern a string | S ss(string); |
| Intern n chars from a string | S sn(string,n); |
| Convert q date to yyyymmdd integer | I dj(date); |
| Encode a year/month/day as q date, e.g. 0==ymd(2000,1,1) | I ymd(year,month,day); |

Recall that Unix time is the number of seconds since 1970.01.01D00:00:00 while q time types have an epoch of 2000.01.01D00:00:00.

q)`long$`timestamp$2000.01.01
0
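The difference between the two epochs is 10957 days (30 years including 7 leap days), or 946684800 seconds. A minimal sketch of the conversion follows; the function and macro names are hypothetical, and it assumes the q timestamp is held as nanoseconds since 2000.01.01 and the q date as days since 2000.01.01, as in the type table earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Days and seconds from 1970.01.01 to 2000.01.01. */
#define EPOCH_OFFSET_DAYS    10957
#define EPOCH_OFFSET_SECONDS (INT64_C(10957) * 86400)   /* 946684800 */
#define NANOS_PER_SECOND     INT64_C(1000000000)

/* Unix seconds -> q timestamp (nanoseconds since 2000.01.01). */
int64_t unix_seconds_to_q_timestamp(int64_t unix_seconds) {
    int64_t since_2000 = unix_seconds - EPOCH_OFFSET_SECONDS;
    /* keep the multiplication below within the range of int64_t */
    assert(since_2000 > -INT64_C(9223372036) && since_2000 < INT64_C(9223372036));
    return since_2000 * NANOS_PER_SECOND;
}

/* q timestamp -> Unix seconds; the sub-second part is truncated toward zero. */
int64_t q_timestamp_to_unix_seconds(int64_t q_nanos) {
    return q_nanos / NANOS_PER_SECOND + EPOCH_OFFSET_SECONDS;
}

/* Unix day number (days since 1970.01.01) -> q date. */
int32_t unix_days_to_q_date(int32_t unix_days) {
    return unix_days - EPOCH_OFFSET_DAYS;
}
```

For example, Unix time 946684800 maps to q timestamp 0, matching the q session above.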


q)writeFd:(`$"./efd")2:(`writeFd;2)
q)fd:newFd 0 / arg is start value of eventfd counter
q)onCallback:{0N!(x;y)}
q)writeFd[fd;3] / increments the eventfd counter by 3, triggering the callback later


This demonstrates the deferred invocation of onCallback until q has at least finished processing the current handle or script.

In situations where you can’t hook a feedhandler’s callbacks directly into sd1, on Linux eventfd may be a viable option for you.

Windows developers may be interested in ncm/selectable-socketpair.

Callbacks from sd1 are executed on the main thread of q, in the handle context (.z.w) of the registered handle, and hence are also subject to permissions checks.

## Serialization and deserialization

The K b9(I,K) and K d9(K) functions serialize and deserialize K objects.

b9(mode,kObject);


will generate a K byte vector that contains the serialized data for kObject. Since V3.0, for shared libraries loaded into q the value for mode must be -1. For standalone applications binding with c.o/c.dll, or shared libraries prior to V3.0, the values for mode can be

| value | effect |
|---|---|
| -1 | valid for V3.0+ for serializing/deserializing within the same process |
| 0 | unenumerate, block serialization of timespan and timestamp (for working with versions prior to 2.6) |
| 1 | retain enumerations, allow serialization of timespan and timestamp (useful for passing data between threads) |
| 2 | unenumerate, allow serialization of timespan and timestamp |
| 3 | unenumerate, compress, allow serialization of timespan and timestamp |

d9(kObject);


will deserialize the byte stream in kObject returning a new kObject. The byte stream passed to d9 is not altered in any way. If you are concerned that the byte vector that you wish to deserialize may be corrupted, call okx to verify it is well formed first.

unsigned char bytes[]={0x01,0x00,0x00,0x00,0x0f,0x00,0x00,0x00,0xf5,0x68,0x65,0x6c,0x6c,0x6f,0x00}; // -8!`hello
K r,x=ktn(KG,sizeof(bytes));
memcpy(kG(x),bytes,x->n);
int ok=okx(x); // verify the byte vector is well formed
if(ok){
  r=d9(x); // deserialize into a new K object
  r0(x);
}
else
  perror("bad data");


## Miscellaneous

The K dot(K x, K y) function is the same as the q function .[x;y].

q).[{x+y};(1 2;3 4)]
4 6


The dynamic link function, K dl(V* f, I n), takes a C function that would take n K objects as arguments and return a new K object, and returns a q function. It is useful, for example, to expose more than one function from an extension module.

#include "k.h"
Z K1(f1){R r1(x);}
Z K2(f2){R r1(y);}
K1(lib){K y=ktn(0,2);x=ktn(KS,2);xS[0]=ss("f1");xS[1]=ss("f2");
  kK(y)[0]=dl(f1,1);kK(y)[1]=dl(f2,2);R xD(x,y);}


Alternatively, for simpler editing of your lib API:

#define sdl(f,n) (js(&x,ss(#f)),jk(&y,dl(f,n)))
K1(lib){K y=ktn(0,0);x=ktn(KS,0);sdl(f1,1);sdl(f2,2);R xD(x,y);}


With the above compiled into lib.so:

q).lib:(`:lib 2:(`lib;1))`
q).lib.f1 42
42
q).lib.f2 . 42 43
43


## Debugging with gdb

It can be a struggle printing q values from a debugger, but you can call the handy k.h macros in gdb like xt, xC, xK, …

If your client is a shared library, you might get away with p k(0,"show",r1(x),(K)0)

GDB Manual: §12. C Preprocessor Macros

Now, we compile the program using the GNU C compiler, gcc. We pass the -gdwarf-2 and -g3 flags to ensure the compiler includes information about preprocessor macros in the debugging information.

$ gcc -gdwarf-2 -g3 sample.c -o sample
$


Now, we start gdb on our sample program:

$ gdb -nw sample
GNU gdb 2002-05-06-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, ...
(gdb)


And all you need is

gcc -g3 client.c -o client
gdb ./client


… get signal, go up stack frame with up:

Thread 1 "sdl" received signal SIGSEGV, Segmentation fault.
0x000000000040711b in nx ()
(gdb) up
#1  0x0000000000407411 in nx ()
(gdb)
#2  0x0000000000407411 in nx ()
(gdb)
#3  0x0000000000408a15 in b9 ()
(gdb)
#4  0x0000000000409ac2 in ww ()
(gdb)
#5  0x0000000000409d33 in k ()
(gdb)
#6  0x000000000040410d in main (n=1, v=0x7fffffffdf68) at sdl.c:108
108     }else if(e.type==SDL_USEREVENT){K x=e.user.data1;A(!xt);A(xn==2);k(-c,"{value[x]y}",xK[0]->s,xK[1],(K)0);}


Now use k.h macros!

(gdb) p xt
$20 = 0 '\000'
(gdb) p xn
$21 = 2


so it’s a q list. Show two elements:

(gdb) p xK[0]->t
$23 = -11 '\365'
(gdb) p xK[0]->s
$24 = (S) 0x2078b98 "blink"
(gdb) p xK[1]->t
$25 = -7 '\371'
(gdb) p xK[1]->j
$27 = 0


which is a bit easier than:

(gdb) p *(((K*)(x->G0))[0])
$14 = {m = 0 '\000', a = 1 '\001', t = -11 '\365', u = 0 '\000', r = 0, {g = 152 '\230', h = -29800, i = 34048920, j = 34048920, e = 9.95829503e-38, f = 1.6822401649996936e-316, s = 0x2078b98 "blink", k = 0x2078b98, {n = 34048920, G0 = ""}}}
(gdb) p *(((K*)(x->G0))[1])
$13 = {m = 0 '\000', a = 0 '\000', t = -7 '\371', u = 0 '\000', r = 0, {g = 0 '\000', h = 0, i = 0, j = 0, e = 0, f = 0, s = 0x0, k = 0x0, {n = 0, G0 = "\002"}}}


## Windows and the LoadLibrary API

The q multithreaded C library (c.dll) uses static thread-local storage (TLS), and is incompatible with the LoadLibrary Win32 API. If you are writing an Excel plugin, this point is relevant to you, as loading of the plugin uses this mechanism.

Microsoft Knowledge Base: PRB: Calling LoadLibrary() to Load a DLL That Has Static TLS

When trying to use the library, the problem manifests itself as a crash during the khpu() call.

Hence Kx also provides at KxSystems/kdb a single-threaded version of this library as w32/cst.dll and w64/cst.dll, which do not use TLS. To use this library:

• download cst.dll and cst.lib
• rename them to c.dll/c.lib
• relink and ensure that c.dll is in your path

If in doubt whether the c.dll you have uses TLS, run

dumpbin /EXPORTS c.dll


and look for a .tls entry under the summary section. If it is present, the DLL uses TLS and is the wrong library to link with for use with Excel add-ins.

...
Summary

4000 .data
1000 .rdata
1000 .reloc
1000 .rsrc
7000 .text
1000 .tls


In some cases 2: may fail because of missing dependencies. Sadly, OS error messages are not always helpful.