White paper
Java API for kdb+¶
The Java programming language has been consistently popular for two decades, and is important in many development environments. Its longevity, and the compatibility of code between versions and operating systems, leaves the landscape of Java applications in many industries very much divided between new offerings and long-established legacy code.
Financial technology is no exception. Competition in this risk-averse domain drives it to push against boundaries. Production systems inevitably mix contemporary and legacy code. Because of this, developers need tools for communication and integration. Implementation risks must be kept to a strict minimum. KX technology is well-equipped for this issue. By design kdb+’s communication with external processes is kept simple, and reinforced with interface libraries for other languages.
The Java API for kdb+ is a Java library. It fits easily in any Java application as an interface to kdb+ processes. As with any API, potential use cases are many. To introduce kdb+ gradually into a wider system, such an interface is essential for any interaction with Java processes, upstream or downstream. The straightforward implementation keeps changes to legacy code lightweight, reducing the risk of wider system issues arising as kdb+ processes are introduced.
This paper illustrates how the Java API for kdb+ can be used to enable a Java program to interact with a kdb+ process. It first explores the API itself: how it is structured, and how it might be included in a development project. Examples are then provided for core use cases for the API in a standard setup. Particular consideration is given to how the API facilitates subscription and publication to a kdb+ tickerplant process, a core component of any kdb+ tick-capture system.
The examples presented here form a set of practical templates. These templates can be combined and adapted to apply kdb+ across a broad range of problem domains. They are available on GitHub.
API overview¶
The API is contained in a single source file on GitHub.
Inclusion in a development project is, therefore, a straightforward matter
of including the file with other source code under the package kx
, and
ensuring it is properly imported and referenced by other classes. If
preferred, it can be compiled separately into a class or JAR file to be
included in the classpath for use as an external library or uploaded to
a local repository for build integration.
As the API is provided as source, it is perfectly possible to customize code to meet specific requirements.
However, without prior knowledge of how the interactions work, this is not advised unless the solution to these requirements or issues are known.
It is also possible, and in some contexts encouraged, to wrap the
functionality of this class within a model suitable for your framework.
An example might be the open-source qJava library.
Although it is not compatible with the most recent kdb+ version at the time of writing, it shows how to use c.java
as a core over which an object-oriented framework of q types and functionality has been applied.
The source file is structured as a single outer class, c
.
Within it, a number of constants and inner classes together model an
environment for sending and receiving data from a kdb+ process.
This section explores the fundamentals of the class to provide context and understanding of practical use-cases for the API.
Connection and interface logic¶
The highly-recommended means of connecting to a kdb+ process using the API is through instantiation of the c
object itself.
Three constructors provide for this purpose:
public c(String host,int port,String usernamepassword)
public c(String host,int port,String usernamepassword,boolean useTLS)
public c(String host,int port)
These constructors are straightforward to use.
The host and port specify a socket-object connection, with the username/password string serialized and passed to the remote instance for authorization.
The core logic is the same for all; the host/port-only constructor attempts to retrieve the user string from the Java properties, and the constructor with the useTLS
boolean will, when flagged true, attempt to use an SSL socket instead of an ordinary socket.
It is also possible to set up the object to accept incoming connections
from kdb+ processes rather than just making them. There are two
constructors which, when passed a server socket reference, will allow a
q session to establish a handle against the c
object:
public c(ServerSocket s)
public c(ServerSocket s,IAuthenticate a)
IAuthenticate
is an interface within the c
class that can be
implemented to emulate kdb+ server-side authentication, allowing the
establishment of authentication rules similar to that which might be
done through the kdb+ function .z.pw.
Both of these constructor families represent two ‘modes’ in which
the c
object can be instantiated. The first, and ultimately most
widely used, is for making connections to kdb+ processes, which
naturally would be used for queries, subscriptions and any task that
requires the reception of or sending of data to said processes. The
second, which sees Java act as the server, would see utility in
management and aggregation of kdb+ clients, perhaps as a data sink or
an intermediary interface for another technology.
Interactions between Java and kdb+ through these connections are
largely handled by what might be called the ‘k’ family of methods in
the c
class. There are thirteen combined methods and overloads that
fall under this group. They can be divided roughly into four groups:
Synchronous query methods¶
public Object k(String expr)
public Object k(String s,Object x)
public Object k(String s,Object x,Object y)
public void k(String s,Object x,Object y,Object z)
public synchronized Object k(Object x)
These methods are responsible for handling synchronous queries to a kdb+
process. The String parameter will represent either the entire q
expression or the function name; in the case of the latter, the Object
parameters may be used to pass values into that function. In all
instances, the String/Object combinations are merged into a single
object to be passed to the synchronized k(Object)
method.
Asynchronous query methods¶
public void ks(String expr)
public void ks(String s,Object x)
public void ks(String s,Object x,Object y)
public void ks(String s,Object x,Object y,Object z)
public void ks(Object obj)
These methods are responsible for handling asynchronous queries to a
kdb+ process. They operate logically in a similar manner to the
synchronous query method, with the exception that they are, of course,
void methods in that they neither wait for nor return any response from
the process.
Incoming message method¶
public Object k()
This method waits on the class input stream and will deserialize the
next incoming kdb+ message. It is used by the c
synchronous methods in
order to capture and return response objects, and is also used in
server-oriented applications in order to capture incoming messages from
client processes.
Response message methods¶
public void kr(Object obj)
public void ke(String text)
These methods are typically used in server-oriented applications to
serialize and write response messages to the class output stream.
kr(Object)
will act much like any synchronous response, while ke(String)
will format and output an error message.
The use of these constructors and methods will be treated in more practical detail through the use-case examples below.
Models and type mapping¶
The majority of q data types are represented in the API through mapping
to standard Java objects. This is best seen in the method
c.r()
,
which reads bytes from an incoming message and converts those bytes into
representative Java types.
Basic types¶
The method c.r()
deserializes a stream of bytes within a certain range to point
to further methods which return the appropriate typed object. These are
largely self-explanatory, such as booleans and integer primitives
mapping directly to one another, or q UUIDs mapping to java.util.UUID
.
There are some types with caveats, however:
-
The kdb+ float type (9) corresponds to
java.lang.Double
and notjava.lang.Float
, which corresponds to the kdb+ real type (8). -
Java strings map to the kdb+ symbol type (11). In terms of reading or passing in data, this means that passing
"String"
from Java to kdb would result in`String
. Conversely, passing"String"
(type 10 list) from kdb to Java would result in a six-index character array.
Time-based types¶
Of particular interest is how the mapping handles temporal types, of which there are eight:
q type | id | Java type | note |
---|---|---|---|
datetime | 15 | java.util.Date |
This Java class stores times as milliseconds passed since the Unix epoch. Therefore, like the q datetime, it can represent time information accurate to the millisecond. (This despite the default output format of the class). |
date | 14 | java.sql.Date | While this Java class extends the java.util date object it is used specifically for the date type as it restricts usage and output of time data. |
time | 19 | java.sql.Time |
This also extends java.util.Date , restricting usage and output of date data this time. |
timestamp | 12 | java.sql.Timestamp |
This comes yet again from the base date class, extended this time to include nanoseconds storage (which is done separately from the underlying date object, which only has millisecond accuracy). This makes it directly compatible with the q timestamp type. |
month | 13 | inner class c.Month |
|
timespan | 16 | inner class c.Timespan |
|
minute | 17 | inner class c.Minute |
|
second | 18 | inner class c.Second |
When manipulating date, time and datetime data from kdb+ it is important
to note that while java.sql.Date
and Time
extend java.util.Date
, and can
be assigned to a java.util
reference, that many of the methods from the
original date class are overridden in these to throw exceptions if
invoked. For example, in order to create a single date object for two
separate SQL Date and Time objects, a java.util.Date
object should be
instantiated by adding the getTime()
values from both SQL objects:
//Date value = datetime - time
java.sql.Date sqlDate = (java.sql.Date)qconn.k(".z.d");
// Time value - datetime - date
java.sql.Time sqlTime = (java.sql.Time)qconn.k(".z.t");
java.util.Date utilDate= new java.util.Date(sqlDate.getTime()+sqlTime.getTime());
The four time types represented by inner classes are somewhat less
prevalent than those modeled by Date and its subclasses. These classes
exist as comparable models due to a lack of a clear representative
counterpart in the standard Java library, although their modeling is for
the large part fairly simple and the values can be easily implemented or
extracted.
Dictionaries and tables¶
Kdb+ dictionaries (type 99) and tables (type 98) are represented by the internal classes Dict and Flip respectively. The makeup of these models is simple but effective, and useful in determining how best to manipulate them.
The Dict class
consists of two public java.lang.Object
fields (x
for keys, y
for
values) and a basic constructor, which allows any of the represented
data types to be used. However, while from a Java perspective any object
could be passed to the constructor, dictionaries in q are always
structured as two lists. This means that if the object is being created
to pass to a q session directly, the Object fields in a Dict object
should be assigned arrays of a given representative type, as passing in
an atomic object will result in an error.
For example, the first of the following dictionary instantiation is legal with regards to the Java object, but because the pairs being passed in are atomic, it would signal a type error in q. Instead, the second example should be used, and can be seen as mirroring the practice of enlisting single values in q:
new c.Dict("Key","Value"); // not q-compatible
new c.Dict(new String[] {"Key"}, new String[] {"Value"}); // q-compatible
As the logical extension of that, in order to represent a list as a
single key or pair, multi-dimensional arrays should be used:
new c.Dict(new String[] {"Key"}, new String[][] {{"Value1","Value2","Value3"}});
Flip (table) objects
consist of a String array for columns, an Object array for values, a
constructor and a method for returning the Object array for a given
column. The constructor takes a dictionary as its parameter, which is
useful for the conversion of one to the other should the dictionary in
question consist of single symbol keys. Of course, with the fields of
the class being public, the columns and values can be assigned manually.
Keyed tables in q are dictionaries in terms of type, and therefore will
be represented as a Dict object in Java. The method
td(Object)
will create a Flip object from a keyed table Dict, but will remove its
keyed nature in the process.
GUID¶
The globally unique identifier (GUID) type was introduced into kdb+ with version 3.0 for the purpose of storing arbitrary 16-byte values, such as transaction IDs. Storing such values in this form allows for savings in tasks such as memory and storage usage, as well as improved performance in certain operations such as table lookups when compared with standard types such as Strings.
Java has its own unique identifier type: java.util.UUID
(universally
unique identifier). In the API the kdb+ GUID type maps directly to this
object through the extraction and provision of its most and least
significant long values. Otherwise, the only high-level difference in
how this type can be used when compared to other types handled by the
API is that a RuntimeException
will be thrown if an attempt is made to
serialize and pass a UUID object to a kdb+ instance with a version lower
than 3.0.
More information on these identifier types can be found in the KX documentation as well as the core Java documentation.
Null types¶
Definitions for q null type representations in Java are held in the
static Object array NULL
, with index positions representing the q type.
public static Object[] NULL={
null,
new Boolean(false),
new UUID(0,0),
null,
new Byte((byte)0),
new Short(Short.MIN_VALUE),
new Integer(ni),
new Long(nj),
new Float(nf),
new Double(nf),
new Character(' '),
"",
new Timestamp(nj),
new Month(ni)
,new Date(nj),
new java.util.Date(nj),
new Timespan(nj),
new Minute(ni),
new Second(ni),
new Time(nj)
};
Of note are the integer types, as the null values for these are
represented by the minimum possible value of each of the Java
primitives. Shorts, for example, have a minimum value of -372768 in
Java, but a minimum value of -372767 in q. The extra negative value in
Java can therefore be used to signal a null value to the q connection
logic in the c
class.
Float and real nulls are both represented in Java by the
java.lang.Double.NaN
constant. Time values, essentially being longs
under the bonnet, are represented by the same null value as longs in
Java. Month, minute, second and timespan, each with custom model
classes, use the same null value as ints.
The method
c.qn(Object)
can assist with checking and identifying null value representations, as
it will check both the Object
type and value against the NULL
list.
It is worth noting that infinity types are not explicitly mapped in
Java, although kdb+ float and real infinities will correspond with the
infinity constants in java.lang.Double
and java.lang.Float
respectively.
Exceptions¶
KException
is the single custom exception defined and thrown by the API. It is
fairly safe to assume that a thrown KException
denotes a q error signal,
which will be included in the exception message when thrown.
Other common exceptions thrown in the API logic include:
- IOException
-
Denotes issues with connecting to the kdb+ process. It is also thrown by
c.java
itself for such issues as authentication. - RuntimeException
-
Thrown when certain type implementations are attempted on kdb+ versions prior to their introduction (such as the GUIDs prior to kdb+ 3.0)
- UnsupportedEncodingException
-
It is possible, through the method
setEncoding
, to specify character encoding different to the default (ISO-859-1
). This exception will be thrown commonly if the default is changed to a charset format not implemented on the target Java platform.
Practical use-case examples¶
The examples that follow consist of common practical tasks that a Java developer might be expected to carry out when interfacing with kdb+. The inline examples take the form of extracted sections of key logic and output, and are available as example classes from the KxSystems/javakdb repository for use as starting points or templates.
These examples assume, at minimum, a standard installation of 32-bit kdb+ on the local system, and a suitable Java development environment.
Connecting to a kdb+ process¶
Starting a local q server¶
During development, it can be helpful to start a basic q server to which a Java process can connect. This requires the opening of a port, for which there are two basic methods:
Example: Starting q with –p
parameter
$ q -p 10000
q)\p // command to show the port that q is listening on
10000i
Example: Using the \p
system command
$ q
q)\p 10000 // set the listening port to 10000
q)\p
10000i
To close the port, it should be set to its default value of 0 i.e. \p 0
.
Setting up a q session in this manner will allow other processes to open handles to it on the specified port. The remainder of the examples in this paper assume an opened q session listening on port 10000, with no further configuration unless otherwise specified.
Opening a socket connection¶
As discussed in the previous section, the c
class establishes
connections via its constructors.
For connecting to a listening q process, one useful mechanism might be
to create a factory class with a method that returns a connected c
object based on what is passed to it. This way, any number of credential
combinations can be set whilst allowing the creation of multiple
connections, say for reconnection purposes:
Example: QConnectionFactory.java
public QConnectionFactory(String host, int port,
String username, String password, boolean useTLS) {
this.host=host;
this.port=port;
this.username=username;
this.password=password;
this.useTLS=useTLS;
}
//[…]
public c getQConnection() throws KException, IOException {
return new c(host,port,username+":"+password,useTLS);
}
These constructors will always return a c
object connected to the target
session, and failure to do so will result in a thrown exception;
IOException
will denote the port not being open or available, and a
KException
will denote something wrong with the q process itself (such
as 'access
for incorrect or incomplete credentials).
For the remaining examples, connections will be made using a custom
QConnectionFactory
object returned from a static method getDefault()
,
which will instantiate the object with the host localhost
and the port
10000:
Example: QConnectionFactory.java
public static QConnectionFactory getDefault() {
return new QConnectionFactory("localhost", 10000);
}
Connection objects created using this will be given the variable name
qConnection
unless otherwise stated.
Running queries using k methods¶
Queries can be made using the ‘k’ family of methods in the c
class.
For synchronous queries, that might be used to retrieve data (or, more
generally, to halt execution of the java process until a response
is received), the k methods with parameter combinations of strings
and objects might be used.
For asynchronous queries, as might be used in a
feed-handler process to push data to a tickerplant, the ks
methods would
be used.
The methods k()
, kr()
and ke()
would not see explicit use in the
querying of a server q process, but are more significant when the Java
process acts as the server, as will be touched upon below.
The following examples demonstrate some of the means by which these synchronous and asynchronous queries may be called:
Example: SimpleQueryExamples.java
//Object for storing the results of these queries
Object result = null;
//Basic synchronous q expression
result = qConnection.k("{x+y}\[4;3\]");
System.out.println(result.toString());
//parameterized synchronous query
result = qConnection.k("{x+y}",4,3); //Note autoboxing!
System.out.println(result.toString());
//asynchronous assignment of function
qConnection.ks("jFunc:{x-y+z}");
//synchronous calling of that function
result = qConnection.k("jFunc",10,4,3);
System.out.println(result);
//asynchronous error - note no exception can be returned, so be careful!
qConnection.ks("{x+y}\[4;3;2\]");
//Always close resources\!
qConnection.close();
Extracting data from returned objects¶
Note on internal variables and casting¶
The relationship between the kdb+ types and their Java counterparts has been discussed in the previous section. From a practical perspective, it is important to note that almost all objects and fields that might return from a given synchronous query will be of type Object, and will therefore more often than not require casting in order to be manipulated properly. Care must be taken, therefore, to ensure that the types that can be returned from a given query are known and handled appropriately so as to avoid unwanted exceptions.
The exception to this might be the column names of a flip
object (once
cast itself) held in the field flip.x
. This field is already typed as
String[]
, as column names must always be symbols in q.
Kdb+ types that map to primitives (such as int) can be passed in Java to
a k
method as a primitive thanks to
autoboxing,
but will always be returned as the corresponding wrapper object (such as
Integer).
Extracting atoms from a list¶
Lists will always be returned as an array of the given list type, or as
Object[]
if the list is generic. Extraction of atomic values from a
list, therefore, is as simple as casting the return object to the
appropriate array type and accessing the desired index:
Example: ExtractionExamples.java
//Get a list from the q session
Object result = qConnection.k("(1 2 3 4)");
//Cast the returned Object into long[], and retrieve the desired result.
long[] castList = ((long[]) result);
long extractedAtom = castList[0];
System.out.println(extractedAtom);
If the type of list is unknown, the method c.t(Object)
can be used to
derive the q type of the object, and theoretically could be useful in
further casting efforts.
Extracting lists from a nested list¶
Accessing a list from a nested list is similar to accessing a value from
any list. Here there are two casts required: a cast to Object[]
for
the parent list and then again to the appropriate typed array for the
extracted list:
Example: ExtractionExamples.java
// Start by casting the returned Object into Object[]
Object[] resultArray = (Object[]) qConnection.k("((1 2 3 4); (1 2))");
//Iterate through the Object array
for (Object resultElement : resultArray) {
//Retrieve each list and cast to appropriate type
long[] elementArray = (long[]) resultElement;
//Iterate through these arrays to access values.
for(long elementAtom : elementArray) {
System.out.println(elementAtom);
}
}
Working with dictionaries¶
The Dict inner class is used for all returned objects of q type
dictionary (and therefore, by extension, keyed tables). Key values are
stored in the field Dict.x
, and values in Dict.y
, both of which will
generally be castable as an array.
Aside from matching the index positions of x
and y
, there is no
intrinsic key-value pairing between the two, meaning that alteration of
either of the array structures can compromise the key-value
relationship. The following example illustrates operations that might be
performed on a returned dictionary object:
Example: ExtractionExamples.java
//Retrieve Dictionary
c.Dict dict = (c.Dict) qConnection.k("`a`b`c!((1 2 3);\"Second\"; (`x`y`z))");
//Retrieve keys from dictionary
String[] keys = (String[]) dict.x;
System.out.println(Arrays.toString(keys));
//Retrieve values
Object[] values = (Object[]) dict.y;
//These can then be worked with similarly to nested lists
long[] valuesLong = (long[]) values[0];
//[…]
Working with tables¶
The inner class c.Flip
used to represent tables operates in a similar
manner to c.Dict
. The primary difference, as previously mentioned, is
that Flip.x
is already typed as String[]
, while Flip.y
will still
require casting. The following example shows how the data from a
returned Flip
object might be used to print the table to console:
Example: ExtractionExamples.java
// (try to load trade.q first for this (create a table manually if not possible)
qConnection.ks("system \"l trade.q\"");
//Retrieve table
c.Flip flip = (c.Flip) qConnection.k("select from trade where sym = `a");
//Retrieve columns and data
String[] columnNames = flip.x;
Object[] columnData = flip.y;
//Extract row data into typed arrays
java.sql.Timestamp[] time = (java.sql.Timestamp[]) columnData[0];
String[] sym = (String[]) columnData[1];
double[] price = (double[]) columnData[2];
int[] size = (int[]) columnData[3];
int rows = time.length;
//Print the table now - columns first:
for (String columnName : columnNames)
{
System.out.print(columnName + "\t\t\t");
}
System.out.println("\n-----------------------------------------------------");
//Then rows:
for (int i = 0; i < rows; i++)
{
System.out.print(time[i]+"\t"+sym[i]+"\t\t\t"+price[i]+"\t\t\t"+size[i]+"\n");
}
Creating and passing data objects¶
When passing objects to q via the c
class, there is less emphasis on how a given object is created. Rather, such an operation
is subject to the common pitfalls associated with passing values to a q
expression; those of type and rank.
The k family of methods, regardless of its return protocol, will take either the String of a q expression or the String of a q operator or function, complemented by Object parameters. Given the nature of q as an interpreted language, all of these are serialized and sent to the q session with little regard for logical correctness.
It is important, therefore, that any expressions passed to a query
method are syntactically accurate and refer to variables that actually
exist in the target session. It is also important that any passed
objects are mapped to a relevant q type, and function within the context
that they are sent. KException
messages to look out for while
implementing these operations are 'type
and 'rank
, as these will
generally denote basic type and rank issues respectively.
Creating and passing a simple list¶
The following method might be applied to all direct type mappings in the API; for simple lists (lists in which all elements are of the same type), it is enough to pass a Java array of the appropriate type.
The following example invokes the q set
function, which allows for the
passing of a variable name as well as an object with which the variable
might be set:
Example: CreateAndSendExamples.java
//Create typed array
int[] simpleList = {10, 20, 30};
//Pass array to q using set function.
qConnection.k("set", "simpleList", simpleList)
Creating and passing a mixed list¶
Mixed lists should always be passed to kdb+ through an Object array,
Object[]
. This array may then hold any number of mapped types,
including, if appropriate, other typed or Object arrays:
Example: CreateAndSendExamples.java
//Create generic Object array.
Object[] mixedList = {new String[] {"first", "second"}, new double[] {1.0, 2.0}};
//Pass to q in the same way as a simple list.
qConnection.k("set", "mixedList", mixedList);
Creating and passing dictionaries¶
c.Dict
objects are instantiated by setting its x
and y
objects in the
constructor, and these objects should always be arrays. Once created,
the Dict can be passed to kdb+ like any other object:
Example: CreateAndSendExamples.java
//Create keys and values
Object[] keys = {"a", "b", "c"};
int[] values = {100, 200, 300};
//Set in dict constructor
c.Dict dict = new c.Dict(keys, values);
//Set in q session
qConnection.k("set","dict",dict);
Creating and passing tables¶
c.Flip
objects are created slightly differently; it is best to
instantiate these by passing a c.Dict
object into the constructor. This
is because tables are essentially collections of dictionaries in kdb+,
and therefore using this constructor helps ensure that the Flip object
is set up correctly.
It is worth noting that for this method to work correctly, the passed
Dict object must use String keys, as these will map into the Flip
object’s typed String[]
columns:
Example: CreateAndSendExamples.java
//Create rows and columns
int[] values = {1, 2, 3};
Object[] data = new Object[] {values};
String[] columnNames = new String[] {"column"};
//Wrap values in dictionary
c.Dict dict = new c.Dict(columnNames, data);
//Create table using dict
c.Flip table = new c.Flip(dict);
//Send to q using 'insert' method
qConnection.ks("insert", "t1", table);
Creating and passing GUID objects¶
Globally universal identifier objects are represented in Java by
java.util.UUID
objects, and are passed to kdb+ in an identical manner as
other basic types. The Java object has a useful static method for
generating random identifiers, which further streamlines this process
and can see utility in some use cases where only a certain number of
arbitrary identifiers are required:
Example: CreateAndSendExamples.java
//Generate random UUID object
java.util.UUID uuid = java.util.UUID.randomUUID();
System.out.println(uuid.toString());
//Pass object to q using set function
qConnection.k("set","randomGUID",uuidj);
System.out.println(qConnection.k("randomGUID").toString());
Of course, it should be remembered that kdb+ version 3.0 or higher is
required to work with GUIDs, and running the above code connected to an
older version will cause a RuntimeException
to be thrown.
Reconnecting to a q process automatically¶
Requirements will often dictate that while q processes will need to be bounced (such as for End-of-Day processing), that a Java process will need to be able to handle loss and reacquisition of said processes without being restarted itself. A simple example might be a graphical user interface, where the forced shutdown of the entire application due to a dropped connection, or the lack of ability to reconnect, would be very poor design indeed.
Use of patterns such as factories can help with the task of setting up a
reconnection mechanism, as it allows for the simple creation of a
preconfigured object. For c
Objects, given that they connect on
instantiation, means that a connection can be re-established simply by
calling the relevant factory method.
In order to handle longer periods of potential downtime, either loops or recursion should be used. The danger with recursive methodology here is that, given an extended without a timeout limitation, there is a risk of overflowing the method-call stack, as each failed attempt will invoke a new method onto the stack.
For mechanisms that may need to wait indefinitely, it might be
considered safer to use an indefinite while-loop that makes use of catch
blocks, continue and break statements. This averts the danger of
StackOverflowError
occurring and is easily modified to implement a
maximum number of tries:
Example: ReconnectionExample.java
//initiate reconnect loop (possibly within a catch block).
while (true) {
try {
System.err.println("Connection failed - retrying..");
//Wait a bit before trying to reconnect
Thread.sleep(5000);
qConnection = qConnFactory.getQConnection();
System.out.println("Connection re-established! Resuming..");
//Exit loop
break;
} catch (IOException | KException e1) {
//resume loop if it fails
continue;
}
…
}
Kdb+ tickerplant overview¶
A kdb+ tickerplant is a q process specifically designed to handle incoming high-frequency data feeds from publishing process. Its primary responsibility is the management of subscription requests and the fast publication of data to subscribers. The following diagram illustrates a simple dataflow of a potential kdb+ tick system:
Building Real-time Tick Subscribers regarding the above vanilla setup
Of interest in this white paper are the Java publisher and subscriber processes. As the kdb+ tick system is very widely used, both of these kinds of processes are highly likely to come up in development tasks involving kdb+ interfacing.
Test tickerplant and feedhandler setup¶
To facilitate the testing of Java subscriber processes we can implement
example q processes freely available in the KX repository. Simulation of
a tickerplant can be achieved with
tick.q
;
Trade data, using the trade schema defined in sym.q
, can then be
published to this tickerplant using the definition for the file feed.q
given here:
// q feed.q / with a default port of 5010 and default timer of 1000
// q feed.q -port 10000 / with a default timer of 1000
// q feed.q -port 10000 -t 2000
tph:hopen $[0=count .z.x;5010;"J"$first .Q.opt\[.z.x]`port]
if[not system"t";system"t 1000"]
publishTradeToTickerPlant:{
nRows:first 1?1+til 3;
tph(".u.upd";`trade;(nRows#.z.N;nRows?`IBM`FB`GS`JPM;nRows?150.35;nRows?1000));
}
.z.ts:{
publishTradeToTickerPlant[];
}
The tickerplant and feed handlers can then be started by executing the
following commands consecutively:
$ q tick.q sym -t 2000
$ q feed.q
Once the feedhandler is publishing to the tickerplant, processes can
connect to it in order either to publish or subscribe to it.
It should be noted that in this example and below we are using a Java process to subscribe to a tickerplant being fed directly by a simulated feed. While we are doing this here in order to facilitate a simple example setup, in production this is not usually encouraged. Processes such as Java subscribers would generally connect to derivative kdb+ processes such as chained tickerplants (as in the above diagram), for which standard publishing and subscription logic should be the same as that covered here.
Tickerplant subscription¶
Extracting the table schema¶
Typical subscriber processes are required to make an initial subscription request to the tickerplant in order to receive data.
See the publish and subscribe Knowledge Base article for details.
This request involves calling the .u.sub
function with two
parameters. The first parameter is the table name and the second is a
list of symbols for subscription. (Specifying a backtick in any of the
parameters means all tables and/or all symbols).
Example: TickSubscriberExample.java
// Run sub function and store result
Object[] response = (Object[]) qConnection.k(".u.sub[`trade;`]");
If the .u.sub
function is called synchronously, the tickerplant will
return the table schema. If subscribing to one table, the returned
object will be a generic Object array, with the table name in
object[0]
and a c.Flip
representation of the schema in object[1]
:
Example: TickSubscriberExample.java
// first index is table name
System.out.println("table name: " + response[0]);
// second index is flip object
c.Flip table = (c.Flip) response[1];
// Retrieve column names
String[] columnNames = table.x;
for (int i = 0; i < columnNames.length; i++) {
System.out.printf("Column %d is named %s\n", i, columnNames[i]);
}
If more than one table is being subscribed to, the returned object will
be an Object array consisting of the above object arrays; therefore, in
order to retrieve each individual Flip object, this should be iterated
against:
Example: TickSubscriberExample.java
// Run sub function and store result
Object[] response = (Object[]) qConnection.k(".u.sub[`;`]");
// iterate through Object array
for (Object tableObjectElement : response) {
// From here, it is similar to the one-table schema extraction
Object[] tableData = (Object[]) tableObjectElement;
System.out.println("table name: " + tableData[0]);
c.Flip table = (c.Flip) tableData[1];
String[] columnNames = table.x;
for (int i = 0; i < columnNames.length; i++) {
System.out.printf("Column %d is named %s\n", i, columnNames[i]);
}
}
Subscribing to a tickerplant data feed¶
Upon calling .u.sub
and retrieving the schema, the tickerplant process
will start to publish data to the Java process. The data it sends can be
retrieved through the parameter-free k()
method, which will wait for a
response and return an Object (a c.Flip
of the passed data) on
publication:
Example: TickSubscriberExample.java
while (true) {
//wait on k()
Object response = qConnection.k();
if(response != null) {
Object[] data = (Object[]) response;
//Slightly different.. table is in data[2]\!
c.Flip table = (c.Flip) data[2];
//[…]
}
}
With the data in this form, it can be manipulated in a number of
meaningful ways. To iterate through the columns, c.n
can be called on
individual flip.y
columns in order to provide a row count:
Example: TickSubscriberExample.java
String[] columnNames = table.x;
Object[] columnData = table.y;
//Get row count for looping
int rowCount = c.n(columnData[0]);
//Print out the table!
System.out.printf("%s\t\t\t%s\t%s\t%s\n",
columnNames[0], columnNames[1], columnNames[2], columnNames[3]);
System.out.println("--------------------------------------------");
for (int i = 0; i < rowCount; i++) {
//[Printing logic]
}
This mechanism might be then enveloped in an indefinite loop, such as a
while(true)
loop. Each iteration waits on the k()
method returning
published data, which will continue until one of the contributing
processes fails (at which point an exception is caught and handled
appropriately).
Tickerplant publishing¶
Publishing data to a tickerplant is almost always a necessity for a kdb+
feed-handler process. Java, as a common language of choice for
third-party API development (e.g. Reuters, Bloomberg, MarkIT), is a popular language for feedhandler development, within which c.java
is
used to handle the asynchronous invocation of a publishing function.
Publishing rows¶
In general, publishing values to a tickerplant will require an asynchronous query much like the following:
qConnection.ks(".u.upd", "trade", data); //Where data is an Object[]
The parameters for this can be defined as follows:
- The update function name (
.u.upd
) -
This is the function executed on the tickerplant which enables the data insertion. As per the norm with this API, this is passed as a string.
- Table name
-
A String representation of the name of the table that receives the data.
- Data
-
An Object that will form the row(s) to be appended to the table. This parameter is typically passed as an object array, each index representing a table column.
In order to publish a single row to a tickerplant, typed arrays
consisting of single values might be instantiated. These are then
encapsulated in an Object array and passed to the ks
method:
Example: TickPublisherExamples.java
//Create typed arrays for holding data
String[] sym = new String[] {"IBM"};
double[] bid = new double[] {100.25};
double[] ask = new double[] {100.26};
int[] bSize = new int[]{1000};
int[] aSize = new int[]{1000};
//Create Object[] for holding typed arrays
Object[] data = new Object[] {sym, bid, ask, bSize, aSize};
//Call .u.upd asynchronously
qConnection.ks(".u.upd", "quote", data);
Publishing multiple rows is then just a case of increased length of each
of the typed arrays:
Example: TickPublisherExamples.java
String[] sym = new String[] {"IBM", "GE"};
double[] bid = new double[] {100.25, 120.25};
double[] ask = new double[] {100.26, 120.26};
int[] bSize = new int[]{1000, 2000};
int[] aSize = new int[]{1000, 2000};
In order to maximize tickerplant throughput and efficiency, it is
generally recommended to publish multiple rows in one go.
White paper Kdb+tick Profiling for Throughput Optimization.
Care has to be taken here to ensure that all typed arrays maintain
the same length, as failure to do so will likely result in a kdb+
type error. Such errors are especially troublesome when using
asynchronous methods, which will not return KExceptions
in the same
manner as sync methods! It is also worth noting that the order of the
typed arrays within the object array should match that of the table
schema.
Adding a timespan column¶
It is standard tickerplant functionality to append a timespan column to each row received from a feed handler if not included with the data passed, which is used to record when the data was received by the tickerplant. It’s possible for the publisher to create the timespan column to prevent the tickerplant from adding one:
Example: TickPublisherExamples.java
//Timespan can be added here
c.Timespan[] time = new c.Timespan[] {new c.Timespan()};
String[] sym = new String[] {"GS"};
double[] bid = new double[] {100.25};
double[] ask = new double[] {100.26};
int[] bSize = new int[]{1000};
int[] aSize = new int[]{1000};
//Timespan array is then added at beginning of Object array
Object[] data = new Object[] {time, sym, bid, ask, bSize, aSize};
qConnection.ks(".u.upd", "quote", data);
This might be done, for example, to allow the feedhandler to define the
time differently than simply logging the time at which the tickerplant
receives the data.
Connecting from kdb+ to a Java process¶
The examples thus far have emphasized interfacing between Java and kdb+
very much from the perspective of a Java client connecting to a kdb+
server, using the constructors relevant to this purpose. It is very much
possible to reverse these roles using the c(Serversocket)
constructor,
which enables a Java process to listen for incoming kdb+ messages on the
specified port.
While the use cases for this ‘server’ mode of operation are not as common as they might be for ‘client’-mode connections, it is nevertheless available to developers as a means of implementing communication between Java and kdb+ processes. The following examples demonstrate the basic mechanisms by which this can be done.
Handling a single connection¶
To set this up, a c
object is instantiated using the ‘server’ mode constructor.
This will listen to the incoming connection of a single kdb+ process:
Example: IncomingConnectionExample.java
//Wait for incoming connection
System.out.println("Waiting for incoming connection on port 5001..");
c incomingConnection = new c(new ServerSocket(5001));
In a manner similar to tickerplant subscription, the method k()
(without
parameters) can be used to wait on and listen to any connecting q
session. In this example, the object is retrieved in this fashion and
deciphered, either to return an error when passed the
symbol `returnError
or to return a message describing what was sent:
Example: IncomingConnectionExample.java
while(true) {
//k() method will wait until the kdb+ process sends an object.
Object incoming = incomingConnection.k();
try {
// check the incoming object and return something based on what it is
if (incoming instanceof String && ((String)incoming).equals("returnError")) {
incomingConnection.ke("ReturningError!");
} else if(incoming.getClass().isArray()) {
// if list, use Arrays toString method
incomingConnection.kr("The incoming list values are: " + Arrays.toString((Object[])incoming));
} else {
incomingConnection.kr(("The incoming message was: " + incoming.toString()).toCharArray());
}
} catch(IOException | KException e) {
//return error responses too
incomingConnection.ke(e.getMessage());
}
}
Handling multiple connections¶
In the above example, the server c
object is instantiated with a
new ServerSocket being created in its constructor. This is acceptable in
this instance because we cared only about the handling of one
connection.
In general, ServerSocket objects should not be used in this manner, as they are designed to handle more than a single incoming connection. Instead, the ServerSocket should be passed as a reference. With the addition of some simple threading, an application capable of handling messages from multiple q sessions can be created:
Example: IncomingConnectionsExample.java
//Create server socket reference beforehand..
ServerSocket serverSocket = new ServerSocket(5001);
//Set up connection loop
while(true) {
//Create c object with reference to server socket
final c incomingConnection = new c(serverSocket);
//Create thread for handling this connection
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
//Logic in this loop is similar to single connection
//[...]
}
}
//Run thread and restart loop.
}).start();
}
This will allow any number of connections to be established, with factors
such as connection limitation and load balancing left up to how the
process is implemented. As in any case where threading is used, take
care that such a method does not enable race conditions or concurrency
issues; if necessary, steps can be taken to reduce the risk of such
operations, such as synchronized blocks and methods.
Conclusion¶
This document has covered a variety of topics concerning
the mechanics and application of the c.java
interface for kdb+. Of the
workings and examples shown, the most common use case for this interface
will be connecting to a q process, executing queries and functions and
managing any result objects. However, this document has also displayed
the versatile nature of c.java
as a tool, providing a handful of
solutions to a given problem and able to fulfill server as well as
client functions.
The practical examples should also help demonstrate that tasks required as part of a standard kdb+ toolset can be handled easily from the perspective of both Java developers interfacing with kdb+ for the first time, or kdb+ developers who are required to venture into Java development, for example, to help complete development of a feed handler. The benefit of such interfaces is felt keenly through the common role of these developers in helping to reconcile longstanding applications with contemporary technologies, often to the benefit of both.
Author¶
Peter Lyness joined First Derivatives as a software engineer in 2015. During this time he has implemented a number of Java-based technical solutions for clients, including kdb+ interface logic for upstream static and real time data feeds.