Prerequisite: The Basics
Each datatype has a number and a letter associated with it. Check out the table of primitive datatypes here. The most useful command for this section is type, which returns the number corresponding to the input’s datatype.
q)type 5
-7h
The 7 returned indicates a long, which, according to the primitive datatypes table, is the default integer datatype (as of version 3.0). There are various other numeric datatypes; these can be specified using their corresponding letter as follows.
q)type 5i
-6h
q)type 1 2 3h
5h
q)type 5f
-9h
What about the rest of the output from type? The “h” is there simply because the output from type is itself a short. The sign indicates whether the input is an atom or a list: atoms have a negative type, whereas lists have a positive type.
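As a small illustration of the sign convention, the session below compares atoms with lists of the same datatype. It is only a sketch using arbitrary values; enlist is the built-in that wraps a single value into a one-item list.

```q
q)type 5          / long atom: negative type
-7h
q)type 5 6 7      / long list: positive type
7h
q)type enlist 5   / a one-item list is still a list
7h
q)type 5h         / short atom
-5h
```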
Strings and Symbols
Text in kdb+/q is stored as either a string or a symbol. A string is a list of characters, enclosed in a pair of double quotation marks ("). A symbol is an atomic datatype, written with a leading backtick (`).
q)mychar:"j"
q)mystring:"johnsmith"
q)type mystring
10h
q)count mystring
9
q)mysymbol:`johnsmith
q)type mysymbol
-11h
q)count mysymbol
1
q)mysymbollist:`john`smith
q)count mysymbollist
2
Strings and symbols are stored differently internally. The main difference you will see now is that a symbol atom can contain any number of characters, whereas a string is really a list of chars.
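One practical consequence of the list-versus-atom distinction shows up when indexing and comparing. The following sketch uses literal values rather than the variables defined above, so it can be run on its own.

```q
q)"johnsmith"[0]     / a string can be indexed like any other list
"j"
q)type "j"           / a single char is an atom
-10h
q)`john=`john        / symbols compare as single atoms
1b
q)"john"="john"      / = compares strings item by item
1111b
q)"john"~"john"      / match (~) compares the strings as a whole
1b
```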
Tip: use symbols for repetitive lists & strings for mostly unique ones.
Datatypes 12-19 are all temporal: e.g. time, timestamp, date.
q)type 2016.10.22
-14h
q)type 2016.10.12 2014.03.30 1999.11.10
14h
As you can see, the date syntax is yyyy.mm.dd. The items of a temporal list are separated by spaces, just like those of a number list. Have a look at the other temporal datatypes and see if you can create them.
Each of the temporal datatypes is stored as a number internally, making arithmetic very fast and easy.
q)2015.10.28 + 2
2015.10.30
q)12:28:30 + 7
12:28:37
q)12:28:30.000 + 7
12:28:30.007
Notice the behaviour of the different datatypes - 1 unit can be a day, a second, a millisecond…
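Because dates are numbers underneath, they can also be compared, subtracted, and combined with lists. The session below is a minimal sketch with arbitrarily chosen dates.

```q
q)2015.10.28 + 0 1 2            / add a list of day offsets to one date
2015.10.28 2015.10.29 2015.10.30
q)2015.10.30 - 2015.10.28       / difference between two dates, in days
2i
q)2015.10.28 < 2015.10.30       / dates compare like numbers
1b
```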
Casting Between Datatypes
How can you get a string from a symbol? Can you add a date to a datetime? How do you create an integer from a string? All of these operations require you to cast between datatypes. In q, we use the cast operator “$” and the function “string”. Here are some examples:
q)string `johnsmith / symbol to string
"johnsmith"
q)string 28 / long to string
"28"
q)`int$3.14159 / float to integer
3i
q)`$"hello" / string to symbol
`hello
q)`int$"28" / string to integer: returns the ASCII code for each character
50 56i
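The left argument of $ can name the target datatype in more than one way: by symbol, by the datatype's letter, or by its number as a short. The sketch below casts a long to a float in each of the three ways, then shows that casting a float to an integer type rounds rather than truncates.

```q
q)`float$5      / target type given by name
5f
q)"f"$5         / target type given by its letter
5f
q)9h$5          / target type given by its number
5f
q)`long$3.7     / float to long rounds to the nearest integer
4
```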
For each datatype, there is a null value. This means that “empty/unassigned” data can still have a type. Each null is in the primitive datatypes table, but here are some examples.
q)x: 1 2 3 0N 3 0N 7
q)type x
7h
q)type `
-11h
q)type " "
-10h
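To find where the nulls are, q provides the built-in null function. The following sketch applies it to a list like the one above and to a few individual null atoms.

```q
q)null 1 2 3 0N 3 0N 7     / flags each null item in a long list
0010010b
q)type 0N                  / the long null is still a long atom
-7h
q)type 0Ni                 / the int null is an int atom
-6h
q)null `                   / the empty symbol is the symbol null
1b
q)null " "                 / a single space is the char null
1b
```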
1. Create a short, p. Check that its type is correct. Cast p to each of the following datatypes:
- string (list of chars)
2. Cast a string to a symbol and back again.
3. Can you figure out when the kdb+ epoch is? (e.g. Which date corresponds to 0?) What happens to earlier dates?
4. What is happening here? How would we get the desired outcome of 28i?
5. Copy out the following code:
q)x:2 3 5 2 4 7 2 1i
q)x=2
What datatype is returned? Can you see what is happening?