Reference/wj

From Kx Wiki
(Difference between revisions)
Jump to: navigation, search
(created, based on arthur's list msg)
 
(restriction for window joins)
 
(16 intermediate revisions by 6 users not shown)
Line 1: Line 1:
= wj =
+
{{refheader|wj wj1|window join function}}
  
window join is a generalization of asof join.
+
Window join is a generalization of asof join, and is available from kdb+ 2.6. asof join takes a snapshot of the current state, while window join aggregates all values of specified columns within intervals. (Since 3.0, <tt>wj</tt> and <tt>wj1</tt> are both implemented with <tt>ww</tt>.)
asof join takes a snapshot of current state.
+
window join aggregates all values of specified columns within an interval.
+
  
e.g. regnms
+
Note that windows joins with multiple symbols should be only used with a `p#sym like schema (typical RTD-like `g# gives undefined results)
 +
Also note that only integral types are supported for the column to do the windowing on, so no reals, floats or datetimes.
 +
 
 +
== Syntax ==
  
 
<pre>
 
<pre>
/ i have taq 2003.09.10(5.6M trades and 35M quotes) on my laptop
+
  wj[w;c;t;(q;(f0;c0);(f1;c1))]
d:`:/taq/2003.09.10
+
</pre>
  
x:select sym,time,price from d`trade
+
where:
  
q:select sym,time,bid,ask from d`quote
+
*<tt>t</tt> and <tt>q</tt> are the unkeyed tables to be joined. <tt>q</tt> should be sorted <tt>`sym`time</tt> with <tt>`p#</tt> on sym
 +
*<tt>w</tt> is a pair of lists of times/timestamps, begin and end
 +
*<tt>c</tt> are the names of the two common columns, syms and times
 +
*<tt>f0</tt>,<tt>f1</tt> are aggregation functions applied to values in q columns <tt>c0</tt>,<tt>c1</tt> over the intervals
  
f:`sym`time
+
Typically. this might be:
  
/ 264,000 trades outside (bid;ask)
+
<pre>
 +
  wj[w;`sym`time;trade;(quote;(max;`ask);(min;`bid))]
 +
</pre>
  
\t a:select from aj[f;x;q]where not price within(bid;ask)
+
For each record in <tt>t</tt>, the result has a record with additional columns <tt>c0</tt> and <tt>c1</tt>, which are the results of the aggregation functions applied to values over the matching intervals in <tt>w</tt>.
  
/ 4 seconds to process 5.6M trades against 35M quotes
+
A quote is understood to be in existence until the next quote.
  
w:-3000 1000 / look at quotes from 3 seconds before to 1 second after
+
Since v3.0, wj and wj1 are both [] interval, i.e. they consider quotes>=beginning and <=end of the interval.
  
\t b:select from wj[w;f;x;(q;(max;`ask);(min;`bid))]where not price within(bid;ask)
+
For wj, the prevailing quote on entry to the window is considered valid as quotes are a step function.
  
/ 28 seconds to do all the windows.  max ask and min bid within each window
+
wj1 considers quotes on or after entry to the window. If the join is to consider quotes that arrive from the beginning of the interval, please use '''wj1'''.
  
\t c:select from wj[w;f;a;(q;(max;`ask);(min;`bid))]where not price within(bid;ask)
 
  
/ same result in 1.4seconds by using initial aj snapshot
+
Prior to v3.0, wj1 considers only quotes in the window except for the window end (i.e. quotes >= start and < end).
 +
 
 +
'''Interval behaviour by version'''
 +
{| class="wikitable"
 +
|-
 +
|'''Version'''||'''wj1'''||'''wj'''
 +
|-
 +
|3.0||[]||prevailing + []
 +
|-
 +
|2.7/2.8||[)||prevailing + []
 +
|-
 +
|}
 +
 
 +
== Example ==
 +
 
 +
<pre>
 +
q)t:([]sym:3#`ibm;time:10:01:01 10:01:04 10:01:08;price:100 101 105)
 +
q)t
 +
sym time    price
 +
------------------
 +
ibm 10:01:01 100
 +
ibm 10:01:04 101
 +
ibm 10:01:08 105
 +
 
 +
q)q:([]sym:9#`ibm;time:10:01:01+til 9;ask:101 103 103 104 104 107 108 107 108;bid:98 99 102 103 103 104 106 106 107)
 +
q)q
 +
sym time    ask bid
 +
--------------------
 +
ibm 10:01:01 101 98
 +
ibm 10:01:02 103 99
 +
ibm 10:01:03 103 102
 +
ibm 10:01:04 104 103
 +
ibm 10:01:05 104 103
 +
ibm 10:01:06 107 104
 +
ibm 10:01:07 108 106
 +
ibm 10:01:08 107 106
 +
ibm 10:01:09 108 107
 +
 
 +
q)f:`sym`time
 +
q)w:-2 1+\:t.time
 +
 
 +
q)wj[w;f;t;(q;(max;`ask);(min;`bid))]
 +
sym time    price ask bid
 +
--------------------------
 +
ibm 10:01:01 100  103 98
 +
ibm 10:01:04 101  104 99
 +
ibm 10:01:08 105  108 104
 
</pre>
 
</pre>
  
now 37,000 violations instead of 264,000
+
The interval values may be seen as:
 +
 
 +
<pre>
 +
q)wj[w;f;t;(q;(::;`ask);(::;`bid))]
 +
sym time    price ask            bid
 +
--------------------------------------------------
 +
ibm 10:01:01 100  101 103        98 99
 +
ibm 10:01:04 101  103 103 104 104 99 102 103 103
 +
ibm 10:01:08 105  107 108 107 108 104 106 106 107
 +
</pre>
  
 
== See also ==
 
== See also ==

Latest revision as of 10:23, 14 July 2016

Contents

wj wj1 (window join function)

Window join is a generalization of asof join, and is available from kdb+ 2.6. asof join takes a snapshot of the current state, while window join aggregates all values of specified columns within intervals. (Since 3.0, wj and wj1 are both implemented with ww.)

Note that windows joins with multiple symbols should be only used with a `p#sym like schema (typical RTD-like `g# gives undefined results) Also note that only integral types are supported for the column to do the windowing on, so no reals, floats or datetimes.

Syntax

  wj[w;c;t;(q;(f0;c0);(f1;c1))]

where:

Typically. this might be:

  wj[w;`sym`time;trade;(quote;(max;`ask);(min;`bid))]

For each record in t, the result has a record with additional columns c0 and c1, which are the results of the aggregation functions applied to values over the matching intervals in w.

A quote is understood to be in existence until the next quote.

Since v3.0, wj and wj1 are both [] interval, i.e. they consider quotes>=beginning and <=end of the interval.

For wj, the prevailing quote on entry to the window is considered valid as quotes are a step function.

wj1 considers quotes on or after entry to the window. If the join is to consider quotes that arrive from the beginning of the interval, please use wj1.


Prior to v3.0, wj1 considers only quotes in the window except for the window end (i.e. quotes >= start and < end).

Interval behaviour by version

Version wj1 wj
3.0 [] prevailing + []
2.7/2.8 [) prevailing + []

Example

q)t:([]sym:3#`ibm;time:10:01:01 10:01:04 10:01:08;price:100 101 105)
q)t
sym time     price
------------------
ibm 10:01:01 100
ibm 10:01:04 101
ibm 10:01:08 105

q)q:([]sym:9#`ibm;time:10:01:01+til 9;ask:101 103 103 104 104 107 108 107 108;bid:98 99 102 103 103 104 106 106 107)
q)q
sym time     ask bid
--------------------
ibm 10:01:01 101 98
ibm 10:01:02 103 99
ibm 10:01:03 103 102
ibm 10:01:04 104 103
ibm 10:01:05 104 103
ibm 10:01:06 107 104
ibm 10:01:07 108 106
ibm 10:01:08 107 106
ibm 10:01:09 108 107

q)f:`sym`time
q)w:-2 1+\:t.time

q)wj[w;f;t;(q;(max;`ask);(min;`bid))]
sym time     price ask bid
--------------------------
ibm 10:01:01 100   103 98
ibm 10:01:04 101   104 99
ibm 10:01:08 105   108 104

The interval values may be seen as:

q)wj[w;f;t;(q;(::;`ask);(::;`bid))]
sym time     price ask             bid
--------------------------------------------------
ibm 10:01:01 100   101 103         98 99
ibm 10:01:04 101   103 103 104 104 99 102 103 103
ibm 10:01:08 105   107 108 107 108 104 106 106 107

See also


For a complete list of functions, see the kdb+ Function Reference.

Personal tools
Namespaces
Variants
Actions
Navigation
Print/export
Toolbox