Sandbox user guide

This chapter outlines how to interact with a Sandbox from a user perspective. For details on administering and managing the Sandbox server and clients, refer to the Sandbox Admin Guide. For details of User Account Definitions and Required Privileges, refer to the relevant section from the Sandbox Install Guide.

Key functionality

Sandbox environments provide an open, flexible platform to support research activities. Users can:

  • Write free-form q (kdb+ language) statements as well as use the in-built server-side function layer.
  • Create functions and code in memory within your own process space.
  • Create datasets and tables in memory.
  • Write datasets and functions/code to disk.
  • Load datasets and code from disk.
  • Access your Sandbox file system via a command shell.
  • Start, stop and restart your Sandbox.

Users cannot:

  • Spawn multiple Sandboxes
  • Create processes
  • Directly access databases or system processes

kdb+ process structure

To ensure separation of environments between different users, each Sandbox user is assigned their own kdb+ process. Each Sandbox runs under the Sandbox user's own UNIX user account (see following section for details). This arrangement enables each user to read and write to their own disk space whilst ensuring confidentiality from other users.

Note

If you wish to share data and code from your UNIX drive with other users, contact your UNIX administrator to change the configuration to allow for this. To share in-memory data or code, or to access another user's Sandbox, contact your Sandbox server administrator.

Within your kdb+ process, you can create your own functions and datasets, store them in memory and write them to disk for reloading at a later date. If you have FTP / SSH access to your UNIX account, you can also upload other datasets to your file system and then load them into your kdb+ process using standard q load constructs.
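As an illustration of a standard q load construct, the sketch below writes a small CSV file and reads it back into a table. The file name myTrades.csv and its columns are hypothetical; in practice the file would be one you uploaded to your UNIX account.

```q
/ Write a small CSV file to the current directory (hypothetical sample data)
`:myTrades.csv 0: ("sym,price,size";"BARC.L,171.5,100";"VOD.L,228.1,250")
/ "SFJ" parses the columns as symbol, float and long
/ enlist"," means comma-delimited with a header row
myTrades:("SFJ";enlist",") 0: `:myTrades.csv
myTrades
```

The type string must contain one character per column in the file, in column order.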

In addition to your own data and functions, your Sandbox process also fully supports the built-in Refinery data access functions (e.g. getTicks, getStats etc.). These functions must be used to access data from KX Refinery.

UNIX environment

User accounts

Each Sandbox user requires their own UNIX user account. The UNIX administrator will configure each account and manage system access (SSH, FTP, etc.), file system permissions and disk space quotas. Each user's Sandbox process runs under the user's UNIX account and inherits read/write permissions, etc. from this account.

File systems

The UNIX administrator will use their standard UNIX administration tools and configuration (e.g. /etc/fstab) to manage disk space quotas on the server. Management of space, data and code within each user's UNIX account and file system is the user's responsibility. The user can structure this space however they see fit.

Prerequisites

This guide assumes the following:

  • There is a user UNIX account
  • The Sandbox client has been installed under this user

For the purposes of the document, the examples will assume:

  • User = newSandboxUser1
  • Host = 123.456.78.9
  • Sandbox Client web server port = 8080
  • Sandbox port = 11211

Values shown in bold italics should be replaced with the details relevant to your own environment.

First login and connecting to the Sandbox

  • Navigate to 123.456.78.9:8080 (Substitute the real values here)
  • Click on Analyst
  • Enter Username and Password. The initial password is password

  • On first use, you will be prompted to change your password and log in again.
  • Once logged in, the Process options are presented
  • Select Attach
  • You will then see your Sandbox instance. Click Connect. If an error occurs at this point, refer to the Troubleshooting section.

  • The workspace browser then appears, which will be empty.

  • Create a new workspace (you can name it whatever you choose), and then click Open

Start/stop the Sandbox

  • Navigate to 123.456.78.9:8080 (Substitute the real values here)
  • Click Control for kx
  • Log in with Username and Password
  • The Sandbox process is then displayed

  • You can stop the Sandbox by right-clicking on the process and clicking Shut Down Q Process (within Stop)

  • Only the Sandbox administrator account can use Kill -9 Process via the UI.
    This may be needed if you perform an operation that locks the process indefinitely (e.g. malformed while/do loop, incorrect IPC, blocking handles, etc.).
    You can issue kill -9 for your own Sandbox through UNIX. However, it is recommended that less aggressive signals are tried first (e.g. -15).

  • To start the Sandbox, click Run

Using the Sandbox

Using Analyst

You can find all information on the functionality of Analyst by clicking on the Help dropdown. Analyst contains a huge amount of functionality not described within this document such as, but not limited to: visualizations, code version management, data transformation GUI and an in-built debugger.

Using the API and q environment

Workflow examples

The examples provided here are intended as a simple introduction and highlight some of the features of the Sandbox. Note, however, that you have the full power of q and kdb+ at your disposal. For additional online information on kdb+ programming, visit http://code.kx.com/wiki/Main_Page or use the q tooltip within Analyst, available through the Help dropdown.

Executing a function call

Running a Refinery function from the Sandbox is simple. Copy and paste one of the examples below into the Scratchpad and press Ctrl+D. The function is executed and the data is returned to the console display. Refer to the Server-side API Guide for details on each function.

Example calls:

/Return some trade data for BARC.L for a three hour time period on one day.
getTicks[`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`temporality`filterRule!(`BARC.L;`trade;2016.05.24;2016.05.24;06:00:00.000000;09:00:00.000000;`$"Europe/London";`equity;`slice;`)]
/Return VWAP, total volume, high and low for BARC.L in hourly buckets
getStats[`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`granularity`granularityUnit`temporality`filterRule`analytics!(`BARC.L;`trade;2016.05.24;2016.05.24;09:00:00.000;17:00:00.000;`$"Europe/London";`equity;1;`hour;`slice;`default;`VWAP`sumVolume`maxPrice`minPrice)]
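Because the parameters are passed as a single dictionary, it can be convenient to build the dictionary once and derive variations from it. The sketch below reuses the keys and values from the getTicks example above; the variable names params and twoSyms are hypothetical, and the snippet only builds the dictionaries without calling any Refinery function.

```q
/ Build the parameter dictionary once (keys and values as in the getTicks example)
params:`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`temporality`filterRule!(`BARC.L;`trade;2016.05.24;2016.05.24;06:00:00.000000;09:00:00.000000;`$"Europe/London";`equity;`slice;`)
/ Copy it and change a single entry; params itself is left unchanged
twoSyms:params
twoSyms[`symList]:`BARC.L`PEUP.PA
/ Inside a Sandbox, either dictionary could then be passed to the function, e.g. getTicks twoSyms
twoSyms
```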

Writing a simple q statement

The following example shows how to express a user-defined VWAP calculation combined with a call to getTicks to retrieve some tick data to operate on:

select size wavg price by sym from getTicks[`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`temporality`filterRule!(`BARC.L`PEUP.PA;`trade;2016.05.24;2016.05.24;06:00:00.000000;09:00:00.000000;`$"Europe/London";`equity;`slice;`)]
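To see what the wavg expression computes without calling Refinery, the same query can be run against a small in-memory table. This is a minimal sketch; the trades table is a hypothetical stand-in for the result of getTicks.

```q
/ Hypothetical stand-in for the table returned by getTicks
trades:([] sym:`BARC.L`BARC.L`PEUP.PA`PEUP.PA; price:170 172 15 16f; size:100 300 200 200)
/ size wavg price is the size-weighted average price: (sum size*price)%sum size
select vwap:size wavg price by sym from trades
```

For this sample data the VWAP is 171.5 for BARC.L and 15.5 for PEUP.PA.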

Storing data in memory

Following on from the example above, the result of the calculation can be assigned to an in-memory variable called myResult as follows:

myResult: select size wavg price by sym from getTicks[`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`temporality`filterRule!(`BARC.L`PEUP.PA;`trade;2016.05.24;2016.05.24;06:00:00.000000;09:00:00.000000;`$"Europe/London";`equity;`slice;`)]

Saving & loading data from disk

To save the contents of variable myResult to disk, run the following:

save `:myResult

Data can be loaded from disk as follows:

load `:myResult

Further details are available here: http://code.kx.com/wiki/Reference/save. Remember that kdb+ allows data persistence in many formats and there are operators other than save that can be used to enable persistence of splayed tables, etc.
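As one example of an alternative to save, the sketch below persists a table in splayed form (one file per column) using set and reads it back with get. The directory name db and the sample table are hypothetical.

```q
/ Hypothetical sample table
trades:([] sym:`BARC.L`VOD.L; price:171.5 228.1; size:100 250)
/ .Q.en enumerates the symbol column against a sym file stored under db
/ The trailing slash on the target path tells set to write a splayed table
`:db/trades/ set .Q.en[`:db] trades
/ Read the splayed table back
get `:db/trades/
```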

Note

By default, the scratchpad will work on the directory as shown below:

You have access to read and write directly to disk. If the same workspace is used indefinitely, then data can be stored here. However, it is recommended that data is stored in a central, non-variable location. The directory <install_Location>/sandbox will be created, and is intended for you to use as you wish.
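The working directory can also be checked from q itself, and data can be written to an explicit path rather than to the working directory. In the sketch below the path /tmp/sandboxExample and the sample table are hypothetical; substitute a directory that you own.

```q
/ Show the directory the process is currently working in
system "cd"
/ Hypothetical fixed location; replace with a directory you own
dataDir:`:/tmp/sandboxExample
/ Hypothetical sample result
myResult:([] sym:`BARC.L`PEUP.PA; vwap:171.5 15.5)
/ ` sv joins the directory and file name into a single file handle
(` sv dataDir,`myResult) set myResult
/ Read the data back from the same location
get ` sv dataDir,`myResult
```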

Creating a function

By extending the previous VWAP example, you can create your own q functions. This time, rather than performing the calculation inline, the calculation is assigned to a function and the result of getTicks is passed to this function:

myFunction:{[data] : select VWAP:size wavg price by sym from data}

myFunction[getTicks[`symList`dataType`startDate`endDate`startTime`endTime`timeZone`assetClass`temporality`filterRule!(`BARC.L`PEUP.PA;`trade;2016.05.24;2016.05.24;06:00:00.000000;09:00:00.000000;`$"Europe/London";`equity;`slice;`)]]
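Functions can take more than one argument. As a sketch of how the example might be extended, the hypothetical function below takes a bucket size in minutes as a second argument and returns the VWAP per symbol per time bucket. It assumes the input table has a time column; the trades table is a stand-in for the result of getTicks.

```q
/ Hypothetical stand-in for the table returned by getTicks
trades:([] sym:3#`BARC.L; time:09:01:00.000 09:03:00.000 09:07:00.000; price:170 172 171f; size:100 300 50)
/ VWAP per symbol in buckets of n minutes (assumes a time column is present)
myBucketedVwap:{[data;n] select VWAP:size wavg price by sym, bucket:n xbar time.minute from data}
myBucketedVwap[trades;5]
```

With this sample data, the 09:00 bucket has a VWAP of 171.5 and the 09:05 bucket has a VWAP of 171.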

Saving & loading functions

Functions in q can be written and loaded in the same way as data:

/Write the function to disk as a binary file
save`:myFunction
`:myFunction
/Show the function definition
myFunction
{[data] : select VWAP:size wavg price by sym from data}
/delete the function from the kdb+ process (it stays on disk)
delete myFunction from `.
`.
/Attempt to show the function definition to prove it's been deleted
myFunction
'myFunction
/Load the function again from disk
load `myFunction
`myFunction
/Show the function now exists in memory again
myFunction
{[data] : select VWAP:size wavg price by sym from data}

In addition to persisting individual functions as shown above, you can create q scripts that contain many function definitions and other executable code in a single .q file, and load them into your process. You can store these .q files locally on your PC and load them into the Sandbox for instantiation on the server, or store them server side within your UNIX account and load them there.
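A minimal sketch of the server-side case is shown below. It creates a small script file from q purely for illustration and then loads it; the file name myLib.q and the functions it defines are hypothetical. Normally the script would be written in an editor and transferred to your account.

```q
/ Create a hypothetical script containing two function definitions
`:myLib.q 0: ("myVwap:{[data] select VWAP:size wavg price by sym from data}";"myCount:{[data] count data}")
/ Load the script; equivalent to typing \l myLib.q at a q prompt
system "l myLib.q"
/ The functions defined in the script are now available in memory
myCount ([] sym:`BARC.L`VOD.L; price:171.5 228.1; size:100 250)
```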

You can transfer files to and from the server using standard scp / ftp commands if this access has been enabled by the UNIX administrator.

Sharing code and data with other users (in memory)

You can share data and functions with other users in the organisation using connection handles between Sandbox processes. To enable users to create a connection handle to another user's Sandbox, the Sandbox administrator must first set up the user account permissions on the Sandbox being accessed to allow access to that process for the user's Refinery account.

Note

Once another user has been granted access to your Sandbox, they can access anything and everything within that Sandbox in memory or on disk.

The following example refers to the user's Sandbox as Sandbox A and the shared Sandbox as Sandbox B. For illustration purposes, assume Sandbox B is running on the same server on port 9876:

/Create a connection – h will now be the handle to box B
h:hopen`:localhost:9876:myusername:mypassword
/Push data from Sandbox A to B – set targetData on box B to be what myResult is on box A
h(set; `targetData; myResult)
/Push code from Sandbox A to B
h(set; `targetFunction; myFunction)
/Pull data from Sandbox B to A (assumes there's a variable called "foo" on Sandbox B)
myData: h"foo"
/Pull a function from Sandbox B to A (assumes there's a function called "bar" on Sandbox B)
myNewFunction: h "bar"
/Invoke a function on Sandbox B passing it data from Sandbox A. The result is returned to Sandbox A.
h(`targetFunction; myResult)
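Handles that are left open, or remote calls that fail part-way through, can cause problems in a long-running session. The sketch below shows one possible pattern and is not part of the Sandbox API: a hypothetical helper safeCall opens a handle with a connection timeout, traps any error from the request and always closes the handle. The address and credentials are the illustrative values used above.

```q
/ Hypothetical helper: returns (1b;result) on success or (0b;error text) on failure
safeCall:{[addr;request]
  h:@[hopen;(addr;2000);0Ni];
  if[null h; :(0b;"could not connect")];
  res:.[{(1b;x y)};(h;request);{(0b;x)}];
  hclose h;
  res}
/ hopen is given (address;timeout in milliseconds); 0Ni is returned if it fails
/ The request is run under error trap so that hclose is always reached
safeCall[`:localhost:9876:myusername:mypassword;"foo"]
```

Checking the first element of the result tells you whether the second element is data or an error message.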

Sharing code and data with other users (on disk)

If your scripts or data are persisted to disk, other users can only access them if their UNIX account has the correct privileges. The owning user will either need to change the permissions on their data or ask the UNIX administrator to do so.

If required, a group of users could share a location that all have permissions for.

Troubleshooting

Issue 01 - unable to communicate with Sandbox on first connection

Symptom:

Cause:

The UNIX administrator has not added a route to the user's custom host in the /etc/hosts file.

Using the example values in this document, the line would be:

123.456.78.9 <host>_sbx_newSandboxUser1_11211.<host_extension>

This is detailed in the Sandbox Install Guide.