Querying Databases
This section describes how to retrieve data from your kdb Insights Enterprise deployment for analysis using the Query Builder section of the Query window in the UI.
Query builder
The Query builder is the top section of the Query window and is used to query your database.
To create a query:
-
Open the Query window by clicking + beside Queries in the left-hand menu, or using one of the other methods described here.
-
Create a query using one of the following methods:
-
Once your query is created, click Run Query.
-
View and analyze your results in the results section at the end of the screen.
-
Each query is assigned a system defined name, displayed in title bar of the query window. To save the query:
- Click Save to save the query with the system defined name. A notification is displayed indicating that the query has been saved.
To rename and save the query:
- Click the pencil icon beside the query name in the title bar and enter a new name.
- Click OK.
Saved queries are listed on the left-hand menu under Queries.
-
Click Clear to clear your query and start a new one.
Basic query
The Basic tab allows you to retrieve data from the database without having to write code. You can use it to return data within specific date/time ranges and apply filters to that data to constrain the data you receive to only what you need. The Basic tab is the first tab displayed when you open the Query window. This screen contains:
- Mandatory fields which are identified with asterix
- Additional filter options
Mandatory Fields
The following table describes the mandatory fields of the Query form.
Field name | Description |
---|---|
Table Name | The name of the table that you're querying. Select a table from the dropdown list of available tables. |
Start Date | The start date which defines the lower bound of your returned data. Use the date picker to define the date and time at which your query begins. |
End Date | The end date which defines the upper bound of your returned data. Use the date picker to define the date and time at which your query ends. |
Output Variable | This is the name you define for the variable to which this data is assigned. You can use this variable to complete further analysis in the Scratchpad using Python or q. |
Query Filtering
There are 2 levels of filtering that you can add to your query:
Row Limit
To conserve memory usage for very large datasets you can set limits on the number of rows returned by your query. By default this is set to the last 100,000 rows. You can:
- change the number of rows returned by entering a new value
- have the first rows in your database returned by clicking First
- remove the row limit filter by clicking the icon beside the filter
Additional query filters
In addition to the row limit filter, you can add the following filters.
Click Add Filter to add a filter, and Select Function to display the following set of filter types:
Combining Filters
You can apply multiple filters, as well as combine filters of different types except for select columns and define aggregations, which cannot be combined.
Filter
The Filter option is used for applying custom filtering to the query.
To apply a filter you must specify the following:
Field name | Description |
---|---|
Select Column | The column to be selected. |
Select Function | The function to be used to apply the filtering logic which is one of:in like not equal within < <= = >= > |
FX Parameter | An input provided by you which defines the item against which the column is compared. |
Filter By Label
The Filter By label option allows you to query specific tables.
To apply a Filter By label you must specify the following:
Field name | Description |
---|---|
Key | Select the key value of the label that was assigned to the database. |
Value | Select the value(s) of the label(s) to query from the tables that correspond to the label. |
If no label is selected then all corresponding tables are queried.
Select Columns
The Select Columns option provides the ability to filter the query results to display selected columns only.
To filter selected columns click on the Select Columns field and choose the column(s) from the list displayed.
- This list represents the columns in the table selected under Table Name in the mandatory fields.
- Only one Select Columns filter is allowed.
- The Define Aggregations filter cannot be combined with a Select Columns filter.
Sort
The Sort option allows you to sort the results of a returned query based on the column chosen in the Select Column field.
Define Aggregation
The Define Aggregation option allows you to apply aggregations to specified columns of the table returning the result as an additional column.
To apply an aggregation you must specify the following:
Field name | Description |
---|---|
Set Return Column | The column the results of the aggregation are returned to. |
Select Aggregation | The aggregations available to you are as follows: count last sum prd (product) min max all any var (variance) avg (average) dev (deviation) svar (sample variance) sdev (standard deviation) |
Select Column | The column the aggregation is applied to. |
The Select Columns filter cannot be combined with a Define Aggregation filter.
Group Aggregation By
The Group Aggregation By option allows you to group the results of aggregations based on the Selected Column.
Query using SQL
The SQL query option, in the Query builder, allows you to query the tables available within your deployment using SQL syntax. These queries can run in a distributed manner across all available data associated with a table.
- Enter your query in the SQL window as shown below:
-
Specify the output variable to which the return of the SQL statement is defined. This allows for further inspection of the data in Python and q.
-
Click Run Query to run the query.
- Review the results.
SQL query API
The ANSI SQL implementation used here is based on the kdb Insights SQL API.
This provides a subset of an ANSI SQL compliant API. The limitations of this are outlined here.
Query using Q
The Q query option, in the Query builder, allows you to query your data using q. The Q query functionality requires you to target a particular tier of your database for query to query data within the time window covered by the RDB, IDB or HDB respectively.
- Enter your query in the Q window as shown below:
-
Specify the output variable to which the return of the query is applied. This allows for further inspection of the data in Python and q.
-
Click Run Query to run the query.
- Review the results.
Query Results
The results of your query are displayed in the results section at the bottom of the Query window. See here for full details on the results and how they can be displayed.
Next Steps
- Create ad hoc queries using the scratchpad
- Perform further analysis and development in the scratchpad using q
- Perform further analysis and development in the scratchpad using python