GCP BigQuery API reference¶
Google BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data.
The KX BigQuery API allows users to easily interact with the REST API that Google exposes for BigQuery, detailed here.
This API aims to provide some key benefits to the use:
- An easy-to-use kdb+ interface without needing to handcraft REST API calls
- Automatic conversion of BigQuery TableSchemas in kdb+ tables
- Dynamic support for all resources and methods exposed in the BigQuery API
.bq BigQuery API
Generic run run any supported BigQuery method
Discovery disc.describeMethodsForResource describe available methods for a resource disc.getMethodsForResource details of available methods for a resource disc.getParametersForMethod parameters for a method disc.listParametersForMethod required parameters for a method disc.listResources available resources
Query query run query and return result table queryRawResp run query and return raw response
Projects projects.list projects in BigQuery
Datasets datasets.get dataset details from BigQuery datasets.insert create a dataset in a BigQuery project datasets.list list datasets in a BigQuery project
Tables tables.get table metadata from BigQuery tables.insert create a BigQuery table tables.list tables available in BigQuery tabledata.insertAll insert data into a BigQuery table
Helpers fieldSchemaToKdb Apply a BigQuery Field Schema to a dataset to convert it into a kdb object genBQFieldSchema BigQuery TableFieldSchema from a table cell genBQSchema BigQuery TableSchema from a table i.applyDefaultArgs append default arguments to an dictionary i.datasets.createInsertBody JSON formatted body for datasets.insert API i.getArgsFromURL parse parameters from patterns in a URL i.parseDataFromQuery parse kdb+ data from a query i.queries.createBody JSON formatted body for queries API i.replaceArgsInURL replace parameter patterns wrapped in braces i.tabledata.createInsertAllBody JSON body for tabledata.insertAll API i.tables.createInsertBody JSON body for tables.insert API
Configuration¶
.bq.cfg.baseURL base URL for the API
.bq.cfg.default.datasetId default datasetId to use in the simple API
.bq.cfg.default.projectId default projectId to use in the simple API
.bq.cfg.default.tableId default tableId to use in the simple API
.bq.cfg.discoveryDocURL URL for discovery document to use for .bq.run
.bq.cfg.kdbModeMap maps kdb+ datatypes to BigQuery modes
.bq.cfg.kdbTypeMap maps kdb+ datatypes to BigQuery datatypes
.bq.cfg.useDefaults use default values for parameters not passed
.bq.cfg.kdbModeMap
-
Negative kdb type means the BigQuery mode is
NULLABLE
by default. -
Positive kdb type means the data is an array and corresponds to the BigQuery mode
REPEATED
.
.bq.datasets.get
¶
Dataset details from BigQuery
.bq.datasets.get args
Where args
is a dictionary of required arguments, returns the response from BigQuery as a dictionary.
Required arguments for the datasets.get
method:
q).bq.disc.listParametersForMethod `datasets.get
`projectId`datasetId
List dataset details for the dataset kx_bma_bigquery_ds
in the project cloudpak
:
q).bq.datasets.get `projectId`datasetId!("cloudpak"; "kx_bma_bigquery_ds")
kind | "bigquery#dataset"
etag | "ijYXKX/zJYLk6Anuf++t7Q=="
id | "cloudpak:kx_bma_bigquery_ds"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
datasetReference| `datasetId`projectId!("kx_bma_bigquery_ds";"cloudpak")
access | (`role`specialGroup!("WRITER";"projectWriters");`role`speci..
creationTime | "1610712593842"
lastModifiedTime| "1610712593842"
location | "US"
.bq.datasets.insert
¶
Create a dataset in a BigQuery project
.bq.datasets.insert[args;datasetId]
Where
args
(dict)is arguments required for thedatasets.insert
methoddatasetId
(string) is the name of the new dataset
returns the response from BigQuery as a dictionary.
Required arguments for the datasets.insert
method:
q).bq.disc.listParametersForMethod `datasets.insert
`projectId
Create a dataset called weather
:
q).bq.datasets.insert[enlist[`projectId]!enlist "cloudpak"; "weather"]
kind | "bigquery#dataset"
etag | "IzXO40nvJoqo2YmhMJ0BkA=="
id | "cloudpak:weather"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
datasetReference| `datasetId`projectId!("weather";"cloudpak")
access | (`role`specialGroup!("WRITER";"projectWriters");`role`speci..
creationTime | "1611230993939"
lastModifiedTime| "1611230993939"
location | "US"
.bq.datasets.list
¶
List datasets in a BigQuery project
.bq.datasets.list args
Where args
is a dictionary of arguments required for the datasets.list
, returns the response from BigQuery as a dictionary.
Required arguments for the datasets.list
method:
q).bq.disc.listParametersForMethod `datasets.list
`projectId
List the datasets in the project cloudpak
.
q).bq.datasets.list enlist[`projectId]!enlist "cloudpak"
kind | "bigquery#datasetList"
etag | "QBTuIlnauLLL29CnAng5dw=="
datasets| +`kind`id`datasetReference`location!(("bigquery#dataset";"bigquery#..
.bq.disc.describeMethodsForResource
¶
Describe available methods for a resource
.bq.disc.describeMethodsForResource resource
Where resource
(symbol) is a resource, returns a table of its methods and their descriptions.
q).bq.disc.describeMethodsForResource`datasets
method httpMethod parameters request response description ..
-------------------------------------------------------------------------------------------------..
update PUT `projectId`datasetId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Updates infor..
list GET ,`projectId :: (,`$ref)!,"DatasetList" "Lists all dat..
patch PATCH `projectId`datasetId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Updates infor..
delete DELETE `projectId`datasetId :: :: "Deletes the d..
get GET `projectId`datasetId :: (,`$ref)!,"Dataset" "Returns the d..
insert POST ,`projectId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Creates a new..
.bq.disc.getMethodsForResource
¶
Details of the available methods for a resource
.bq.disc.getMethodsForResource resource
Where resource
(symbol) is a resource, returns a dictionary of its methods and information about them.
q).bq.disc.getMethodsForResource[`tables]
getIamPolicy | `id`parameters`parameterOrder`request`description`path`sc..
list | `id`response`path`parameterOrder`httpMethod`scopes`parame..
testIamPermissions| `httpMethod`scopes`parameterOrder`flatPath`description`pa..
insert | `request`httpMethod`path`id`response`description`scopes`p..
get | `parameters`path`parameterOrder`httpMethod`id`response`sc..
patch | `scopes`parameterOrder`description`httpMethod`path`respon..
setIamPolicy | `response`parameterOrder`path`id`request`parameters`httpM..
update | `scopes`id`description`request`parameters`httpMethod`resp..
delete | `id`parameterOrder`httpMethod`description`path`scopes`par..
.bq.disc.getParametersForMethod
¶
Parameters for a method
.bq.disc.getParametersForMethod method
Where method
(symbol) is the name of a method, in the format resource.method
, returns a table of its parameters.
q).bq.disc.getParametersForMethod`tables.insert
parameter type location description required format
-------------------------------------------------------------------------
datasetId "string" "path" "Dataset ID of the new table" 1 ::
projectId "string" "path" "Project ID of the new table" 1 ::
.bq.disc.listParametersForMethod
¶
Required parameters for a method
.bq.disc.listParametersForMethod method
Where method
(symbol) is the name of a method, in the format resource.method
, returns its required parameters as a symbol vector.
q).bq.disc.listParametersForMethod `tables.insert
`projectId`datasetId
.bq.disc.listResources
¶
Available resources
.bq.disc.listResources[]
Returns the available sources as a symbol vector.
q).bq.disc.listResources[]
`rowAccessPolicies`models`tabledata`datasets`tables`projects`routines`jobs
.bq.fieldSchemaToKdb
¶
Apply a BigQuery Field Schema to a dataset to convert it into a kdb object
.bq.fieldSchemaToKdb[bqFieldSchema;rowObject]
Where
bqFieldSchema
is a BigQuery TableFieldSchema object as a dictionaryrowObject
is a dictionary: either- simple value mapping
- nested field mapping
returns the TableFieldSchema as a dictionary.
q)schema:`name`type`mode!("dob";"DATE";"NULLABLE")
q)ro: enlist[`v]!enlist "1980-10-16" // row object
q).bq.fieldSchemaToKdb[schema;ro]
dob| 1980.10.16
.bq.genBQFieldSchema
¶
BigQuery TableFieldSchema from a table cell
.bq.genBQFieldSchema fieldDict
Where fieldDict
is singleton dictionary of the column name and vaue of a table cell, returns a BigQuery TableFieldSchema as a dictionary.
q)tab:([] a:1 2;b:3 4)
q)tab
a b
---
1 3
2 4
q)enlist[`a]#first tab
a| 1
q).bq.genBQFieldSchema enlist[`a]#first tab
name| ,"a"
type| "INT64"
mode| "NULLABLE"
.bq.genBQSchema
¶
BigQuery TableSchema from a kdb+ table
.bq.genBQSchema table
Where table
is a table, returns its TableSchema object as a dictionary.
This is done by inspecting the first row of the table.
q)tab:([] a:1 2;b:3 4)
q)tab
a b
---
1 3
2 4
q).bq.genBQSchema tab
fields| +`name`type`mode!((,"a";,"b");("INT64";"INT64");("NULLABLE";"NULLABLE..
.bq.i.applyDefaultArgs
¶
Append default arguments to an argument dictionary
.bq.i.applyDefaultArgs args
Where args
is a dictionary of non-default parameters for a HTTP request,
returns it with defaulted parameters inserted.
.bq.i.datasets.createInsertBody
¶
JSON formatted body for datasets.insert API
.bq.i.datasets.createInsertBody[projectId;datasetId]
Where
projectId
(string)is the name of a projectdatasetId
(string) is the name of a dataset to create
returns the body of a datasets.insert
request as a JSON string.
.bq.i.getArgsFromURL
¶
Parse parameters from parameter patterns in a URL
.bq.i.getArgsFromURL url
Where url
is a string with parameters wrapped in braces, returns parameter names as a symbol vector.
.bq.i.parseDataFromQuery
¶
Parse kdb+ data from a query
.bq.i.parseDataFromQuery respObj
Where respObj
(dictionary) is a response from BigQuery, returns a table of results.
.bq.i.queries.createBody
¶
JSON formatted body for queries API
.bq.i.queries.createBody qry
Where qry
(string) is an SQL query to run on BigQuery, returns it as a JSO string.
.bq.i.replaceArgsInURL
¶
Replace parameter patterns wrapped in braces with values
.bq.i.replaceArgsInURL[url;args]
Where
url
is a URL string with parameters wrapped in bracesargs
is a dictionary of parameter names and their values
returns url
with the parameter names replaced by their values.
.bq.i.tabledata.createInsertAllBody
¶
JSON body for
tabledata.insertAllAPI
.bq.i.tabledata.createInsertAllBody tableData
Where tableData
is a table, returns it as a JSON string.
.bq.i.tables.createInsertBody
¶
JSON body for tables.insert API
.bq.i.tables.createInsertBody[projectId;datasetId;tableId;exampleKdbTab]
Where
projectId
(string)is the name of the projectdatasetId
(string) is the name of the dataset to createtableId
(string) is the name of tableexampleKdbTab
(table) is the table to generate the BigQuery schema from
returns the body of exampleKdbTab
as a JSON string.
.bq.projects.list
¶
Projects in BigQuery
.bq.projects.list[]
Returns a list of projects as a dictionary.
q).bq.projects.list[]
kind | "bigquery#projectList"
etag | "8OYfddFNRY0RuH5h5jwhew=="
projects | +`kind`id`numericId`projectReference`friendlyName!(,"bigquery#pro..
totalItems| 1f
.bq.query
¶
Run query and return result table
.bq.query query
Where query
(string) is a SQL query to run on BigQuery, returns the query results as a table.
Run a select-all query:
q).bq.query "select * from `cloudpak.kx_bma_bigquery_ds.nested_table`"
id first_name last_name dob addresses ..
-----------------------------------------------------------------------------..
,"2" "Jane" "Doe" 1980.10.16 +`status`address`city`state`zip`numberOf..
,"1" "John" "Doe" 1968.01.22 +`status`address`city`state`zip`numberOf..
.bq.queryRawResp
¶
Run query and return raw response
.bq.queryRawResp query
Where query
(string) is a SQL query to run on BigQuery, returns the query results as a dictionary.
Run a select-all query:
q).bq.queryRawResp "select * from `cloudpak.kx_bma_bigquery_ds.nested_table`"
kind | "bigquery#queryResponse"
schema | (,`fields)!,(`name`type`mode!("id";"STRING";"NULLABLE");..
jobReference | `projectId`jobId`location!("cloudpak";"job_xtHZM84N-KOUf..
totalRows | ,"2"
rows | +(,`f)!,(+(,`v)!,(,"2";"Jane";"Doe";"1980-10-16";+(,`v)!..
totalBytesProcessed| ,"0"
jobComplete | 1b
cacheHit | 1b
.bq.run
¶
Generic API to run any supported BigQuery method
.bq.run[func;args;body]
Where
func
(symbol) is the name of the method to run in the formatresource.method
args
is a dictionary of parameters for the requestbody
(JSON string) is the body for HTTP requests that require one
returns the request result as a dictionary.
Omit the version from the resource
For example, datasets.list
is valid, not v2.datasets.list
.
Run tables.list
:
q).bq.run[`tables.list;`projectId`datasetId!("cloudpak";"kx_bma_bigquery_ds");()]
kind | "bigquery#tableList"
etag | "XO0VOCRg48KNOqPXpRbrFw=="
tables | +`kind`id`tableReference`type`creationTime!(("bigquery#table";"bi..
totalItems| 2f
.bq.tabledata.insertAll
¶
Insert data into a BigQuery table
.bq.tabledata.insertAll[args;tabData]
Where
args
is a dictionary of arguments required for thetabledata.insertAll
methodtabData
is a table
inserts data from table
into the BigQuery table and returns the response as a dictionary.
q)qry:"select * from `cloudpak.kx_bma_bigquery_ds.simple_table`"
q).bq.query qry // No rows returned from query
q)show simple_table:([] longCol:1 2 3 4;dateCol:4#.z.d)
longCol dateCol
------------------
1 2021.01.21
2 2021.01.21
3 2021.01.21
4 2021.01.21
q)// Required arguments for the `datasets.get` method
q).bq.disc.listParametersForMethod `datasets.get
`projectId`datasetId`tableId
q).bq.tabledata.insertAll[`projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_ds";"simple_table"); simple_table]
kind| "bigquery#tableDataInsertAllResponse"
q).bq.query qry
longCol dateCol
------------------
4 2021.01.21
1 2021.01.21
2 2021.01.21
3 2021.01.21
.bq.tables.get
¶
Table metadata from BigQuery
.bq.tables.get args
Where args
is a dictionary of arguments required for the tables.get
method, returns a dictionary of table metadata.
Required arguments for the datasets.get
method:
q).bq.disc.listParametersForMethod `datasets.get
`projectId`datasetId`tableId
Metadata for nested_table
:
q).bq.tables.get `projectId`datasetId`tableId!("cloudpak"; "kx_bma_bigquery_ds"; "nested_table")
kind | "bigquery#table"
etag | "3PBOd4/GmKp8mi/GRfjCRg=="
id | "cloudpak:kx_bma_bigquery_ds.nested_table"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,(`name`type`mode!("id";"STRING";"NULLABLE");`na..
numBytes | "244"
numLongTermBytes| ,"0"
numRows | ,"2"
creationTime | "1611058431773"
lastModifiedTime| "1611066081149"
type | "TABLE"
location | "US"
.bq.tables.insert
¶
Create a BigQuery table
.bq.tables.insert[args;table]
Where
args
is a dictionary of arguments required for thetables.insert
methodtable
is a table
creates an empty table with the same schema in BigQuery, and returns the response as a dictionary.
Creates an empty table in BigQuery
This does not upload the data in the table; it only creates a table with the same schema in BigQuery.
To upload data once the schema is created use .bq.tabledata.insertAll
.
To create a schema with nested columns, specify a table with at least one row with an example of the nested structure.
Required arguments for the datasets.insert
method:
q).bq.disc.listParametersForMethod `datasets.insert
`projectId`datasetId`tableId
Insert a simple table:
q)simple_table:([] longCol:`long$(); dateCol:`date$())
q).bq.tables.insert[`projectId`datasetId`tableId!("cloudpak"; "kx_bma_bigquery_ds";"simple_table"); simple_table]
kind | "bigquery#table"
etag | "ruJu2EDyJ2AihJhTwbX3kA=="
id | "cloudpak:kx_bma_bigquery_ds.simple_table"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,+`name`type`mode!(("longCol";"dateCol");("INTEG..
numBytes | ,"0"
numLongTermBytes| ,"0"
numRows | ,"0"
creationTime | "1611229907510"
lastModifiedTime| "1611229907560"
type | "TABLE"
location | "US"
Insert a nested table:
q)hobbies:([] firstName:`John`Jane; lastName:`Doe`Doe; hobbies:(`skiing`running`cycling; `running`swimming))
q)hobbies
firstName lastName hobbies
------------------------------------------
John Doe `skiing`running`cycling
Jane Doe `running`swimming
q).bq.tables.insert[`projectId`datasetId`tableId!("cloudpak"; "kx_bma_bigquery_ds";"hobbies"); hobbies]
kind | "bigquery#table"
etag | "m0abliNQBx3fKfFk77MUKA=="
id | "cloudpak:kx_bma_bigquery_ds.hobbies"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,+`name`type`mode!(("firstName";"lastName";"hobb..
numBytes | ,"0"
numLongTermBytes| ,"0"
numRows | ,"0"
creationTime | "1611230179319"
lastModifiedTime| "1611230179364"
type | "TABLE"
location | "US"
.bq.tables.list
¶
Tables available in BigQuery
.bq.tables.list args
Where args
is a dictionary of arguments required for the tables.list
method, returns the response from BigQuery as a dictionary.
Required arguments for the datasets.list
method:
q).bq.disc.listParametersForMethod `datasets.list
`projectId`datasetId
List tables in the project cloudpak
and dataset kx_bma_bigquery_ds
:
q).bq.tables.list `projectId`datasetId!("cloudpak"; "kx_bma_bigquery_ds")
kind | "bigquery#tableList"
etag | "XO0VOCRg48KNOqPXpRbrFw=="
tables | +`kind`id`tableReference`type`creationTime!(("bigquery#table";"bi..
totalItems| 2f