BigQuery API
Google BigQuery
Google BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data.
The KX BigQuery API lets you easily interact with the REST API Google exposes for BigQuery.
The API provides:
- An easy-to-use kdb+ interface without needing to handcraft REST API calls
- Automatic conversion of BigQuery TableSchemas in kdb+ tables
- Dynamic support for all resources and methods exposed in the BigQuery API
.com_kx_bq.
run run any BigQuery method query run query, return table queryRawResp run query and return raw response projects.list list projects in BigQuery
Discovery disc.describeMethodsForResource describe methods for a resource disc.getMethodsForResource details of methods for a resource disc.getParametersForMethod details of parameters for a method disc.listParametersForMethod list required parameters for a method disc.listResources list the available resources
Datasets datasets.get get dataset details from BigQuery datasets.insert create a dataset in a BigQuery project datasets.list list datasets of a BigQuery project
Tables tables.get get table details from BigQuery tables.insert create a BigQuery table tables.list list available BigQuery tables tabledata.insertAll insert data into a BigQuery table
Helpers fieldSchemaToKdb convert BigQuery dataset to a kdb+ object genBQFieldSchema TableFieldSchema from kdb+ table cell genBQSchema TableSchema from a kdb+ table i.applyDefaultArgs append default arguments to dict i.datasets.createInsertBody JSON-formatted body for datasets.insert i.getArgsFromURL parse parameters from patterns in a URL i.parseDataFromQuery parse kdb+ data from a query i.queries.createBody JSON-formatted body for queries API i.replaceArgsInURL replace embraced parameter patterns with values i.tabledata.createInsertAllBody JSON-formatted body for tabledata.insertAll i.tables.createInsertBody JSON-formatted body for tables.insert
Configuration cfg.baseURL base URL for the API cfg.default.datasetId default datasetId to use in the simple API cfg.default.projectId default projectId to use in the simple API cfg.default.tableId default tableId to use in the simple API cfg.discoveryDocURL URL for .com_kx_bq.run discovery document cfg.kdbModeMap map kdb+ datatypes to BigQuery modes cfg.kdbTypeMap map kdb+ datatypes to BigQuery datatypes cfg.useDefaults whether requests use defaults for omitted args
.com_kx_bq.cfg.baseURL
The base URL for the API
.com_kx_bq.cfg.default.datasetId
The default datasetId
to use in the simple API
.com_kx_bq.cfg.default.projectId
The default projectId
to use in the simple API
.com_kx_bq.cfg.default.tableId
This default tableId
to use in the simple API
.com_kx_bq.cfg.discoveryDocURL
The URL for discovery document to use for .com_kx_bq.run
.com_kx_bq.cfg.kdbModeMap
Map kdb+ datatypes to BigQuery modes
FIXME
If the signum
of the kdb+ type is
- -1, it’s negative, and the BigQuery mode is
NULLABLE
by default - 1, it’s positive, and the data is an array which corresponds to the BigQuery mode
REPEATED
.com_kx_bq.cfg.kdbTypeMap
Map kdb+ datatypes to BigQuery datatypes
FIXME
.com_kx_bq.cfg.useDefaults
Boolean: whether requests will use defaults for omitted arguments.
.com_kx_bq.datasets.get
Get dataset details from BigQuery
.com_kx_bq.datasets.get args
Where args
is a dictionary of arguments for a datasets.get
method, returns a dictionary of responses from BigQuery.
List dataset details for the dataset kx_bma_bigquery_ds
in the project cloudpak
:
q).com_kx_bq.datasets.get `projectId`datasetId!("cloudpak"; "kx_bma_bigquery_ds")
kind | "bigquery#dataset"
etag | "ijYXKX/zJYLk6Anuf++t7Q=="
id | "cloudpak:kx_bma_bigquery_ds"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
datasetReference| `datasetId`projectId!("kx_bma_bigquery_ds";"cloudpak")
access | (`role`specialGroup!("WRITER";"projectWriters");`role`speci..
creationTime | "1610712593842"
lastModifiedTime| "1610712593842"
location | "US"
.com_kx_bq.datasets.insert
Create a dataset in a BigQuery project
.com_kx_bq.datasets.insert[args;datasetId]
Where
args
is a dictionary of arguments for thedatasets.insert
method (see.com_kx_bq.disc.getParameters`datasets.insert
)datasetId
(string) is the name of the new dataset
returns the response from BigQuery as a dictionary.
Create a dataset called weather
:
q).com_kx_bq.datasets.insert[(enlist `projectId)!enlist "cloudpak"; "weather"]
kind | "bigquery#dataset"
etag | "IzXO40nvJoqo2YmhMJ0BkA=="
id | "cloudpak:weather"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
datasetReference| `datasetId`projectId!("weather";"cloudpak")
access | (`role`specialGroup!("WRITER";"projectWriters");`role`speci..
creationTime | "1611230993939"
lastModifiedTime| "1611230993939"
location | "US"
.com_kx_bq.datasets.list
List datasets of a BigQuery project
.com_kx_bq.datasets.list args
Where args
is a dictionary of arguments required for the datasets.list
(see .com_kx_bq.disc.getParametersForMethod`datasets.list
) returns as a dictionary the response from BiqQuery.
List the datasets in the project cloudpak
:
q).com_kx_bq.datasets.list enlist[`projectId]!enlist "cloudpak"
kind | "bigquery#datasetList"
etag | "QBTuIlnauLLL29CnAng5dw=="
datasets| +`kind`id`datasetReference`location!(("bigquery#dataset";"bigquery#..
.com_kx_bq.disc.describeMethodsForResource
Describe the available methods for a resource
.com_kx_bq.disc.describeMethodsForResource resource
Where resource
(symbol) is the name of a resource, returns a table of methods for it.
Describe the methods for the datasets
resource:
q).com_kx_bq.disc.describeMethodsForResource`datasets
method httpMethod parameters request response description ..
-------------------------------------------------------------------------------------------------..
update PUT `projectId`datasetId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Updates info..
list GET ,`projectId :: (,`$ref)!,"DatasetList" "Lists all dat..
patch PATCH `projectId`datasetId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Updates info..
delete DELETE `projectId`datasetId :: :: "Deletes the d..
get GET `projectId`datasetId :: (,`$ref)!,"Dataset" "Returns the d..
insert POST ,`projectId (,`$ref)!,"Dataset" (,`$ref)!,"Dataset" "Creates a new..
.com_kx_bq.disc.getMethodsForResource
Details of the available methods for a resource
.com_kx_bq.disc.getMethodsForResource resource
Where resource
(symbol) is the name of a resource, returns a dictionary of methods and their arguments.
Get methods for the tables
resource:
q).com_kx_bq.disc.getMethodsForResource`tables
getIamPolicy | `id`parameters`parameterOrder`request`description`path`sc..
list | `id`response`path`parameterOrder`httpMethod`scopes`parame..
testIamPermissions| `httpMethod`scopes`parameterOrder`flatPath`description`pa..
insert | `request`httpMethod`path`id`response`description`scopes`p..
get | `parameters`path`parameterOrder`httpMethod`id`response`sc..
patch | `scopes`parameterOrder`description`httpMethod`path`respon..
setIamPolicy | `response`parameterOrder`path`id`request`parameters`httpM..
update | `scopes`id`description`request`parameters`httpMethod`resp..
delete | `id`parameterOrder`httpMethod`description`path`scopes`par..
.com_kx_bq.disc.getParametersForMethod
Details of parameters for a method
.com_kx_bq.disc.getParametersForMethod method
Where method
(symbol) is the name of a method, in the format resource.method
returns a table of details of the parameters.
q).com_kx_bq.disc.getParametersForMethod`tables.insert
parameter type location description required format
-------------------------------------------------------------------------
datasetId "string" "path" "Dataset ID of the new table" 1 ::
projectId "string" "path" "Project ID of the new table" 1 ::
.com_kx_bq.disc.listParametersForMethod
List required parameters for a method
.com_kx_bq.disc.listParametersForMethod method
Where method
(symbol) is the name of a method in the format resource.method returns a symbol vector of required parameters.
List parameters for tables.insert
q).com_kx_bq.disc.listParametersForMethod `tables.insert
`projectId`datasetId
.com_kx_bq.disc.listResources
List the available resources
.com_kx_bq.disc.listResources[]
Returns a symbol vector of available resources.
q).com_kx_bq.disc.listResources[]
`rowAccessPolicies`models`tabledata`datasets`tables`projects`routines`jobs
.com_kx_bq.fieldSchemaToKdb
Apply a BigQuery Field Schema to a dataset to convert it into a kdb+ object
.com_kx_bq.fieldSchemaToKdb[bqFieldSchema;rowObject]
Where
bqFieldSchema
(dict) is a Schema object from BigQuery in kdb+rowObject
(dict) is either a simple value mapping or a nested field mapping
returns as a dictionary a kdb+ formatted schema object.
q)fs:`name`type`mode!("dob";"DATE";"NULLABLE") / field schema
q)ro:(enlist`v)!enlist "1980-10-16" / row object
q).com_kx_bq.fieldSchemaToKdb[fs;ro]
dob| 1980.10.16
.com_kx_bq.genBQFieldSchema
TableFieldSchema from kdb+ table cell
.com_kx_bq.genBQFieldSchema fieldDict
Where fieldDict
is a table cell as a dictionary, returns a TableFieldSchema object as a dictionary.
q)tab:([] a:1 2;b:3 4)
q)tab
a b
---
1 3
2 4
q)enlist[`a]#first tab
a| 1
q).com_kx_bq.genBQFieldSchema enlist[`a]#first tab
name| ,"a"
type| "INT64"
mode| "NULLABLE"
.com_kx_bq.genBQSchema
TableSchema from a kdb+ table
.com_kx_bq.genBQSchema table
Where table
is a kdb+ table, returns a TableSchema object as a dictionary.
This is done by inspecting the first row of the table.
q)tab:([] a:1 2;b:3 4)
q)tab
a b
---
1 3
2 4
q).com_kx_bq.genBQSchema tab
fields| +`name`type`mode!((,"a";,"b");("INT64";"INT64");("NULLABLE";"NULLABLE..
.com_kx_bq.i.applyDefaultArgs
Append default arguments to an argument dictionary if enabled
.com_kx_bq.i.applyDefaultArgs args
Where args
is a dictionary of non-default parameters for a HTTP request returns it with default values inserted.
.com_kx_bq.i.datasets.createInsertBody
JSON-formatted body for datasets.insert
API
.com_kx_bq.i.datasets.createInsertBody[projectId;datasetId]
Where
projectId
(string) is the name of a projectdatasetId
(string)is the name of a dataset to create
returns the JSON-formatted body as a string.
.com_kx_bq.i.getArgsFromURL
Parse parameters from patterns in a URL
.com_kx_bq.i.getArgsFromURL url
Where url
(string) is a URL with parameters wrapped in braces returns a list of parameters as a symbol vector.
.com_kx_bq.i.parseDataFromQuery
Parse kdb data from a query
.com_kx_bq.i.parseDataFromQuery respObj
Where respObj
(dictionary) is a response from BigQuery, returns a table of results.
.com_kx_bq.i.queries.createBody
JSON-formatted body for queries API
.com_kx_bq.i.queries.createBody query
Where query
(string) is a SQL query to run on BigQuery, returns it as a JSON string.
.com_kx_bq.i.replaceArgsInURL
Replace embraced parameter patterns with values
.com_kx_bq.i.replaceArgsInURL[url;args]
Where
url
(string) is a URL with parameters wrapped in bracesargs
(dict) is a key-value mapping of arg name to value
returns url
with the parameters replaced.
.com_kx_bq.i.tabledata.createInsertAllBody
JSON-formatted body for tabledata.insertAll
API
.com_kx_bq.i.tabledata.createInsertAllBody tableData
Where tableData
is a table of data to insert, returns it as a JSON string.
.com_kx_bq.i.tables.createInsertBody
JSON-formatted body for tables.insert
API
.com_kx_bq.i.tables.createInsertBody[projectId;datasetId;tableId;exampleKdbTab]
Where
projectId
(string)is the name of the projectdatasetId
(string)is the name of the dataset to createtableId
(string)is the name of tableexampleKdbTab
(table)is a table to generate BQ schema from
returns a JSON string for the tables.insert
API.
.com_kx_bq.projects.list
List projects in BigQuery
.com_kx_bq.projects.list[]
Returns the response from BigQuery as a dictionary.
q).com_kx_bq.projects.list[]
kind | "bigquery#projectList"
etag | "8OYfddFNRY0RuH5h5jwhew=="
projects | +`kind`id`numericId`projectReference`friendlyName!(,"bigquery#pro..
totalItems| 1f
.com_kx_bq.query
Run query
.com_kx_bq.query query
Where query
(string) is a SQL query to run on BigQuery, returns a table of results.
q).com_kx_bq.query "select * from `cloudpak.kx_bma_bigquery_ds.nested_table`"
id first_name last_name dob addresses ..
-----------------------------------------------------------------------------..
,"2" "Jane" "Doe" 1980.10.16 +`status`address`city`state`zip`numberOf..
,"1" "John" "Doe" 1968.01.22 +`status`address`city`state`zip`numberOf..
.com_kx_bq.queryRawResp
Run query and return raw response
.com_kx_bq.queryRawResp query
Where query
(string) is a SQL query to run on BigQuery, returns response as a string.
q).com_kx_bq.queryRawResp "select * from `cloudpak.kx_bma_bigquery_ds.nested_table`"
kind | "bigquery#queryResponse"
schema | (,`fields)!,(`name`type`mode!("id";"STRING";"NULLABLE");..
jobReference | `projectId`jobId`location!("cloudpak";"job_xtHZM84N-KOUf..
totalRows | ,"2"
rows | +(,`f)!,(+(,`v)!,(,"2";"Jane";"Doe";"1980-10-16";+(,`v)!..
totalBytesProcessed| ,"0"
jobComplete | 1b
cacheHit | 1b
.com_kx_bq.run
Generic API to run any supported BigQuery method
.com_kx_bq.run[func;args;body]
Where
func
(symbol) is name of the method to run in the formatresource.method
. List of resources and their methods. The version is not required in the resource, e.gdatasets.list
is valid, notv2.datasets.list
.args
(dict) is dictionary of parameters for the requestbody
(string) is JSON body for HTTP requests that require a body
returns a dictionary of results from the request.
q).com_kx_bq.run[`tables.list;`projectId`datasetId!("cloudpak";"kx_bma_bigquery_ds");()]
kind | "bigquery#tableList"
etag | "XO0VOCRg48KNOqPXpRbrFw=="
tables | +`kind`id`tableReference`type`creationTime!(("bigquery#table";"bi..
totalItems| 2f
.com_kx_bq.tabledata.insertAll
Insert data into a BigQuery table
.com_kx_bq.tabledata.insertAll[args;tabData]
Where
args
(dict) is arguments required for thetabledata.insertAll
method (see.com_kx_bq.disc.getParameters
tabledata.insertAll`)tabData
(table) is data to insert
returns the response from BigQuery as a dictionary.
q).com_kx_bq.query "select * from `cloudpak.kx_bma_bigquery_ds.simple_table`"
No rows returned from query
q)simple_table:([] longCol:1 2 3 4;dateCol:4#.z.d)
q)simple_table
longCol dateCol
------------------
1 2021.01.21
2 2021.01.21
3 2021.01.21
4 2021.01.21
q)/ argument dictionary
q)ad:`projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_ds";"simple_table")
q).com_kx_bq.tabledata.insertAll[ad; simple_table]
kind| "bigquery#tableDataInsertAllResponse"
q).com_kx_bq.query "select * from `cloudpak.kx_bma_bigquery_ds.simple_table`"
longCol dateCol
------------------
4 2021.01.21
1 2021.01.21
2 2021.01.21
3 2021.01.21
.com_kx_bq.tables.get
Get table details from BigQuery
.com_kx_bq.tables.get args
Where args
is a dictionary of parameters for the tables.get
method (see .com_kx_bq.disc.getParameters
tables.get`) returns the response from BigQuery as a dictionary.
The response doesn’t include data from the table.
Request for details of nested_table
:
q)/ parameters for nested_table
q)prms:`projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_ds";"nested_table")
q).com_kx_bq.tables.get prms
kind | "bigquery#table"
etag | "3PBOd4/GmKp8mi/GRfjCRg=="
id | "cloudpak:kx_bma_bigquery_ds.nested_table"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,(`name`type`mode!("id";"STRING";"NULLABLE");`na..
numBytes | "244"
numLongTermBytes| ,"0"
numRows | ,"2"
creationTime | "1611058431773"
lastModifiedTime| "1611066081149"
type | "TABLE"
location | "US"
.com_kx_bq.tables.insert
Create a BigQuery table
.com_kx_bq.tables.insert[args;table]
Where
args
(dict) is arguments required for thetables.insert
method (see.com_kx_bq.disc.getParameters
tables.insert`)table
(table) is a kdb+ table
returns the BigQuery response as a dictionary.
This does not upload the data in table
; it creates a table in BigQuery with the same schema.
Use .com_kx_bq.tabledata.insertAll
to upload data once the schema is created.
q)simple_table:([] longCol:`long$(); dateCol:`date$())
q)args:`projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_ds";"simple_table")
q).com_kx_bq.tables.insert[args; simple_table]
kind | "bigquery#table"
etag | "ruJu2EDyJ2AihJhTwbX3kA=="
id | "cloudpak:kx_bma_bigquery_ds.simple_table"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,+`name`type`mode!(("longCol";"dateCol");("INTEG..
numBytes | ,"0"
numLongTermBytes| ,"0"
numRows | ,"0"
creationTime | "1611229907510"
lastModifiedTime| "1611229907560"
type | "TABLE"
location | "US"
To create a schema with nested columns, provide a table with at least one row that contains an example of the nested structure.
q)firstName:`John`Jane
q)lastName:`Doe`Doe
q)hobbies:(`skiing`running`cycling;`running`swimming)
q)show hobbies:([] firstName;lastName;hobbies)
firstName lastName hobbies
------------------------------------------
John Doe `skiing`running`cycling
Jane Doe `running`swimming
q)args:`projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_ds";"hobbies")
q).com_kx_bq.tables.insert[args; hobbies]
kind | "bigquery#table"
etag | "m0abliNQBx3fKfFk77MUKA=="
id | "cloudpak:kx_bma_bigquery_ds.hobbies"
selfLink | "https://bigquery.googleapis.com/bigquery/v2/projects/cloud..
tableReference | `projectId`datasetId`tableId!("cloudpak";"kx_bma_bigquery_d..
schema | (,`fields)!,+`name`type`mode!(("firstName";"lastName";"hobb..
numBytes | ,"0"
numLongTermBytes| ,"0"
numRows | ,"0"
creationTime | "1611230179319"
lastModifiedTime| "1611230179364"
type | "TABLE"
location | "US"
.com_kx_bq.tables.list
List available BigQuery tables
.com_kx_bq.tables.list args
Where args
is a dictionary of arguments required for the tables.list
method (see .com_kx_bq.disc.getParameters
tables.list`) returns the response from BigQuery as a dictionary.
List tables in the project cloudpak
and dataset zkx_bma_bigquery_dsz
:
q).com_kx_bq.tables.list `projectId`datasetId!("cloudpak"; "kx_bma_bigquery_ds")
kind | "bigquery#tableList"
etag | "XO0VOCRg48KNOqPXpRbrFw=="
tables | +`kind`id`tableReference`type`creationTime!(("bigquery#table";"bi..
totalItems| 2f