Pivoting data with the Refinery¶
Refinery allows datasets to be pivoted on query. For this functionality to be used, the source data must be in a particular format and system configured accordingly. Pivoting allows datasets that are ingested in time | id | value format to be aggregated and filtered in the same way a wider format would.
This functionality transforms on query from the source format
| Time | ID | Value |
|---|---|---|
| 12:01:00 | a | 245 |
| 12:01:01 | b | 3.8 |
| 12:01:03 | b | 3.9 |
| 12:02:00 | a | 257 |
| 12:02:01 | b | 4.3 |
| 12:03:00 | a | 187 |
| 12:03:30 | c | 0.2 |
To the format
| Time | a | b | c |
|---|---|---|---|
| 12:01:00 | 245 | ||
| 12:01:01 | 3.8 | ||
| 12:01:03 | 3.9 | ||
| 12:02:00 | 257 | ||
| 12:02:01 | 4.3 | ||
| 12:03:00 | 187 | ||
| 12:03:30 | 0.2 |
When used with getTicks, it will return the data in the pivoted format. If used with getStats (via configuration), the underlying ticks will be pivoted before aggregation.
Pivot settings¶
The configuration parameter for pivotable datasets is .daas.cfg.pivotSettings. You must set this configuration before a dataset can use the `pivot argument in the API.
| Parameter | Type | Description | Example |
|---|---|---|---|
| tableName | symbol | Name of the underlying table (dataType) |
sensorReading |
| symCol* | symbol | The name of identifier column | sym |
| timeCol* | symbol | The name of the time column | time |
| pivotCol | symbol | The name of the value column | val |
| pivotForAggregations | boolean | Apply pivoting by default before getStats aggregations are applied | 1b |
| fillForAggregations | symbol | Filling to apply by default before getStats aggregations are applied | linear |
| pivotForGetTicks | boolean | Return getTicks queries in the pivoted format by default | 1b |
| fillForGetTicks | symbol | Filling to apply by default when returning getTicks | linear |
Note
You can leave the fill parameters blank to not apply any filtering.
Note
- symCol and timeCol must be sym and time respectively. This configuration parameter can be managed via the Refinery command line interface (CLI)
Data format¶
The source data must be stored in the three-column format of time, sym and value.
| time | sym | val |
|---|---|---|
| 12:01:00 | a | 245 |
| 12:01:01 | b | 3.8 |
| 12:01:03 | b | 3.9 |
| 12:02:00 | a | 257 |
| 12:02:01 | b | 4.3 |
| 12:03:00 | a | 187 |
| 12:03:30 | c | 0.2 |
The val header in this example can be anything, but must be configured accordingly in the settings for pivoting to be enabled.
How to set the config¶
Go to the default folder in the state-config directory and add the table config to the pivotSettings.csv file. You can package this inside your environment package alongside other deployment config.
| tableName | symCol | timeCol | pivotCol | pivotForAggregations | fillForAggregations | pivotForGetTicks | fillForGetTicks |
|---|---|---|---|---|---|---|---|
| sensorReading | sym | time | val | 1b | linear | 1b | linear |