Storage tiers
A storage tier definition describes the tier's locality, segmentation format, and rollover configuration.
A storage tier has the following structure:
| key | required | purpose | value & default |
|---|---|---|---|
| name | yes | | |
| mount | yes | corresponding `mounts` entry, which determines the locality and segmentation format, and also the location at which data in the tier may be accessed | |
| store | no | where the tier will physically store data | see below |
| inventory | no | object storage inventory file location | see below |
| schedule | no | policy for when rollovers should be considered | see below |
| retain | no | policy for how much data should be stored in this tier before it is rolled over into the next tier | see below |
| compression | no | policy for compression of data | see below |
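
As a minimal sketch of this structure, a tier entry that sets only the two required keys plus an explicit `store` might look as follows. The tier name `archive`, the mount name `hdb`, and the `file://` URI are illustrative assumptions rather than values from a real deployment, and the mount's `baseURI` is assumed to lie elsewhere.

```yaml
# Hypothetical tier entry; names and paths are placeholders.
name: archive                   # required: tier name
mount: hdb                      # required: must match an entry in mounts
store: file:///mnt/archive/db   # optional: assumed to lie outside the mount's baseURI
```
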
`store`
URI describing where this tier will physically store data. If not specified, it becomes `<baseURI>/data` of the corresponding `mount` (this value is enforced, even if `store` is specified, for mounts of type `local` with `partition:ordinal`). Among multiple tiers within the same mount, only one tier may be left without an explicitly specified `store`. If specified explicitly, `store` must be outside the mount's `baseURI`.

`schedule`
If present, this dictionary contains the following keys.
- `freq`: HH:MM:SS. Used by the ordinal partition mount (IDB) to specify the length of the interval in each ordinal partition. Default 00:10:00.
- `snap`: HH:MM:SS. Used by the date partition mount (HDB) to specify when to move data from the ordinal to the date partition mount. Default 00:00:00.
snap
A snap value of 00:01:00 would allow any late data that arrives in the one minute from 00:00 -> 00:01 belonging to the previous date partition to be saved to that location. Any late data that arrives after 00:01:00 belonging to the previous date partition will be written at the next snap. The data received from 00:00 -> 00:01 belonging to the current date partition will also be saved at this time.
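
The two schedule keys can be sketched as follows. This is an illustrative fragment, not a recommended setting: the tier names `interval` and `recent` and the mount names `idb` and `hdb` are assumptions, and each key is shown on the kind of tier that its description associates it with.

```yaml
# Hypothetical tiers; names are placeholders.
- name: interval
  mount: idb            # assumed mount with partition: ordinal
  schedule:
    freq: 00:05:00      # each ordinal partition covers a five-minute interval
- name: recent
  mount: hdb            # assumed mount with partition: date
  schedule:
    snap: 00:01:00      # move data to the date partition at one minute past midnight
```
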
`retain`
This dictionary may have one or more of the following keys.
- `time`: A timespan consisting of a number followed by a unit: {Years, Months, Weeks, Days, Hours, Minutes}, e.g. `2 Years`. Data which has been stored for this length of time is rolled over.
- `sizePct`: A size as a percentage of the total storage of the corresponding mount, specified as a number from 1 to 100.
If multiple keys are set, they are interpreted in an inclusive-OR fashion: data is rolled over as soon as any one of the configured limits is reached.
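
A `retain` policy that sets both keys could be sketched like this. The tier and mount names are assumptions chosen for illustration. In this hypothetical tier, rollover would be triggered by whichever of the two limits is reached first.

```yaml
# Hypothetical tier on a date-partitioned local mount.
name: recent
mount: hdb
retain:
  time: 3 Months   # roll over data that has been stored for this long
  sizePct: 80      # size limit as a percentage of the mount's total storage
```
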
A `mount` partitioned as `ordinal`, or of type `stream`, cannot be used with a storage tier that has a `retain` policy.

`compression`
If present, this dictionary contains the following keys.
- `algorithm`: Compression algorithm: {none, qipc, gzip, snappy, lz4hc}
- `block`: Block size
- `level`: Compression level
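
A compression policy might be written as in the sketch below. The `block` and `level` values are placeholders: the accepted ranges are not stated here, so check them against the requirements of the chosen algorithm before use. The tier and mount names are likewise assumptions.

```yaml
# Hypothetical tier; block and level are placeholder values.
name: recent
mount: hdb          # assumed mount of type:local and partition:date
compression:
  algorithm: gzip   # one of none, qipc, gzip, snappy, lz4hc
  block: 17         # block size (placeholder value)
  level: 6          # compression level (placeholder value)
```
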
The `compression` policy currently applies only to tiers associated with a `mount` of `type:local` and `partition:date`.

`inventory`
If present, this dictionary contains the following keys.
- `enabled`: true or false, to enable or disable inventory files. If true, `location` must be provided. Default false.
- `location`: Location, relative to the root of the bucket/storage, that the inventory will be written to.
Inventory only applies when using a store that is an object storage URI.
An example configuration, which will produce `s3://kxi-example-data/inventory/test-db-inventory.tgz`, is:

    name: hdb-s3
    mount: hdb
    store: s3://kxi-example-data/db
    inventory:
      enabled: true
      location: inventory/test-db-inventory.tgz

Tiers can be categorized according to their locality and segmentation format, which imply the characteristics and governing rules:
Stream based tier
The stream based tier represents the in-memory data that is received between write-down events. It is implicit and need not be specified.
Local-ordinal based tier
There must always be one tier that corresponds to a mount of type `local` with partition `ordinal`. However, its configuration can be omitted, in which case the frequency defaults to 10 minutes.
Local-date based tier
There can be one or more tiers that correspond to a mount of type `local` with partition `date`. However, when only one such tier is used, its configuration can be omitted, in which case the snap time defaults to midnight, the frequency to 1 day, and retain to infinite.
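
Putting the categories together, a tier list with one ordinal tier and two date tiers could be sketched as below. All names, mount identifiers, and the `file://` URI are hypothetical, and where this list sits within the wider configuration file is not shown. The stream based tier is left out because it is implicit. Only one of the two tiers on the `hdb` mount omits `store`, and the ordinal tier carries no `retain` policy.

```yaml
# Hypothetical tier list; all names and paths are placeholders.
- name: interval
  mount: idb                      # assumed local mount, partition: ordinal
  schedule:
    freq: 00:10:00                # same as the default interval length
- name: recent
  mount: hdb                      # assumed local mount, partition: date
  schedule:
    snap: 00:00:00                # same as the default snap time
  retain:
    time: 3 Months                # after this, data rolls over into the next tier
- name: longterm
  mount: hdb
  store: file:///mnt/longterm/db  # explicit store, assumed outside the mount's baseURI
  retain:
    time: 2 Years
```
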