Storage tiers
Each storage tier entry describes the locality, segmentation format, and rollover configuration of that tier.
A storage tier has the following structure:
key | required | purpose | value & default |
---|---|---|---|
name | yes | | |
mount | yes | corresponding `mounts` entry, which determines locality and segmentation format, and also the location at which data in the tier may be accessed | |
store | | where the tier will physically store data | see below |
inventory | | object storage inventory file location | see below |
schedule | | policy for when rollovers should be considered | see below |
retain | | policy for how much data should be stored in this tier before it is rolled over into the next tier | see below |
compression | | policy for compression of data | see below |
store
-
URI describing where this tier will physically store data. If not specified, it defaults to `<baseURI>/data` of the corresponding `mount` (this default is enforced, even if `store` is specified, for mounts of type `local` with `partition:ordinal`). For multiple tiers within the same mount, only one tier may omit an explicitly specified `store`. If specified explicitly, `store` must be outside the mount's `baseURI`.
schedule
-
If present, this dictionary contains the following keys.
- `freq`: HH:MM:SS. Used by the ordinal partition mount (IDB) to specify the length of the interval in each ordinal partition. Default 00:10:00.
- `snap`: HH:MM:SS. Used by the date partition mount (HDB) to specify when to move data from the ordinal to the date partition mount. Default 00:00:00.

A `snap` value of 00:01:00 allows late data that belongs to the previous date partition and arrives in the one minute from 00:00 to 00:01 to be saved to that partition. Late data for the previous date partition that arrives after 00:01:00 is written at the next snap. Data received from 00:00 to 00:01 that belongs to the current date partition is also saved at this time.
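
The two schedule keys can be illustrated with a short sketch. The tier names are hypothetical, and writing the tiers as list items is an assumption. The mount names `idb` and `hdb` and the key names come from the descriptions above.

```yaml
# Illustrative sketch only: tier names are hypothetical and the
# list layout is an assumption.
- name: interval        # hypothetical tier name
  mount: idb            # ordinal partition mount
  schedule:
    freq: 00:10:00      # each ordinal partition spans 10 minutes (the default)
- name: daily           # hypothetical tier name
  mount: hdb            # date partition mount
  schedule:
    snap: 00:01:00      # move data to the date partition at 00:01, so late
                        # data for the previous date arriving before then
                        # is saved with that date
```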
retain
-
This dictionary may have one or more of the following keys.
- `time`: A timespan consisting of a number followed by a unit: {`Years`, `Months`, `Weeks`, `Days`, `Hours`, `Minutes`}, e.g. `2 Years`. Data which has been stored for this length of time is rolled over.
- `sizePct`: A size as a percentage of the total storage of the corresponding mount, specified as a number from 1 to 100.

If multiple keys are set, they are interpreted in an inclusive-OR fashion: data is rolled over as soon as any one of the conditions is met.
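
As a sketch of how the two keys combine, the fragment below sets both. The values are illustrative (the `2 Years` timespan reuses the example above, and `90` is an arbitrary percentage).

```yaml
# Illustrative retain policy for a tier on a date-partitioned mount.
retain:
  time: 2 Years   # roll over data that has been stored for two years ...
  sizePct: 90     # ... or once the tier reaches 90% of the mount's total
                  # storage, whichever condition is met first (inclusive OR)
```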
A `mount` partitioned as `ordinal`, or of type `stream`, cannot be used with a storage tier that has a `retain` policy.
compression
-
If present, this dictionary contains the following keys.
- `algorithm`: Compression algorithm: {`none`, `qipc`, `gzip`, `snappy`, `lz4hc`}
- `block`: Block size
- `level`: Compression level
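
A compression policy might look like the sketch below. This section does not state the valid ranges or units for `block` and `level`, so the numbers shown are placeholders, not recommendations. Check them against the reference for the chosen algorithm.

```yaml
# Illustrative compression policy; block and level are placeholder values.
compression:
  algorithm: gzip   # one of: none, qipc, gzip, snappy, lz4hc
  block: 17         # placeholder block size (assumption)
  level: 6          # placeholder compression level (assumption)
```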
The `compression` policy currently applies only to tiers associated with a `mount` of `type:local` and `partition:date`.
inventory
-
If present, this dictionary contains the following keys.
- `enabled`: true or false, to enable inventory files. If true, `location` must be provided. Default false.
- `location`: Location, relative to the root of the bucket/storage, that the inventory will be written to.

Inventory only applies when using a store that is an object storage URI.
An example configuration, which will produce `s3://kxi-example-data/inventory/test-db-inventory.tgz`, is:

```yaml
name: hdb-s3
mount: hdb
store: s3://kxi-example-data/db
inventory:
  enabled: true
  location: inventory/test-db-inventory.tgz
```
Tiers can be categorized according to their locality and segmentation format, which imply the characteristics and governing rules:
Stream based tier
The stream based tier represents the in-memory data that is received between write-down events. It is implicit and need not be specified.
Local-ordinal based tier
There must always be one tier that corresponds to a mount of type `local` with partition `ordinal`. However, its configuration can be omitted, in which case the frequency defaults to 10 minutes.
Local-date based tier
There can be one or more tiers that correspond to a mount of type `local` with partition `date`. However, when only one tier is used, its configuration can be omitted, in which case snap-time defaults to midnight, frequency to 1 day, and retain to infinite.
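
Putting the categories together, a layout with one local-ordinal tier and two local-date tiers could be sketched as follows. The tier names, the `store` path, and the list layout are hypothetical. The stream based tier is implicit and so is not listed.

```yaml
# Hypothetical tier layout; names, paths, and values are illustrative.
- name: interval                   # local-ordinal tier (no retain allowed)
  mount: idb
  schedule:
    freq: 00:10:00
- name: recent                     # first local-date tier; store is omitted,
  mount: hdb                       # so it defaults to <baseURI>/data
  schedule:
    snap: 00:00:00
  retain:
    time: 3 Months                 # then roll over into the next tier
- name: archive                    # second tier on the same mount, so it
  mount: hdb                       # must specify store explicitly
  store: file:///mnt/archive/db    # hypothetical URI outside the mount's baseURI
  retain:
    time: 2 Years
```

Only one of the two `hdb` tiers omits `store`, in line with the rule that a mount can have at most one tier without an explicitly specified `store`.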