Choosing the Right File System for kdb+: A Case Study with KX Nano¶
The performance of a kdb+ system is critically dependent on the throughput and latency of its underlying storage. In a Linux environment, the file system is the foundational layer that enables data management on a given storage partition.
This paper presents a comparative performance analysis of various file systems using the KX Nano benchmarking utility. The evaluation was conducted across two distinct test environments, each with a different operating system and storage with different bandwidth (6500 vs. 14000 MB/s) and random-read IOPS (700K vs. 2500K).
File systems tested:
- ext4 (rev 1)
- XFS (V5)
- Btrfs (v6.6.3, compression off)
- F2FS (v1.16.0, compression off)
- ZFS (v2.2.2, compression off)
Summary¶
No single file system demonstrated superior performance across all tested metrics; the optimal choice depends on the primary workload characteristics and on the specific operations you need to accelerate. Furthermore, the host operating system (e.g., Red Hat Enterprise Linux vs. Ubuntu) constrains the set of available and supported file systems.
Our key recommendations are as follows:
- For write-intensive workloads where data ingestion rate is the primary driver, XFS is the recommended file system.
    - XFS consistently demonstrated the highest write throughput, particularly under concurrent write scenarios. For instance, a kdb+ `set` operation on a large float vector (31 million elements) executed 5.6x faster on XFS than on ext4 and nearly 70x faster than on ZFS.
    - This superior write performance translates to significant speedups in other I/O-heavy operations. Parallel disk sorting was 3.1x faster, and applying the `p#` (parted) attribute was 6.9x faster on XFS compared to ext4. Consequently, workloads like end-of-day (EOD) data processing will achieve the best performance with XFS.
- For read-intensive workloads where query latency is paramount, the choice is more nuanced:
    - On Red Hat Enterprise Linux 9, ext4 holds a slight advantage for queries dominated by sequential reads. For random reads, its performance was comparable to XFS.
    - On Ubuntu, F2FS demonstrated a performance advantage in random read operations. However, this advantage shifted decisively to XFS when the data was already resident in the operating system's page cache.
kdb+ also supports storage tiering. For tiered data architectures (e.g., hot, mid, and cold tiers), a hybrid approach is advisable (see the sketch after this list):
- Hot tier: Data is frequently queried and often resides in the page cache. For this tier, a read-optimized file system like ext4 or XFS is effective.
- Mid tier: Data is queried less often, meaning reads are more likely to come directly from storage. In this scenario, F2FS's stronger random read performance from storage provides some advantage.
- Cold tier: Data is typically compressed and stored on high-latency, cost-effective media like HDDs or object storage. While kdb+ has built-in compression support, file systems like Btrfs, F2FS, and ZFS also offer this feature. The performance implications of file-system-level compression warrant a separate, dedicated study.
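A tiered layout of this kind is typically implemented as a kdb+ segmented database: a `par.txt` file in the HDB root lists one directory per tier, and each directory can sit on a different file system. The sketch below is illustrative only; the mount points and table name are hypothetical and not part of this study.

```q
/ par.txt in the HDB root, one segment directory per line, e.g.:
/   /hot/hdb     (XFS or ext4, recent partitions, often in page cache)
/   /mid/hdb     (F2FS, older partitions read mostly from storage)
/   /cold/hdb    (compressed data on cheaper media)
\l /hdbroot                              / load the segmented database
.Q.par[`:/hdbroot;2025.09.01;`trade]     / resolve which segment holds a partition
```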
Disclaimer: These guidelines are specific to the tested hardware and workloads. We strongly encourage readers to perform their own benchmarks that reflect their specific application profiles. To facilitate this, the benchmarking suite used in this study is included with the KX Nano codebase, available on GitHub.
Details¶
All benchmarks were executed in September 2025 using kdb+ 4.1 (2025.04.28) and KX Nano 6.4.5. Each kdb+ process was configured to use 8 worker threads (`-s 8`).
We used the default vector lengths of KX Nano, which are:

- tiny: 2047
- small: 63k
- medium: 127k
- large: 31m
- huge: 1000m
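As a rough illustration of what a single "write large" measurement looks like, the q sketch below times a `set` of a 31-million-element float vector. This is not the KX Nano harness itself; the destination path is a hypothetical per-file-system mount point.

```q
/ Minimal sketch, not the KX Nano harness: time one "write float large"
/ style operation on a hypothetical mount point
system"s"                          / confirm worker threads (started with q -s 8)
v:31000000?1f                      / "large" float vector, roughly 248 MB
dst:`:/mnt/xfs/bench/floatcol
/ elapsed milliseconds for the write
\t dst set v
```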
Test 1: Red Hat Enterprise Linux 9 with Intel NVMe SSD (PCIe 4.0)¶
This first test configuration utilized an Intel NVMe SSD on a server running Red Hat Enterprise Linux (RHEL) 9.3. In line with RHEL 9's officially supported file systems, the comparison was limited to ext4 and XFS.
Test Setup¶
Component | Specification |
---|---|
Storage | * Type: 3.84 TB Intel SSD D7-P5510 * Interface: PCIe 4.0 x4, NVMe * Sequential R/W: 6500 MB/s / 3400 MB/s * Random Read: 700K IOPS (4K) * Latency: Random Read: 82 µs (4K), Sequential Read / Write: 10 µs / 13 µs (4K) |
CPU | Intel(R) Xeon(R) 6747P (2 sockets, 48 cores per socket, 2 threads per core) |
Memory | 502GiB, DDR5 @ 6400 MT/s |
OS | RHEL 9.3 (kernel 5.14.0-362.8.1.el9_3.x86_64) |
The values presented in the result tables represent throughput in MB/s, where higher figures indicate better performance. The "Ratio" column quantifies the performance of XFS relative to ext4 (e.g., a value of 2 indicates XFS was twice as fast).
Write¶
We split the write results into two tables. The first table contains the "high-impact" tests and should be given more weight. These tests correspond to EOD (write, sort, applying the attribute) and EOI (append) work, which is often the bottleneck of ingestion.
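For reference, the q sketch below shows the kind of EOD steps these high-impact tests model (splayed write, on-disk sort, applying the parted attribute). The paths and table are illustrative and do not reproduce the KX Nano implementation.

```q
/ Illustrative EOD flow for one partition; paths and table are hypothetical
n:1000000
t:([] sym:n?`3; time:asc n?0t; price:n?100f; size:n?1000)
dir:`:/mnt/xfs/hdb/2025.09.01/trade/
dir set .Q.en[`:/mnt/xfs/hdb] t     / write the splayed table (EOD write)
`sym xasc dir                       / sort the on-disk table by sym (disk sort)
@[dir;`sym;`p#]                     / apply the parted attribute (add attribute)
```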
Single kdb+ process:¶
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read mem write disk | add attribute | 259 | 231 | 1.12 |
read write disk | disk sort | 105 | 97 | 1.09 |
write disk | open append mid float, sync once | 1038 | 870 | 1.19 |
open append mid sym, sync once | 932 | 841 | 1.11 | |
write float large | 2170 | 1304 | 1.66 | |
write int huge | 3338 | 2157 | 1.55 | |
write int medium | 3070 | 1999 | 1.54 | |
write int small | 910 | 1119 | 0.81 | |
write int tiny | 100 | 50 | 2.01 | |
write sym large | 1480 | 1289 | 1.15 | |
GEOMETRIC MEAN | 776 | 605 | 1.28 | |
MAX RATIO | 3338 | 2157 | 2.01 |
Observation: XFS is almost always faster than ext4. In the high-impact tests, the advantage is about 28% on average (geometric mean), with a maximum difference of 101%.
The performance of the less critical write operations is below. The Linux `sync` command synchronizes cached data to permanent storage; this data includes modified superblocks, modified inodes, delayed reads and writes, and more. EOD and EOI solutions often use `sync` operations to improve resiliency by ensuring data is persisted to storage rather than held temporarily in caches. The `sync` operation typically reports much higher throughput than the corresponding `set` call because Linux has already been writing data back in the background (compare the speed of `write float large` and `sync float large`). The throughput figures for `sync` operations are therefore not always meaningful, because `sync` does not necessarily need to flush the entire vector.
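A minimal q sketch of this pattern (write, then force dirty pages to storage) is shown below; the path is a placeholder and KX Nano's exact mechanism may differ.

```q
/ Persist a vector, then flush cached data to storage, as EOD/EOI jobs
/ often do for resiliency (path is hypothetical)
`:/mnt/xfs/bench/col set 31000000?1f
system"sync"                       / run the OS sync command from q
```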
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
write disk | append small, sync once | 753 | 484 | 1.55 |
append tiny, sync once | 549 | 368 | 1.49 | |
open append small, sync once | 937 | 812 | 1.15 | |
open append tiny, sync once | 200 | 210 | 0.96 | |
open replace int tiny | 261 | 263 | 0.99 | |
open replace random float large | 16 | 15 | 1.05 | |
open replace random int huge | 5 | 4 | 1.16 | |
open replace random int medium | 561 | 550 | 1.02 | |
open replace random int small | 784 | 809 | 0.97 | |
open replace sorted int huge | 5 | 5 | 1.06 | |
sync column after parted attribute | 183027 | 30812020 | 0.01 | |
sync float large | 159533 | 124762 | 1.28 | |
sync float large after replace | 158292 | 153759 | 1.03 | |
sync int huge | 82528 | 82383 | 1.00 | |
sync int huge after replace | 1148164 | 1076351 | 1.07 | |
sync int huge after sorted replace | 1151184 | 958083 | 1.20 | |
sync int medium | 44866 | 39724 | 1.13 | |
sync int small | 6890 | 6655 | 1.04 | |
sync int tiny | 232 | 221 | 1.05 | |
sync sym large | 232276 | 182916 | 1.27 | |
sync table after sort | 61306010 | 56924120 | 1.08 | |
GEOMETRIC MEAN | 5325 | 6116 | 0.87 | |
MAX RATIO | 61306010 | 56924120 | 1.55 |
64 kdb+ processes:¶
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read mem write disk | add attribute | 12858 | 1876 | 6.86 |
read write disk | disk sort | 2847 | 903 | 3.15 |
write disk | open append mid float, sync once | 1347 | 1368 | 0.98 |
open append mid sym, sync once | 2300 | 2118 | 1.09 | |
write float large | 62892 | 11133 | 5.65 | |
write int huge | 2455 | 2488 | 0.99 | |
write int medium | 47404 | 5879 | 8.06 | |
write int small | 28002 | 5433 | 5.15 | |
write int tiny | 2637 | 2934 | 0.90 | |
write sym large | 60629 | 17170 | 3.53 | |
GEOMETRIC MEAN | 9057 | 3420 | 2.65 | |
MAX RATIO | 62892 | 17170 | 8.06 |
Observation: The results show that XFS consistently and significantly outperformed ext4 in write-intensive operations.
In critical ingestion and EOD tasks, write throughput on XFS was on average 2.6 times higher.
This advantage peaked in specific operations, such as applying the `p#` attribute and persisting a medium-length integer vector, where XFS was a remarkable 7x and 8x faster than ext4, respectively.
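As an illustration of the concurrent write pattern behind these numbers, each of the 64 benchmark processes writes its own files in parallel. The hypothetical sketch below uses the OS process id to keep destinations distinct; it is not the KX Nano code.

```q
/ One of N independent q processes writing its own large vector; .z.i (the
/ OS process id) keeps destination files distinct (path is hypothetical)
dst:`$":/mnt/xfs/bench/proc",string .z.i
dst set 31000000?1f
```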
The performance of the less critical write operations is below.
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
write disk | append small, sync once | 1726 | 1686 | 1.02 |
append tiny, sync once | 2294 | 2120 | 1.08 | |
open append small, sync once | 1391 | 1399 | 0.99 | |
open append tiny, sync once | 2385 | 1463 | 1.63 | |
open replace int tiny | 12298 | 13634 | 0.90 | |
open replace random float large | 232 | 220 | 1.06 | |
open replace random int huge | 114 | 103 | 1.11 | |
open replace random int medium | 18188 | 18922 | 0.96 | |
open replace random int small | 28371 | 32363 | 0.88 | |
open replace sorted int huge | 59 | 60 | 0.99 | |
sync column after parted attribute | 139202 | 199845700 | 0.00 | |
sync float large | 98447 | 97428 | 1.01 | |
sync float large after replace | 192094 | 193340 | 0.99 | |
sync int huge | 230644 | 231697 | 1.00 | |
sync int huge after replace | 6272368 | 7152017 | 0.88 | |
sync int huge after sorted replace | 7883493 | 7317134 | 1.08 | |
sync int medium | 194125 | 173236 | 1.12 | |
sync int small | 132313 | 140824 | 0.94 | |
sync int tiny | 5592 | 6402 | 0.87 | |
sync sym large | 148040 | 147264 | 1.01 | |
sync table after sort | 111869100 | 373266900 | 0.30 | |
GEOMETRIC MEAN | 29819 | 43975 | 0.68 | |
MAX RATIO | 111869100 | 373266900 | 1.63 |
ext4 is faster in the `sync` tests, but this difference is negligible compared to the much longer write times required for sorting and applying attributes.
Read¶
We divide the read tests into two categories depending on the source of the data: cold reads served from disk and hot reads served from memory (the page cache).
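The distinction can be reproduced with a simple q experiment: read a file once after dropping the page cache (cold), then read it again (hot). The drop-caches command and path below are illustrative, require root privileges, and are not the KX Nano methodology.

```q
/ Cold vs hot read of the same file (path hypothetical; dropping the page
/ cache is shown for illustration only)
system"sync; echo 3 | sudo tee /proc/sys/vm/drop_caches"
/ cold: served from the NVMe device
\t v:get`:/mnt/xfs/bench/col
/ hot: the immediate reread is served from the page cache
\t v:get`:/mnt/xfs/bench/col
```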
Single kdb+ process:¶
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read disk | mmap, random read 1M | 597 | 590 | 1.01 |
mmap, random read 4k | 20 | 19 | 1.05 | |
mmap, random read 64k | 200 | 192 | 1.05 | |
random read 1M | 616 | 546 | 1.13 | |
random read 4k | 21 | 19 | 1.10 | |
random read 64k | 207 | 184 | 1.13 | |
sequential read binary | 689 | 681 | 1.01 | |
read disk write mem | sequential read float large | 1991 | 845 | 2.36 |
sequential read int huge | 2039 | 870 | 2.34 | |
sequential read int medium | 624 | 472 | 1.32 | |
sequential read int small | 318 | 254 | 1.25 | |
sequential read int tiny | 26 | 23 | 1.14 | |
GEOMETRIC MEAN | 259 | 205 | 1.26 | |
MAX RATIO | 2039 | 870 | 2.36 |
Observation: XFS reads data sequentially from disk faster than ext4; apart from this, the differences are negligible.
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read mem | mmap, random read 1M | 2468 | 2451 | 1.01 |
mmap, random read 4k | 261 | 263 | 0.99 | |
mmap, random read 64k | 1743 | 1704 | 1.02 | |
random read 1M | 3040 | 2997 | 1.01 | |
random read 4k | 1272 | 983 | 1.29 | |
random read 64k | 3001 | 3003 | 1.00 | |
read mem write mem | sequential read binary | 2527 | 2513 | 1.01 |
sequential reread float large | 15041 | 15229 | 0.99 | |
sequential reread int huge | 33832 | 33912 | 1.00 | |
sequential reread int medium | 8185 | 8119 | 1.01 | |
sequential reread int small | 2143 | 2070 | 1.03 | |
GEOMETRIC MEAN | 3141 | 3050 | 1.03 | |
MAX RATIO | 33832 | 33912 | 1.29 |
Observation: There is no significant performance difference between XFS and ext4 with a single kdb+ reader if the data comes from the page cache.
64 kdb+ processes:¶
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read disk | mmap, random read 1M | 2825 | 2821 | 1.00 |
mmap, random read 4k | 544 | 534 | 1.02 | |
mmap, random read 64k | 1075 | 1073 | 1.00 | |
random read 1M | 2793 | 2786 | 1.00 | |
random read 4k | 547 | 544 | 1.01 | |
random read 64k | 1072 | 1069 | 1.00 | |
sequential read binary | 99058 | 5114 | 19.37 | |
read disk write mem | sequential read float large | 1947 | 2825 | 0.69 |
sequential read int huge | 3123 | 3250 | 0.96 | |
sequential read int medium | 2043 | 5358 | 0.38 | |
sequential read int small | 1537 | 6036 | 0.25 | |
sequential read int tiny | 421 | 1847 | 0.23 | |
GEOMETRIC MEAN | 1896 | 2100 | 0.90 | |
MAX RATIO | 99058 | 6036 | 19.37 |
Observation: Despite XFS's edge with a single reader, ext4 outperforms XFS in sequential reads when multiple kdb+ processes read different data in parallel. This scenario is common in a pool of HDBs, where multiple concurrent queries with non-selective filters result in numerous parallel sequential reads from disk.
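For example, a query of the following shape (table and column names hypothetical) filters out almost nothing and therefore scans whole columns sequentially from disk on each HDB worker:

```q
/ Non-selective filter: effectively every row of the partition's columns is read
select vwap:size wavg price by sym from trade where date=2025.09.01, size>0
```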
For random reads that require accessing the storage device directly (a cache miss), we observed no meaningful performance difference between ext4 and XFS.
Test Type | Test | XFS (MB/s) | ext4 (MB/s) | Ratio
---|---|---|---|---
read mem | mmap, random read 1M | 24627 | 39646 | 0.62 |
mmap, random read 4k | 5617 | 5525 | 1.02 | |
mmap, random read 64k | 22215 | 23249 | 0.96 | |
random read 1M | 151307 | 132294 | 1.14 | |
random read 4k | 98365 | 64205 | 1.53 | |
random read 64k | 161306 | 158559 | 1.02 | |
read mem write mem | sequential read binary | 27536 | 28720 | 0.96 |
sequential reread float large | 1135453 | 1265438 | 0.90 | |
sequential reread int huge | 1459501 | 1518556 | 0.96 | |
sequential reread int medium | 568919 | 637707 | 0.89 | |
sequential reread int small | 120474 | 120112 | 1.00 | |
GEOMETRIC MEAN | 107897 | 110161 | 0.98 | |
MAX RATIO | 1459501 | 1518556 | 1.53 |
Observation: There is no clear winner in read performance if the data is coming from page cache and there are multiple readers.
Test 2: Ubuntu with Samsung NVMe SSD (PCIe 5.0)¶
Test setup¶
Component | Specification |
---|---|
Storage | * Type: 3.84 TB SAMSUNG MZWLO3T8HCLS-00A07 * Interface: PCIe 5.0 x4 * Sequential R/W: 14000 MB/s / 6000 MB/s * Random Read: 2500K IOPS (4K) |
CPU | AMD EPYC 9575F (Turin), 2 sockets, 64 cores per socket, 2 threads per core, 256 MB L3 cache, SMT off |
Memory | 2.2 TB, DDR5@6400 MT/s (12 channels per socket) |
OS | Ubuntu 24.04.3 LTS (kernel: 6.8.0-83-generic) |
Since compression is enabled by default in ZFS, we disabled it during pool creation (`-O compression=off`) to ensure a fair comparison with the other file systems.
The values presented in the result tables are ratios of XFS throughput to the throughput of the given file system (e.g., a value of 2 indicates XFS was twice as fast).
Write¶
Single kdb+ process:¶
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read mem write disk | add attribute | 1.1 | 1.1 | 1.0 | 1.0 |
read write disk | disk sort | 1.1 | 1.1 | 1.1 | 1.1 |
write disk | open append mid float, sync once | 1.7 | 1.6 | 1.8 | 1.6 |
open append mid sym, sync once | 1.2 | 1.2 | 1.2 | 1.0 | |
write float large | 2.8 | 1.9 | 2.7 | 0.9 | |
write int huge | 2.9 | 1.9 | 2.6 | 2.6 | |
write int medium | 2.4 | 1.8 | 2.7 | 1.0 | |
write int small | 1.3 | 4.4 | 1.1 | 1.4 | |
write int tiny | 1.2 | 0.7 | 0.8 | 1.2 | |
write sym large | 1.2 | 1.1 | 1.1 | 1.0 | |
GEOMETRIC MEAN | 1.6 | 1.5 | 1.5 | 1.2 | |
MAX RATIO | 2.9 | 4.4 | 2.7 | 2.6 |
Observation: XFS outperforms all other file systems if a single kdb+ process writes the data.
The performance of the less critical write operations is below.
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
write disk | append small, sync once | 3.1 | 2.0 | 2.2 | 1.3 |
append tiny, sync once | 2.0 | 1.4 | 1.3 | 1.2 | |
open append small, sync once | 1.5 | 1.5 | 1.6 | 0.9 | |
open append tiny, sync once | 1.2 | 1.0 | 1.1 | 2.4 | |
open replace int tiny | 1.2 | 1.2 | 1.0 | 1.0 | |
open replace random float large | 19.1 | 24.7 | 17.6 | 51.6 | |
open replace random int huge | 30.0 | 40.3 | 28.5 | 99.2 | |
open replace random int medium | 0.9 | 1.4 | 0.9 | 0.8 | |
open replace random int small | 0.9 | 1.4 | 0.9 | 0.7 | |
open replace sorted int huge | 14.0 | 48.5 | 12.5 | 9.7 | |
sync float large | 1.1 | 1.4 | 1.0 | 1.2 | |
sync float large after replace | 0.9 | 1.5 | 1.2 | 6.4 | |
sync int huge | 1.3 | 1.2 | 1.0 | 0.3 | |
sync int huge after replace | 0.2 | 1.5 | 0.1 | 4.7 | |
sync int huge after sorted replace | 0.2 | 1.8 | 0.1 | 5.2 | |
sync int medium | 1.2 | 1.3 | 1.7 | 0.9 | |
sync int small | 0.8 | 1.1 | 1.2 | 1.1 | |
sync int tiny | 1.1 | 1.4 | 1.0 | 0.9 | |
sync sym large | 1.8 | 1.5 | 1.4 | 1.4 | |
sync table after sort | 1.1 | 0.7 | 3.7 | 8.5 | |
GEOMETRIC MEAN | 1.6 | 2.2 | 1.5 | 2.5 | |
MAX RATIO | 30.0 | 48.5 | 28.5 | 99.2 |
Observation: XFS significantly outperformed all other file systems when only a random part of a vector needs to be overwritten (see the `open replace` tests).
64 kdb+ processes:¶
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read mem write disk | add attribute | 3.0 | 2.9 | 3.2 | 3.3 |
read write disk | disk sort | 2.3 | 3.6 | 2.4 | 2.2 |
write disk | open append mid float, sync once | 1.1 | 0.8 | 2.7 | 1.7 |
open append mid sym, sync once | 1.2 | 0.9 | 2.2 | 1.6 | |
write float large | 3.1 | 2.9 | 48.4 | 69.2 | |
write int huge | 1.1 | 1.7 | 4.6 | 3.5 | |
write int medium | 3.0 | 2.7 | 45.5 | 2.8 | |
write int small | 1.3 | 4.1 | 13.7 | 1.9 | |
write int tiny | 1.5 | 10.7 | 3.5 | 5.2 | |
write sym large | 1.2 | 1.1 | 10.6 | 14.2 | |
GEOMETRIC MEAN | 1.7 | 2.4 | 6.9 | 4.2 | |
MAX RATIO | 3.1 | 10.7 | 48.4 | 69.2 |
Observation: XFS significantly outperformed all other file systems in writing. Its margin can be substantial; for example, persisting a large float vector (the `set` operation) is over 69 times faster on XFS than on ZFS.
The performance of the less critical write operations is below.
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
write disk | append small, sync once | 1.2 | 1.1 | 3.4 | 1.3 |
append tiny, sync once | 0.7 | 1.8 | 7.1 | 1.3 | |
open append small, sync once | 1.0 | 1.0 | 3.8 | 1.6 | |
open append tiny, sync once | 1.4 | 2.0 | 19.5 | 2.6 | |
open replace int tiny | 0.9 | 47.7 | 5.6 | 2.9 | |
open replace random float large | 5.8 | 2262.0 | 196.7 | 66.0 | |
open replace random int huge | 0.0 | 0.0 | 0.0 | 0.0 | |
open replace random int medium | 0.9 | 138.8 | 7.2 | 1.2 | |
open replace random int small | 0.8 | 182.6 | 38.1 | 1.0 | |
open replace sorted int huge | 0.0 | 1.0 | 0.1 | 0.0 | |
sync float large | 1.0 | 1.0 | 0.5 | 0.5 | |
sync float large after replace | 0.5 | 1.5 | 0.1 | 1.9 | |
sync int huge | 1.0 | 0.0 | 0.6 | 0.0 | |
sync int huge after replace | 29.7 | 11.8 | 5.5 | 86.3 | |
sync int huge after sorted replace | 0.1 | 0.1 | 0.0 | 0.2 | |
sync int medium | 1.1 | 1.7 | 6.3 | 3.1 | |
sync int small | 0.9 | 1.7 | 2.5 | 2.8 | |
sync int tiny | 0.8 | 1.0 | 1.3 | 2.5 | |
sync sym large | 1.0 | 1.1 | 0.5 | 1.0 | |
sync table after sort | 1.2 | 0.9 | 1.1 | 131.5 | |
GEOMETRIC MEAN | 0.5 | 2.3 | 1.4 | 1.1 | |
MAX RATIO | 29.7 | 2262.0 | 196.7 | 131.5 |
Read¶
Single kdb+ process:¶
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read disk | mmap, random read 1M | 1.1 | 4.4 | 1.2 | 4.3 |
mmap, random read 4k | 1.0 | 1.3 | 1.1 | 1.7 | |
mmap, random read 64k | 1.0 | 6.5 | 1.0 | 2.0 | |
random read 1M | 1.1 | 4.3 | 1.2 | 4.3 | |
random read 4k | 1.0 | 1.2 | 1.1 | 1.8 | |
random read 64k | 1.0 | 6.6 | 1.0 | 2.1 | |
sequential read binary | 2.1 | 4.5 | 1.3 | 0.8 | |
read disk write mem | sequential read float large | 1.2 | 0.8 | 1.4 | 3.0 |
sequential read int huge | 1.3 | 1.0 | 1.4 | 3.0 | |
sequential read int medium | 7.3 | 4.9 | 10.1 | 14.9 | |
sequential read int small | 3.0 | 2.9 | 2.0 | 6.9 | |
sequential read int tiny | 1.7 | 1.9 | 1.2 | 3.7 | |
GEOMETRIC MEAN | 1.5 | 2.6 | 1.5 | 3.1 | |
MAX RATIO | 7.3 | 6.6 | 10.1 | 14.9 |
Observation: XFS excels in reading from disk if there is a single kdb+ reader.
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read mem | mmap, random read 1M | 1.1 | 1.1 | 1.2 | 1.1 |
mmap, random read 4k | 0.9 | 1.0 | 1.0 | 0.8 | |
mmap, random read 64k | 1.1 | 1.0 | 1.1 | 1.0 | |
random read 1M | 1.3 | 1.4 | 1.2 | 1.5 | |
random read 4k | 1.0 | 1.0 | 1.1 | 0.7 | |
random read 64k | 1.1 | 1.0 | 1.2 | 1.0 | |
read mem write mem | sequential read binary | 1.0 | 1.0 | 1.0 | 1.0 |
sequential reread float large | 1.7 | 2.4 | 2.5 | 2.2 | |
sequential reread int huge | 1.9 | 2.1 | 2.3 | 1.9 | |
sequential reread int medium | 3.6 | 3.4 | 3.7 | 2.6 | |
sequential reread int small | 0.9 | 1.0 | 0.9 | 0.9 | |
GEOMETRIC MEAN | 1.3 | 1.4 | 1.4 | 1.2 | |
MAX RATIO | 3.6 | 3.4 | 3.7 | 2.6 |
Observation: XFS excels in (sequential) reading from page cache if there is a single kdb+ reader.
64 kdb+ processes:¶
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read disk | mmap, random read 1M | 1.0 | 2.2 | 0.9 | 1.2 |
mmap, random read 4k | 1.0 | 1.1 | 0.9 | 1.6 | |
mmap, random read 64k | 0.8 | 1.5 | 0.7 | 0.9 | |
random read 1M | 1.0 | 2.2 | 0.9 | 1.2 | |
random read 4k | 1.0 | 1.0 | 0.9 | 1.6 | |
random read 64k | 0.8 | 1.5 | 0.8 | 0.9 | |
sequential read binary | 8.9 | 8.2 | 8.9 | 10.5 | |
read disk write mem | sequential read float large | 0.7 | 0.6 | 0.7 | 0.8 |
sequential read int huge | 0.9 | 0.9 | 0.9 | 1.1 | |
sequential read int medium | 1.0 | 0.6 | 1.0 | 1.5 | |
sequential read int small | 1.1 | 0.7 | 1.1 | 1.6 | |
sequential read int tiny | 1.0 | 1.6 | 1.7 | 2.8 | |
GEOMETRIC MEAN | 1.1 | 1.3 | 1.1 | 1.5 | |
MAX RATIO | 8.9 | 8.2 | 8.9 | 10.5 |
Observation: F2FS maintains a performance advantage in parallel disk reads from multiple kdb+ processes (e.g., an HDB pool). The sole exception was binary reads (`read1`), a pattern not typically encountered in production kdb+ environments.
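The distinction between the binary read test and the other read tests is essentially `read1` versus `get`; a minimal illustration (path hypothetical):

```q
/ read1 returns the raw bytes of a file with no kdb+ interpretation;
/ get reads (or maps) the file as a typed kdb+ vector
b:read1`:/mnt/f2fs/bench/col
v:get`:/mnt/f2fs/bench/col
```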
Test Type | Test | ext4 | Btrfs | F2FS | ZFS
---|---|---|---|---|---
read mem | mmap, random read 1M | 1.1 | 1.1 | 1.1 | 1.1 |
mmap, random read 4k | 1.0 | 1.0 | 1.0 | 1.8 | |
mmap, random read 64k | 1.6 | 1.1 | 1.1 | 2.2 | |
random read 1M | 1.1 | 1.0 | 1.1 | 1.0 | |
random read 4k | 1.0 | 1.0 | 1.0 | 0.9 | |
random read 64k | 1.0 | 1.0 | 1.1 | 1.0 | |
read mem write mem | sequential read binary | 1.0 | 1.0 | 1.0 | 1.5 |
sequential reread float large | 1.9 | 1.9 | 57.7 | 4.8 | |
sequential reread int huge | 1.6 | 1.7 | 13.5 | 1.9 | |
sequential reread int medium | 2.0 | 2.1 | 2.0 | 7.4 | |
sequential reread int small | 1.0 | 1.0 | 1.0 | 1.7 | |
GEOMETRIC MEAN | 1.3 | 1.2 | 2.0 | 1.8 | |
MAX RATIO | 2.0 | 2.1 | 57.7 | 7.4 |
Observation: The performance advantage of F2FS vanishes entirely when data is served from the page cache. XFS is the clear winner if data is read sequentially by multiple kdb+ processes.