Database Creation and Management¶

This notebook provides a walkthrough of some of the functionality available for users looking to create and maintain large databases using PyKX.

This notebook refers to creating and maintaining large partitioned kdb+ databases using PyKX. Go to Q for Mortals for more in-depth information about partitioned databases in kdb+.

You can download this walkthrough as a .ipynb notebook file.

This walkthrough provides examples of the following tasks:

Creating a database from a historical dataset
Adding a new partition to the database
Managing the on-disk database by:
- Renaming a table and column
- Creating a copy of a column to the database
- Applying a Python function to a column of the database
- Updating the data type of a column
Adding a new table to the most recent partition of the database

For full information on the functions available, go to the API section.

Initial setup¶

Import all required libraries and create a temporary directory which will be used to store the database we create for this walkthrough.

In [2]:

Copied!





import os
os.environ['PYKX_BETA_FEATURES'] = 'true'

import pykx as kx
from datetime import date
import tempfile
import os
os.environ['PYKX_BETA_FEATURES'] = 'true'

import pykx as kx
from datetime import date
import tempfile

In [3]:

Copied!

tempdir = tempfile.TemporaryDirectory()
tempdir = tempfile.TemporaryDirectory()

Database interactions are facilitated through use of the pykx.DB class. All methods/attributes used in this notebook are contained within this class.

Initialise the DB class to start. The expected input is the file path where you intend to save the partitioned database and its associated tables. In this case we're going to use the temporary directory we just created.

In [4]:

Copied!

db = kx.DB(path = tempdir.name + '/db')
db = kx.DB(path = tempdir.name + '/db')

For details on any methods contained within this class, use the help method.

In [5]:

Copied!

help(db.create)
help(db.create)

Help on method create in module pykx.db:

create(table: pykx.wrappers.Table, table_name: str, partition: Union[int, str, pykx.wrappers.DateAtom], *, by_field: Optional[str] = None, sym_enum: Optional[str] = None, log: Optional[bool] = True, compress: Optional[pykx.compress_encrypt.Compress] = None, encrypt: Optional[pykx.compress_encrypt.Encrypt] = None, change_dir: Optional[bool] = True, load_scripts: Optional[bool] = True) -> None method of pykx.db.DB instance
    Create an on-disk partitioned table within a kdb+ database from a supplied
        `#!python pykx.Table` object. Once generated this table will be accessible
        as an attribute of the `#!python DB` class or a sub attribute of `#!python DB.table`.
    
    Parameters:
        table: The `#!python pykx.Table` object which is to be persisted to disk
        table_name: The name with which the table will be persisted and accessible
            once loaded and available as a `#!python pykx.PartitionedTable`
        partition: The name of the column which is to be used to partition the data if
            supplied as a `#!python str` or if supplied as non string object this is
            used as the partition to which all data is persisted.
        by_field: A field of the table to be used as a by column, this column will be
            the second column in the table (the first being the virtual column determined
            by the partitioning column)
        sym_enum: The name of the symbol enumeration table to be associated with the table
        log: Print information about status while persisting the partitioned database
        compress: `#!python pykx.Compress` initialized class denoting the compression settings
            to be used when persisting a partition/partitions
        encrypt: `#!python pykx.Encrypt` initialized class denoting the encryption setting
            to be used when persisting a partition/partitions
    
    
    Returns:
        A `#!python None` object on successful invocation, the database class is
            updated to contain attributes associated with the available created table
    
    Examples:
    
    Generate a partitioned database from a table containing multiple partitions.
    
    ```python
    >>> import pykx as kx
    >>> db = kx.DB(path = '/tmp/newDB')
    >>> N = 1000
    >>> qtab = kx.Table(data = {
    ...     'date': kx.q.asc(kx.random.random(N, kx.q('2020.01 2020.02 2020.03m'))),
    ...     'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    ...     'price': kx.random.random(N, 10.0),
    ...     'size': kx.random.random(N, 100)
    ... })
    >>> db.create(qtab, 'stocks', 'date', by_field = 'sym', sym_enum = 'symbols')
    >>> db.tables
    ['stocks']
    >>> db.stocks
    pykx.PartitionedTable(pykx.q('
    month   sym  price     size
    ---------------------------
    2020.01 AAPL 7.979004  85
    2020.01 AAPL 5.931866  55
    2020.01 AAPL 5.255477  49
    2020.01 AAPL 8.15255   74
    2020.01 AAPL 4.771067  80
    ..
    '))
    ```
    
    Add a table as a partition to an on-disk database.
    
    ```python
    >>> import pykx as kx
    >>> db = kx.DB(path = '/tmp/newDB')
    >>> N = 333
    >>> qtab = kx.Table(data = {
    ...     'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    ...     'price': kx.random.random(N, 10.0),
    ...     'size': kx.random.random(N, 100)
    ... })
    >>> db.create(qtab, 'stocks', kx.q('2020.04m'), by_field = 'sym', sym_enum = 'symbols')
    >>> db.tables
    ['stocks']
    >>> db.stocks
    pykx.PartitionedTable(pykx.q('
    month   sym  price     size
    ---------------------------
    2020.01 AAPL 7.979004  85
    2020.01 AAPL 5.931866  55
    2020.01 AAPL 5.255477  49
    2020.01 AAPL 8.15255   74
    2020.01 AAPL 4.771067  80
    ..
    '))
    ```
    
    Add a table as a partition to an on-disk database and apply gzip
        compression to the persisted table
    
    ```python
    >>> import pykx as kx
    >>> db = kx.DB(path = '/tmp/newDB')
    >>> N = 333
    >>> qtab = kx.Table(data = {
    ...     'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    ...     'price': kx.random.random(N, 10.0),
    ...     'size': kx.random.random(N, 100)
    ... })
    >>> compress = kx.Compress(kx.CompressionAlgorithm.gzip, level=2)
    >>> db.create(qtab, 'stocks', kx.q('2020.04m'), compress=compress)
    >>> kx.q('{-21!hsym x}', '/tmp/newDB/2020.04/stocks/price')
    pykx.Dictionary(pykx.q('
    compressedLength  | 2064
    uncompressedLength| 2680
    algorithm         | 2i
    logicalBlockSize  | 17i
    zipLevel          | 2i
    '))
    ```

Create the sample dataset¶

Create a dataset called trades containing time-series data spanning multiple dates, and columns of various types:

In [6]:

Copied!





N = 1000000
trades = kx.Table(data={
     'date': kx.random.random(N, [date(2020, 1, 1), date(2020, 1, 2)]),
     'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
     'price': kx.random.random(N, 10.0),
     'size': kx.random.random(N, 1000)
})
N = 1000000
trades = kx.Table(data={
     'date': kx.random.random(N, [date(2020, 1, 1), date(2020, 1, 2)]),
     'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
     'price': kx.random.random(N, 10.0),
     'size': kx.random.random(N, 1000)
})

Create the database¶

Create the database using the date column as the partition, and add trades as a table called trade_data within it.

In [7]:

Copied!

db.create(trades, 'trade_data', 'date')
db.create(trades, 'trade_data', 'date')

Writing Database Partition 2020.01.01 to table trade_data
Writing Database Partition 2020.01.02 to table trade_data

This now exists as a table and is saved to disk.

In [8]:

Copied!

db.tables
db.tables

Out[8]:

['trade_data']

When a table is saved, an attribute is added to the db class for it. For our newly generated table, this is db.trade_data.

In [9]:

Copied!

db.trade_data
db.trade_data

Out[9]:

	date	sym	price	size

0	2020.01.01	MSFT	7.079266	800
1	2020.01.01	AAPL	1.824321	65
2	2020.01.01	MSFT	2.408259	292
3	2020.01.01	GOOG	1.675438	7
4	2020.01.01	AAPL	8.311168	183
5	2020.01.01	AAPL	2.208693	989
6	2020.01.01	MSFT	6.068126	567
7	2020.01.01	AAPL	4.918926	794
8	2020.01.01	AAPL	9.331869	39
9	2020.01.01	AAPL	1.142611	507
10	2020.01.01	AAPL	2.685874	581
11	2020.01.01	AAPL	3.483591	163
12	2020.01.01	AAPL	0.4422525	466
13	2020.01.01	MSFT	7.406654	976
14	2020.01.01	MSFT	2.493871	171
15	2020.01.01	AAPL	9.242088	28
16	2020.01.01	MSFT	0.3954522	747
17	2020.01.01	MSFT	0.3441191	512
18	2020.01.01	GOOG	9.662762	998
19	2020.01.01	AAPL	9.601674	812
20	2020.01.01	AAPL	4.969858	910
21	2020.01.01	GOOG	1.048204	830
22	2020.01.01	GOOG	0.9817644	595
23	2020.01.01	AAPL	4.925185	976
...	...	...	...	...
999999	2020.01.02	GOOG	1.470716	636

1,000,000 rows × 4 columns

Add a new partition to the database¶

Once a table has been generated, you can add more partitions to the database through reuse of the create method. In this case we are adding the new partition 2020.01.03 to the database.

In [10]:

Copied!





N = 10000
new_day = kx.Table(data={
    'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    'price': kx.random.random(N, 10.0),
    'size': kx.random.random(N, 100)
})
db.create(new_day, 'trade_data', date(2020, 1, 3))
N = 10000
new_day = kx.Table(data={
    'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    'price': kx.random.random(N, 10.0),
    'size': kx.random.random(N, 100)
})
db.create(new_day, 'trade_data', date(2020, 1, 3))

Writing Database Partition 2020-01-03 to table trade_data

Manage the database¶

This section covers updating the contents of a database. The examples below demonstrate a number of common tasks that would be completed regularly when updating a database.

The name of a table can be updated using the rename_table method. Below, we are updating the table trade_data to be called trade.

In [11]:

Copied!

db.rename_table('trade_data', 'trades')
db.rename_table('trade_data', 'trades')

2025.01.29 11:10:12 renaming :/tmp/tmp8o8bs72d/db/2020.01.01/trade_data to :/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 renaming :/tmp/tmp8o8bs72d/db/2020.01.02/trade_data to :/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 renaming :/tmp/tmp8o8bs72d/db/2020.01.03/trade_data to :/tmp/tmp8o8bs72d/db/2020.01.03/trades

During the rename process, the attribute in the db class is also updated.

In [12]:

Copied!

db.trades
db.trades

Out[12]:

	date	sym	price	size

0	2020.01.01	MSFT	7.079266	800
1	2020.01.01	AAPL	1.824321	65
2	2020.01.01	MSFT	2.408259	292
3	2020.01.01	GOOG	1.675438	7
4	2020.01.01	AAPL	8.311168	183
5	2020.01.01	AAPL	2.208693	989
6	2020.01.01	MSFT	6.068126	567
7	2020.01.01	AAPL	4.918926	794
8	2020.01.01	AAPL	9.331869	39
9	2020.01.01	AAPL	1.142611	507
10	2020.01.01	AAPL	2.685874	581
11	2020.01.01	AAPL	3.483591	163
12	2020.01.01	AAPL	0.4422525	466
13	2020.01.01	MSFT	7.406654	976
14	2020.01.01	MSFT	2.493871	171
15	2020.01.01	AAPL	9.242088	28
16	2020.01.01	MSFT	0.3954522	747
17	2020.01.01	MSFT	0.3441191	512
18	2020.01.01	GOOG	9.662762	998
19	2020.01.01	AAPL	9.601674	812
20	2020.01.01	AAPL	4.969858	910
21	2020.01.01	GOOG	1.048204	830
22	2020.01.01	GOOG	0.9817644	595
23	2020.01.01	AAPL	4.925185	976
...	...	...	...	...
1009999	2020.01.03	AAPL	9.750387	99

1,010,000 rows × 4 columns

To rename a column in a table, use the rename_column method. For example, let's rename the sym column (in the trade table) to ticker.

In [13]:

Copied!

db.rename_column('trades', 'sym', 'ticker')
db.rename_column('trades', 'sym', 'ticker')

2025.01.29 11:10:12 renaming sym to ticker in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 renaming sym to ticker in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 renaming sym to ticker in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

In [14]:

Copied!

db.trades
db.trades

Out[14]:

	date	ticker	price	size

0	2020.01.01	MSFT	7.079266	800
1	2020.01.01	AAPL	1.824321	65
2	2020.01.01	MSFT	2.408259	292
3	2020.01.01	GOOG	1.675438	7
4	2020.01.01	AAPL	8.311168	183
5	2020.01.01	AAPL	2.208693	989
6	2020.01.01	MSFT	6.068126	567
7	2020.01.01	AAPL	4.918926	794
8	2020.01.01	AAPL	9.331869	39
9	2020.01.01	AAPL	1.142611	507
10	2020.01.01	AAPL	2.685874	581
11	2020.01.01	AAPL	3.483591	163
12	2020.01.01	AAPL	0.4422525	466
13	2020.01.01	MSFT	7.406654	976
14	2020.01.01	MSFT	2.493871	171
15	2020.01.01	AAPL	9.242088	28
16	2020.01.01	MSFT	0.3954522	747
17	2020.01.01	MSFT	0.3441191	512
18	2020.01.01	GOOG	9.662762	998
19	2020.01.01	AAPL	9.601674	812
20	2020.01.01	AAPL	4.969858	910
21	2020.01.01	GOOG	1.048204	830
22	2020.01.01	GOOG	0.9817644	595
23	2020.01.01	AAPL	4.925185	976
...	...	...	...	...
1009999	2020.01.03	AAPL	9.750387	99

1,010,000 rows × 4 columns

To safely apply a function to modify the price column within the database, first create a copy of the column.

In [15]:

Copied!

db.copy_column('trades', 'price', 'price_copy')
db.copy_column('trades', 'price', 'price_copy')

2025.01.29 11:10:12 copying price to price_copy in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 copying price to price_copy in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 copying price to price_copy in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

In [16]:

Copied!

db.trades
db.trades

Out[16]:

	date	ticker	price	size	price_copy

0	2020.01.01	MSFT	7.079266	800	7.079266
1	2020.01.01	AAPL	1.824321	65	1.824321
2	2020.01.01	MSFT	2.408259	292	2.408259
3	2020.01.01	GOOG	1.675438	7	1.675438
4	2020.01.01	AAPL	8.311168	183	8.311168
5	2020.01.01	AAPL	2.208693	989	2.208693
6	2020.01.01	MSFT	6.068126	567	6.068126
7	2020.01.01	AAPL	4.918926	794	4.918926
8	2020.01.01	AAPL	9.331869	39	9.331869
9	2020.01.01	AAPL	1.142611	507	1.142611
10	2020.01.01	AAPL	2.685874	581	2.685874
11	2020.01.01	AAPL	3.483591	163	3.483591
12	2020.01.01	AAPL	0.4422525	466	0.4422525
13	2020.01.01	MSFT	7.406654	976	7.406654
14	2020.01.01	MSFT	2.493871	171	2.493871
15	2020.01.01	AAPL	9.242088	28	9.242088
16	2020.01.01	MSFT	0.3954522	747	0.3954522
17	2020.01.01	MSFT	0.3441191	512	0.3441191
18	2020.01.01	GOOG	9.662762	998	9.662762
19	2020.01.01	AAPL	9.601674	812	9.601674
20	2020.01.01	AAPL	4.969858	910	4.969858
21	2020.01.01	GOOG	1.048204	830	1.048204
22	2020.01.01	GOOG	0.9817644	595	0.9817644
23	2020.01.01	AAPL	4.925185	976	4.925185
...	...	...	...	...	...
1009999	2020.01.03	AAPL	9.750387	99	9.750387

1,010,000 rows × 5 columns

You can now apply a function to the copied column without the risk of losing the original data. Below, let's modify the copied column by multiplying the contents by 2.

In [17]:

Copied!

db.apply_function('trades', 'price_copy', kx.q('{2*x}'))
db.apply_function('trades', 'price_copy', kx.q('{2*x}'))

2025.01.29 11:10:12 resaving column price_copy (type 9) in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades

2025.01.29 11:10:12 resaving column price_copy (type 9) in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 resaving column price_copy (type 9) in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

In [18]:

Copied!

db.trades
db.trades

Out[18]:

	date	ticker	price	size	price_copy

0	2020.01.01	MSFT	7.079266	800	14.15853
1	2020.01.01	AAPL	1.824321	65	3.648642
2	2020.01.01	MSFT	2.408259	292	4.816519
3	2020.01.01	GOOG	1.675438	7	3.350875
4	2020.01.01	AAPL	8.311168	183	16.62234
5	2020.01.01	AAPL	2.208693	989	4.417385
6	2020.01.01	MSFT	6.068126	567	12.13625
7	2020.01.01	AAPL	4.918926	794	9.837851
8	2020.01.01	AAPL	9.331869	39	18.66374
9	2020.01.01	AAPL	1.142611	507	2.285222
10	2020.01.01	AAPL	2.685874	581	5.371748
11	2020.01.01	AAPL	3.483591	163	6.967183
12	2020.01.01	AAPL	0.4422525	466	0.8845049
13	2020.01.01	MSFT	7.406654	976	14.81331
14	2020.01.01	MSFT	2.493871	171	4.987742
15	2020.01.01	AAPL	9.242088	28	18.48418
16	2020.01.01	MSFT	0.3954522	747	0.7909045
17	2020.01.01	MSFT	0.3441191	512	0.6882382
18	2020.01.01	GOOG	9.662762	998	19.32552
19	2020.01.01	AAPL	9.601674	812	19.20335
20	2020.01.01	AAPL	4.969858	910	9.939716
21	2020.01.01	GOOG	1.048204	830	2.096408
22	2020.01.01	GOOG	0.9817644	595	1.963529
23	2020.01.01	AAPL	4.925185	976	9.850371
...	...	...	...	...	...
1009999	2020.01.03	AAPL	9.750387	99	19.50077

1,010,000 rows × 5 columns

Once you are happy with the new values within the price_copy column, you can safely delete the price column, then rename the price_copy column to be called price.

In [19]:

Copied!

db.delete_column('trades', 'price')
db.rename_column('trades', 'price_copy', 'price')
db.delete_column('trades', 'price')
db.rename_column('trades', 'price_copy', 'price')

2025.01.29 11:10:12 deleting column price from `:/tmp/tmp8o8bs72d/db/2020.01.01/trades

2025.01.29 11:10:12 deleting column price from `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 deleting column price from `:/tmp/tmp8o8bs72d/db/2020.01.03/trades
2025.01.29 11:10:12 renaming price_copy to price in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 renaming price_copy to price in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 renaming price_copy to price in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

In [20]:

Copied!

db.trades
db.trades

Out[20]:

	date	ticker	size	price

0	2020.01.01	MSFT	800	14.15853
1	2020.01.01	AAPL	65	3.648642
2	2020.01.01	MSFT	292	4.816519
3	2020.01.01	GOOG	7	3.350875
4	2020.01.01	AAPL	183	16.62234
5	2020.01.01	AAPL	989	4.417385
6	2020.01.01	MSFT	567	12.13625
7	2020.01.01	AAPL	794	9.837851
8	2020.01.01	AAPL	39	18.66374
9	2020.01.01	AAPL	507	2.285222
10	2020.01.01	AAPL	581	5.371748
11	2020.01.01	AAPL	163	6.967183
12	2020.01.01	AAPL	466	0.8845049
13	2020.01.01	MSFT	976	14.81331
14	2020.01.01	MSFT	171	4.987742
15	2020.01.01	AAPL	28	18.48418
16	2020.01.01	MSFT	747	0.7909045
17	2020.01.01	MSFT	512	0.6882382
18	2020.01.01	GOOG	998	19.32552
19	2020.01.01	AAPL	812	19.20335
20	2020.01.01	AAPL	910	9.939716
21	2020.01.01	GOOG	830	2.096408
22	2020.01.01	GOOG	595	1.963529
23	2020.01.01	AAPL	976	9.850371
...	...	...	...	...
1009999	2020.01.03	AAPL	99	19.50077

1,010,000 rows × 4 columns

To convert the data type of a column, use the set_column_type method. Before we do that, let's look at the metadata information for the table using the meta method:

In [21]:

Copied!

kx.q.meta(db.trades)
kx.q.meta(db.trades)

Out[21]:

	t	f	a
c
date	"d"
ticker	"s"
size	"j"
price	"f"

Currently the size column is the type LongAtom. Let's update this to be a type ShortAtom:

In [22]:

Copied!

db.set_column_type('trades', 'size', kx.ShortAtom)
db.set_column_type('trades', 'size', kx.ShortAtom)

2025.01.29 11:10:12 resaving column size (type 5) in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 resaving column size (type 5) in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 resaving column size (type 5) in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

Now let's apply the grouped attribute to the size column. For more information on attributes in kdb+, refer to the Q for Mortals Attributes section.

In [23]:

Copied!

db.set_column_attribute('trades', 'ticker', 'grouped')
db.set_column_attribute('trades', 'ticker', 'grouped')

2025.01.29 11:10:12 resaving column ticker (type 20) in `:/tmp/tmp8o8bs72d/db/2020.01.01/trades
2025.01.29 11:10:12 resaving column ticker (type 20) in `:/tmp/tmp8o8bs72d/db/2020.01.02/trades
2025.01.29 11:10:12 resaving column ticker (type 20) in `:/tmp/tmp8o8bs72d/db/2020.01.03/trades

Let's revisit the metadata of the table to ensure they have been applied correctly.

In [24]:

Copied!

kx.q.meta(db.trades)
kx.q.meta(db.trades)

Out[24]:

	t	a
c
date	"d"
ticker	"s"	g
size	"h"
price	"f"

Onboard your next table¶

Now that you have successfully set up one table, you may want to add a second table. We follow the same method as before and create the quotes table using the create method. In this example, the quotes table only contains data for 2020.01.03:

In [25]:

Copied!





quotes = kx.Table(data={
    'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    'open': kx.random.random(N, 10.0),
    'high': kx.random.random(N, 10.0),
    'low': kx.random.random(N, 10.0),
    'close': kx.random.random(N, 10.0)
})
quotes = kx.Table(data={
    'sym': kx.random.random(N, ['AAPL', 'GOOG', 'MSFT']),
    'open': kx.random.random(N, 10.0),
    'high': kx.random.random(N, 10.0),
    'low': kx.random.random(N, 10.0),
    'close': kx.random.random(N, 10.0)
})

In [26]:

Copied!

db.create(quotes, 'quotes', date(2020, 1, 3), by_field = 'sym')
db.create(quotes, 'quotes', date(2020, 1, 3), by_field = 'sym')

Writing Database Partition 2020-01-03 to table quotes

All tables within a database must contain the same partition structure. To ensure you can access the new table, the quotes table needs to exist in every partition within the database, even if there is no data for that partition. This is called backfilling data. For the partitions where the quotes table is missing, we use the fill_database method:

In [27]:

Copied!

db.fill_database()
db.fill_database()

Successfully filled missing tables to partition: :/tmp/tmp8o8bs72d/db/2020.01.02
Successfully filled missing tables to partition: :/tmp/tmp8o8bs72d/db/2020.01.01

Now that the database has resolved the missing tables within the partitions, we can view the new quotes table:

In [28]:

Copied!

db.quotes
db.quotes

Out[28]:

	date	sym	open	high	low	close

0	2020.01.03	AAPL	8.204026	0.9115201	3.916864	9.813545
1	2020.01.03	AAPL	8.092754	6.019578	0.08513137	2.825277
2	2020.01.03	AAPL	1.425043	8.881719	4.285461	7.820761
3	2020.01.03	AAPL	7.172736	3.33985	5.999403	3.010211
4	2020.01.03	AAPL	2.974185	1.559372	2.76356	5.182052
5	2020.01.03	AAPL	3.200759	7.485088	7.928813	6.437041
6	2020.01.03	AAPL	7.749599	5.559444	0.3300404	9.424896
7	2020.01.03	AAPL	4.885961	4.677432	8.288318	4.366883
8	2020.01.03	AAPL	7.412891	5.082189	9.214036	7.900838
9	2020.01.03	AAPL	6.625847	9.792139	6.208818	9.195079
10	2020.01.03	AAPL	2.075797	5.340321	0.4038709	0.7533655
11	2020.01.03	AAPL	4.797642	8.373317	4.98156	6.299731
12	2020.01.03	AAPL	0.8688765	1.967616	3.349573	4.094004
13	2020.01.03	AAPL	2.684143	0.05767352	8.878174	2.166685
14	2020.01.03	AAPL	3.181093	4.686113	0.8967613	7.39341
15	2020.01.03	AAPL	3.630268	0.4563809	2.89025	6.428857
16	2020.01.03	AAPL	7.342469	9.298404	7.098509	1.698009
17	2020.01.03	AAPL	1.293144	8.125834	7.214184	5.946857
18	2020.01.03	AAPL	8.051322	1.446192	9.436185	4.824975
19	2020.01.03	AAPL	1.018781	1.299401	1.18181	0.6091787
20	2020.01.03	AAPL	4.002909	4.115772	5.036211	1.680549
21	2020.01.03	AAPL	0.9864104	4.75085	0.5140735	2.468647
22	2020.01.03	AAPL	8.388561	6.170405	1.067153	2.034476
23	2020.01.03	AAPL	9.258428	7.146021	2.311644	4.770905
...	...	...	...	...	...	...
9999	2020.01.03	MSFT	2.832818	1.466171	3.457545	5.985203

10,000 rows × 6 columns

Finally, to view the amount of saved data, count the number of rows per partition using partition_count:

In [29]:

Copied!

db.partition_count()
db.partition_count()

Out[29]:

	quotes	trades

2020.01.01	0	500425
2020.01.02	0	499575
2020.01.03	10000	10000

Clean up temporary database created¶

In [30]:

Copied!

tempdir.cleanup()
tempdir.cleanup()