Reading and Writing files with PyKX

`pykx.read`

`QReader`

QReader(q)

Read data using q.

csv

csv(
    path,
    types=None,
    delimiter=",",
    as_table=True,
    filter_type=None,
    filter_columns=None,
    custom=None,
)

Reads a CSV file as a table or dictionary.

Column types are guessed if not provided.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the CSV file.	required
`types`	`Optional[Union[bytes, k.CharAtom, k.CharVector]]`	Can be a dictionary of columns and their types or a `str`-like object of uppercase characters representing the types. Space is used to drop a column. If `None`, the types will be guessed using csvutil.q. A breakdown of this process is illustrated in the table below.	`None`
`delimiter`	`Union[str, bytes, k.CharAtom]`	A single character representing the delimiter between values.	`','`
`as_table`	`Union[bool, k.BooleanAtom]`	`True` if the first line of the CSV file should be treated as column names, in which case a `pykx.Table` is returned. If `False` a `pykx.List` of `pykx.Vector` is returned - one for each column in the CSV file.	`True`
`filter_type`	`Union[str, k.CharVector]`	Can be `basic`, `only`, or `like`. `basic` skips any types marked with the `extended` flag in csvutil.q. `only` processes only the columns passed in `filter_columns`. `like` processes only the columns whose names match the string pattern passed in `filter_columns`.	`None`
`filter_columns`	`Union[str, list, k.CharVector, k.SymbolAtom, k.SymbolVector]`	Used in tandem with `filter_type` when `only` or `like` is passed. `only` accepts str or list of str. `like` accepts only a str pattern.	`None`
`custom`	`dict`	A dictionary used to change default values in csvutil.q.	`None`
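
The positional form of `types` can be awkward to write by hand for wide files. The following is a minimal sketch of a helper that builds such a string, with one character per column in file order and a space for every column that should be dropped. The helper name `build_types` and the column names are hypothetical and not part of PyKX.

```python
def build_types(columns, wanted):
    """Return one type character per column, in file order.

    Columns missing from `wanted` get a space, which drops them on read.
    """
    return "".join(wanted.get(name, " ") for name in columns)


# Hypothetical three-column file: keep `sym` as Symbol and `qty` as Long,
# and drop the `note` column.
types = build_types(["sym", "note", "qty"], {"sym": "S", "qty": "J"})
print(repr(types))  # 'S J'
```

The resulting string could then be passed as the `types` argument of `csv`.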

Returns:

Type	Description
`Union[k.Table, k.Dictionary]`	The CSV data as a `pykx.Table` or `pykx.List`, depending on the value of `as_table`.

See Also

q.write.csv

CSV Type Guessing Table

Type Character	Type	Condition(s)
*	List	- Any type of width greater than 30. - Remaining unknown types.
B	BooleanAtom	- Matching Byte or Char, maxwidth 1, no decimal points, at least 1 of `[0fFnN]` and 1 of `[1tTyY]` in columns. - Matching Byte or Char, maxwidth 1, no decimal points, all elements in `[01tTfFyYnN]`.
G	GUIDAtom	- Matches GUID-like structure. - Matches structure wrapped in `{ }`.
X	ByteAtom	- Maxwidth of 2, comprised of `[0-9]` AND `[abcdefABCDEF]`.
H	ShortAtom	- Matches Integer with maxwidth less than 7.
I	IntAtom	- Numerical of size between 7 and 15 with exactly 3 decimal points (IP Address). - Matches Long with maxwidth less than 12.
J	LongAtom	- Numerical, no decimal points, all elements `+-` or `0-9`.
E	RealAtom	- Matches float with maxwidth less than 9.
F	FloatAtom	- Numerical, maxwidth greater than 2, fewer than 2 decimal points, `/` present. - Numerical, fewer than 2 decimal points, maxwidth greater than 1.
C	CharAtom	- Empty columns. - Remaining unknown types of size 1.
S	SymbolAtom	- Remaining unknown types of maxwidth 2-11 and granularity of less than 10.
P	TimestampAtom	- Numerical, maxwidth 11-29, fewer than 4 decimals matching `YYYY[./-]MM[./-]DD`
M	MonthAtom	- Matching either numerical, Int, Byte, Real or Float, fewer than 2 decimal points, maxwidth 4-7
D	DateAtom	- Matching Integer, maxwidth 6 or 8. - Numerical, 0 decimal points, maxwidth 8-10. - Numerical, 2 decimal points, maxwidth 8-10. - No decimal points, maxwidth 5-9, matching a date with a 3-letter month code (e.g. 9nov1989).
N	TimespanAtom	- Numerical, maxwidth 15, no decimal points, all values `0-9`. - Numerical, maxwidth 3-29, 1 decimal point, matching `[0-9]D[0-9]`. - Numerical, maxwidth 3-28, 1 decimal point.
U	MinuteAtom	- Matching Byte, maxwidth 4, matching `[012][0-9][0-5][0-9]`. - Numerical, maxwidth 4 or 5, no decimal points, matching `*[0-9]:[0-5][0-9]`.
V	SecondAtom	- Matching Integer, maxwidth 6, matching `[012][0-9][0-5][0-9][0-5][0-9]`. - Matching Time, maxwidth 7 or 8, no decimal points.
T	TimeAtom	- Numerical, maxwidth 9, no decimal points, all values numeric. - Numerical, maxwidth 7 - 12, fewer than 2 decimal points, matching `[0-9]:[0-5][0-9]:[0-5][0-9]`. - Matching Real or Float, maxwidth 7-12, 1 decimal point, matching `[0-9][0-5][0-9][0-5][0-9]`.
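
To make the flavour of these rules concrete, here is a toy sketch in pure Python that applies a small subset of the table (`B`, `H`, `I`, `J`, `C`, `S` and `*`) to a column of raw strings. It is not the csvutil.q implementation: the function name `guess_type` is hypothetical, the order in which the rules are tried is an assumption, and the granularity check for `S` and all temporal, floating point, GUID and byte rules are left out.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")
BOOLEAN_CHARS = set("01tTfFyYnN")


def guess_type(values):
    """Guess a type character for one column of raw CSV strings (toy subset)."""
    maxwidth = max((len(v) for v in values), default=0)
    if maxwidth == 0:
        return "C"  # empty column
    if maxwidth > 30:
        return "*"  # any type of width greater than 30
    if maxwidth == 1 and all(v in BOOLEAN_CHARS for v in values):
        return "B"  # every element is in [01tTfFyYnN]
    if all(INTEGER.fullmatch(v) for v in values):
        if maxwidth < 7:
            return "H"  # integer with maxwidth less than 7
        if maxwidth < 12:
            return "I"  # long with maxwidth less than 12
        return "J"
    if maxwidth == 1:
        return "C"  # remaining unknown types of size 1
    if maxwidth <= 11:
        return "S"  # granularity check omitted in this sketch
    return "*"  # remaining unknown types


for column in (["1", "0", "1"], ["12", "345"], ["1234567"], ["abc", "de"]):
    print(column, guess_type(column))
```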

Examples:

Read a comma separated CSV file into a pykx.Table, guessing the datatypes of each column.

table = q.read.csv('example.csv')

Read a tab separated CSV file into a pykx.Table while specifying the column datatypes to be a pykx.SymbolVector followed by two pykx.LongVector columns.

table = q.read.csv('example.csv', 'SJJ', '\t')
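
To try a tab separated read like the one above, a small input file is needed. The sketch below writes one with the standard library `csv` module: one text column followed by two integer columns. The column names, row values and the temporary location are assumptions made purely for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample data: one symbol-like column and two integer columns.
ROWS = [("AAPL", 100, 10), ("MSFT", 250, 5)]


def write_sample(path, delimiter="\t"):
    """Write a header line and ROWS to `path` using the given delimiter."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter, lineterminator="\n")
        writer.writerow(["sym", "price", "size"])
        writer.writerows(ROWS)


sample = Path(tempfile.mkdtemp()) / "example.csv"
write_sample(sample)
print(sample.read_text())
```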

Read a comma separated CSV file into a pykx.Dictionary, guessing the datatypes of each column.

table = q.read.csv('example.csv', as_table=False)

Read a comma separated CSV file specifying the type of the three columns named x1, x2 and x3 to be of type Integer, GUID and Timestamp.

table = q.read.csv('example.csv', {'x1': kx.IntAtom, 'x2': kx.GUIDAtom, 'x3': kx.TimestampAtom})

Read a comma separated CSV file, processing only the columns whose names contain the word "name".

table = q.read.csv('example.csv', filter_type='like', filter_columns='*name*')
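
As a rough illustration of which columns a pattern such as `*name*` selects, the sketch below uses Python's `fnmatch.fnmatchcase` as a stand-in for q pattern matching. The two agree for simple `*` wildcards like this one, but they are not identical in general, and the header names are hypothetical.

```python
from fnmatch import fnmatchcase


def columns_like(columns, pattern):
    """Return the column names that match a glob-style pattern."""
    return [name for name in columns if fnmatchcase(name, pattern)]


header = ["id", "first_name", "surname", "age"]
print(columns_like(header, "*name*"))  # ['first_name', 'surname']
```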

Read a comma separated CSV file, overriding the csvutil.q default for the number of lines read when guessing column types.

table = q.read.csv('example.csv', custom={"READLINES": 1000})

splayed

splayed(root, name)

Loads a splayed table.

Parameters:

Name	Type	Description	Default
`root`	`Union[str, Path, k.SymbolAtom]`	The path to the root directory of the splayed table.	required
`name`	`Union[str, k.SymbolAtom]`	The name of the table to read.	required

Returns:

Type	Description
`k.SplayedTable`	The splayed table as a `pykx.SplayedTable`.

See Also

q.write.splayed

Examples:

Read a splayed table named t found within the current directory.

table = q.read.splayed('.', 't')

Read a splayed table named splayed found within the /tmp directory.

table = q.read.splayed('/tmp', 'splayed')

fixed

fixed(path, types, widths)

Loads a file of typed data with fixed-width fields.

It is expected that there will either be a newline after every record, or none at all.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the file containing the fixed-width field data.	required
`types`	`Union[bytes, k.CharVector]`	A string of uppercase characters representing the types. Space is used to drop a column.	required
`widths`	`Union[List[int], k.LongVector]`	The width in bytes of each field.	required

Returns:

Type	Description
`k.List`	The data as a `pykx.List` with a `pykx.Vector` for each column.

Examples:

Read a file of fixed-width data into a pykx.List of two pykx.LongVectors, the first with a width of 1 character and the second with a width of 2 characters.

data = q.read.fixed('example_file', [b'J', b'J'], [1, 2])
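
The layout that `widths` describes can be sketched in pure Python. The function below slices raw bytes into one list per column and handles both cases mentioned above: a newline after every record, or no newlines at all. It is an illustration of the file format only, not of how PyKX or q parse it. The name `split_fixed` is hypothetical, and converting each field to its declared type is left out.

```python
def split_fixed(data, widths):
    """Slice fixed-width records in `data` (bytes) into one list per column."""
    record_len = sum(widths)
    # Records are either all newline-terminated or not separated at all.
    has_newlines = data[record_len:record_len + 1] == b"\n"
    stride = record_len + 1 if has_newlines else record_len
    columns = [[] for _ in widths]
    for start in range(0, len(data), stride):
        record = data[start:start + record_len]
        if len(record) < record_len:
            break  # ignore a trailing partial record
        offset = 0
        for column, width in zip(columns, widths):
            column.append(record[offset:offset + width])
            offset += width
    return columns


# Two records with fields of width 1 and 2, with and without newlines.
print(split_fixed(b"112\n234\n", [1, 2]))  # [[b'1', b'2'], [b'12', b'34']]
print(split_fixed(b"112234", [1, 2]))      # [[b'1', b'2'], [b'12', b'34']]
```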

json

json(path)

Reads a JSON file into a `pykx.K` object.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the JSON file.	required

Returns:

Type	Description
`JSONKTypes`	The JSON data as a `pykx.K` object.

See Also

q.write.json

Examples:

Read a JSON file.

data = q.read.json('example.json')

serialized

serialized(path)

Reads a binary file containing serialized q data.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the q data file.	required

Returns:

Type	Description
`k.K`	The q data file converted to a `pykx` object.

See Also

q.write.serialized

Examples:

Read a q data file containing a serialized table into a pykx.Table object.

table = q.read.serialized('q_table_file')

`pykx.write`

`QWriter`

QWriter(q)

Write data using q.

splayed

splayed(root, name, table)

Splays and writes a q table to disk.

Parameters:

Name	Type	Description	Default
`root`	`Union[str, Path, k.SymbolAtom]`	The path to the root directory of the splayed table.	required
`name`	`Union[str, k.SymbolAtom]`	The name of the table, which will be written to disk.	required
`table`	`Union[k.Table, pd.DataFrame]`	A table-like object to be written as a splayed table.	required

Returns:

Type	Description
`Path`	The path to the splayed table on disk.

See Also

q.read.splayed

Examples:

Write a pandas DataFrame to disk as a splayed table in the current directory.

import pandas as pd
df = pd.DataFrame([[x, 2 * x] for x in range(5)])
q.write.splayed('.', 'splayed_table', df)

Write a pykx.Table to disk as a splayed table at /tmp/splayed_table.

table = q('([] a: 10 20 30 40; b: 114 113 98 121)')
q.write.splayed('/tmp', 'splayed_table', table)
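
The examples above combine `root` and `name` into a single directory on disk. The sketch below shows that relationship with `pathlib`, together with a simple check for the `.d` file in which q records the column order of a splayed table. Both helper names are hypothetical, and the `.d` file created here is only an empty stand-in, not a real table.

```python
import tempfile
from pathlib import Path


def splayed_table_path(root, name):
    """Directory that holds the splayed table `name` under `root`."""
    return Path(root) / name


def looks_splayed(path):
    """True if `path` is a directory containing a `.d` column-order file."""
    return path.is_dir() and (path / ".d").is_file()


print(splayed_table_path("/tmp", "splayed_table").as_posix())  # /tmp/splayed_table

with tempfile.TemporaryDirectory() as root:
    table_dir = splayed_table_path(root, "splayed_table")
    before = looks_splayed(table_dir)  # directory does not exist yet
    table_dir.mkdir()
    (table_dir / ".d").touch()         # empty stand-in for the real file
    after = looks_splayed(table_dir)

print(before, after)  # False True
```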

serialized

serialized(path, data)

Writes a q object to a binary data file using q serialization.

This method is a wrapper around the q function set, and as with any q function, arguments which are not pykx.K objects are automatically converted into them.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to write the q object to.	required
`data`	`Any`	An object that will be converted to q, then serialized to disk.	required

Returns:

Type	Description
`Path`	A `pykx.SymbolAtom` that can be used as a file descriptor for the file.

See Also

q.read.serialized

Examples:

Serialize and write a pandas.DataFrame to disk in the current directory.

df = q('([] a: til 5; b: 2 * til 5)').pd()
q.write.serialized('serialized_table', df)

Serialize and write a Python int to disk in the current directory.

q.write.serialized('serialized_int', 145)

csv

csv(path, table, delimiter=',')

Writes a given table to a CSV file.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the CSV file.	required
`table`	`Union[k.Table, pd.DataFrame]`	A table-like object to be written as a CSV file.	required
`delimiter`	`Optional[Union[str, bytes, k.CharAtom]]`	A single character representing the delimiter between values.	`','`

Returns:

Type	Description
`Path`	A `pykx.SymbolAtom` that can be used as a file descriptor for the file.

See Also

q.read.csv

Examples:

Write a pandas DataFrame to disk as a CSV file in the current directory using a comma as a separator between values.

df = q('([] a: til 5; b: 2 * til 5)').pd()
q.write.csv('example.csv', df)

Write a pykx.Table to disk as a CSV file in the current directory using a tab as a separator between values.

table = q('([] a: 10 20 30 40; b: 114 113 98 121)')
q.write.csv('example.csv', table, '\t')

json

json(path, data)

Writes a JSON representation of the given q object to a file.

Parameters:

Name	Type	Description	Default
`path`	`Union[str, Path, k.SymbolAtom]`	The path to the JSON file.	required
`data`	`Any`	Any type to be serialized and written as a JSON file.	required

Returns:

Type	Description
`Path`	A `pykx.SymbolAtom` that can be used as a file descriptor for the file.

See Also

q.read.json

Examples:

Convert a pandas DataFrame to JSON and then write it to disk in the current directory.

df = q('([] a: til 5; b: 2 * til 5)').pd()
q.write.json('example.json', df)

Convert a Python int to JSON and then write it to disk in the current directory.

q.write.json('example.json', 143)

Convert a Python dictionary to JSON and then write it to disk in the current directory.

dictionary = {'a': 'hello', 'b': 'pykx', 'c': 2.71}
q.write.json('example.json', dictionary)