# Compacting the HDB sym file¶

Under some scenarios, the sym enum file sitting in the root of the HDB folder can become bloated with symbols no longer used since earlier parts of a HDB were archived.

To compact the file requires re-enumeration of all enumerated columns against a new empty sym file. That can take some time to execute; nothing else should read or write to the HDB whilst this is running.

The code below is for a simple HDB, with

• date partitions
• a single sym list
• only splayed tables

This is an all-or-nothing approach. Run the code below at your own risk.

Ensure you understand what it does, and test it against a dev HDB you are happy to destroy in the event of an error.

This should really ever only be a one-time process. If you find your sym file growing beyond reasonable size, you very likely have non-repeating strings which would be better stored as char vectors than symbols.

This process is not a fix for a poor choice of schema!

/cd hdb
/q
system "mv sym zym";
:sym set symbol$(); / create a new empty sym file files:key :.; dates:files where files like "????.??.??"; {[d] root:":",string d; tableNames:string key $root;
tableRoot:root,/:"/",/:tableNames;
files:raze {$x,/:"/",/:string key $x}each tableRoot;
files:files where not files like "*#";
types:type each get each files;
enumeratedFiles:files where types within 20 76h;
/ if we have more than one enum better get help
if[any types within 21 76h;'"too difficult"];
{
sym set get :zym;
s:get x;
a:attr s;
s:value s;
sym set get :sym;
s:a#.Q.en[:.;([]s:s)]s;
x set s;
-1 "re-enumerated ", string x;
}each enumeratedFiles;
}each dates

Remember to rm the zym file at the end of processing.

## Back up the sym file¶

The sym file is found in the root of your HDB. It is the key to the default enums.

Regularly back up the sym file outside the HDB.

Here’s some multi-threaded (can run single threaded) and more memory-intensive but hugely faster sym file rewrite code that handles partitioned and splayed tables and par.txt.

Note you lose the g#, which isn’t supported in threads, so you have to apply it later.

system"l ." /load the HDB - can change this if you don't start q from your HDB root

allpaths:{[dbdir;table]
/ from dbmaint.q + an extra check for paths that exist (to support .Q.bv)
files:key dbdir;
if[any files like"par.txt";
:raze allpaths[;table]each hsym each$read0()sv dbdir,par.txt]; files@:where files like"[0-9]*"; files:()sv'dbdir,'files,'table; files where 0<>(count key@)each files} sym:oldSym:get:sym /to unenumerate /sym files from parted tables symFiles:raze sv/:/:raze {allpaths[:.;x],/:\:exec c from meta[x] where t in "s"} peach tables[] where {1b~.Q.qp value x}each tables[] /sym files from splayed tables symFiles,:raze{ sv/: hsym[x],/:exec c from meta x where t in "s"}each tables[] where {0b~.Q.qp value x}each tables[] /symbol files we're dealing with - memory intensive allsyms:distinct raze {[file] distinct @[value get@;file;symbol$()]} peach symFiles

.Q.gc[] /memory intensive so gc

/
The preceding code makes no changes to the HDB.
You can estimate the savings with count[allsyms]%count sym.
The rest of the script makes changes; there is no going back once you start.
Let nothing write to the HDB while the script runs.
\

system"mv sym zym"      / make backup of sym file
:sym set symbol$() / reset sym file - scary part sym set get:sym .Q.en[:.;([]allsyms)] / enumerate all syms at once {[file] s:get file; / file contents a:first ps inter attr s; / attributes - due to nog# error in threads / can be just a:attr s if your version of kdb+ / supports setting g# in threads s:oldSymint$s;                   / unenumerate against old sym file
file set a#sym$s; / enumerate against new sym; add attrib; write 0N!"re-enumerated ", string file; } peach symFiles Multi-threaded sym rewrite code Take backups! Error writing file? In the multi-threaded script, a 'cast could happen if this line fails on a file: allsyms:distinct raze{[file] :distinct @[value get@;file;symbol$()] } peach symFiles;
/symbol files we're dealing with - memory intensive

So perhaps check the integrity of your HDB (perhaps change the above to help debug):

allsyms:distinct raze{[file]
:distinct @[value get@; file; {0N!(x;y); symbol\$()}[file;]]
} peach symFiles; `

would print the file and error.

It’s important to understand what’s going on, not just run the whole thing blindly.