Discussion:
[pytables-users] HDF5 file size (using copy_node)
Ken Walker
2017-07-11 15:41:07 UTC
Hello Group,
I am new to HDF5 and PyTables, so pardon the newbie question.
I couldn't find an answer in the PyTables library reference, so I am asking
the group.
I am experimenting with the file.copy_node() method to copy a node and its
child datasets from one HDF5 file to another. After copying, I delete the
node and datasets from the original file. It works as expected (in the sense
that I have the data where I want it).
However, I discovered that the combined size of the 2 resulting files is
larger than the original file.
Original file size: 118 KB
Modified file (table deleted): 118 KB (no change)
New file (w/ copied table): 153 KB
Total: 271 KB

Is there something I need to do to remove unused space in the file?

These are small files for testing. I realize typical HDF5 applications can
be far larger.
Thanks in advance,
-Ken
--
You received this message because you are subscribed to the Google Groups "pytables-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pytables-users+***@googlegroups.com.
To post to this group, send an email to pytables-***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Francesc Alted
2017-07-12 08:49:09 UTC
Hi Ken,

Yes, to remove unused space from an HDF5 file you need to 'repack' it.
You can do that with the generic `h5repack` command line utility that
comes with HDF5, or with `ptrepack`
(http://www.pytables.org/usersguide/utilities.html#ptrepack), another
command line utility that comes with PyTables. Both achieve the same result
in different ways (ptrepack is more specific to PyTables), so use whichever
one suits you better.
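
As a rough sketch of the repacking step, the two utilities can also be driven from a Python script. The helper names `build_repack_command` and `repack`, and the file names in the usage note, are made up for illustration; the only assumption about the tools is their basic "input file, then output file" calling form, and that the chosen one is installed and on PATH.

```python
import os
import shutil
import subprocess


def build_repack_command(src, dst, tool="ptrepack"):
    """Return the argument list that repacks src into a new file dst."""
    if tool not in ("ptrepack", "h5repack"):
        raise ValueError(f"unsupported tool: {tool!r}")
    # Both utilities take the input file first and the output file second.
    return [tool, src, dst]


def repack(src, dst, tool="ptrepack"):
    """Repack src into dst and return (old_size, new_size) in bytes."""
    cmd = build_repack_command(src, dst, tool)
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} was not found on PATH")
    # check=True raises CalledProcessError if the utility reports a failure.
    subprocess.run(cmd, check=True)
    return os.path.getsize(src), os.path.getsize(dst)
```

Calling `repack("original.h5", "original_packed.h5")` would write the repacked copy to a new file and return both sizes for comparison. The original file is left untouched, so it can be replaced with the repacked copy once the result has been checked.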

Francesc