Ken Walker
2018-10-30 00:28:59 UTC
This is related to my previous question about H5Tpack().
I am working through a problem reading/writing data with PyTables. I read some
data rows from one HDF5 file/dataset into a numpy record array, then write
that array to a dataset in a different HDF5 file (no change to the data).
The data in the new file looks fine when interrogated with PyTables or
viewed with HDFView. However, a downstream C++ app can't read the
PyTables-written data.
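The process is essentially this sketch (file and table names are
placeholders, not my real ones):

    import tables as tb

    # Read some rows from the source table into a numpy record array.
    with tb.open_file("source.h5", mode="r") as src:
        data = src.root.my_table.read(start=0, stop=1000)

    # Write that array, unchanged, to a table in a new file.
    # create_table() infers the table description from the array's dtype.
    with tb.open_file("dest.h5", mode="w") as dst:
        dst.create_table(dst.root, "my_table", obj=data)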
I am told (by the developers) that the compiler for the upstream program is
set to pad the data when it writes the original file (that I am reading),
and that pad is expected by the downstream reader (which reads the file I
create). Padding adds 4 pad bytes after a 4-byte S4 field so the next field
starts on an 8-byte memory boundary. Based on observed behavior, they have
inferred that PyTables removes the pad bytes when reading the dataset, and
does not add them back when writing the new dataset. (All perfectly legal
in HDF5, and it does not affect data integrity.) However, the downstream
reader expects the pad, and its absence causes an error (I know, bad code
design).
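If I understand the layout correctly, the difference in numpy terms looks
something like this (field names are made up; the real compound type has
many more fields):

    import numpy as np

    # Packed layout (what PyTables apparently produces): the next
    # field starts immediately after the 4-byte S4 field.
    packed = np.dtype([("tag", "S4"), ("value", "f8")])
    print(packed.itemsize)   # 12

    # Padded layout (what the C++ writer produces): 4 pad bytes after
    # the S4 field so "value" starts on an 8-byte boundary.
    padded = np.dtype({"names": ["tag", "value"],
                       "formats": ["S4", "f8"],
                       "offsets": [0, 8],
                       "itemsize": 16})
    print(padded.itemsize)   # 16

That one S4 field accounts for a 4-byte difference per row, which matches
the itemsize discrepancy I describe below.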
So... I'm wondering: is there something in PyTables that controls padding
when reading/writing datasets like this?
FYI, I recreated this read/write process with h5py, and the output file is
compatible with my downstream app. Apparently h5py retains the pad
characters. This is confirmed when I inspect the dataset's dtype: h5py
reports itemsize: 384, vs. itemsize: 380 when PyTables reads the dataset.
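The h5py version of the copy is essentially this (again, names are
placeholders):

    import h5py

    with h5py.File("source.h5", "r") as src:
        data = src["my_table"][:1000]
        print(data.dtype.itemsize)   # 384 -- pad bytes preserved

    # The new dataset inherits the padded compound dtype from the array.
    with h5py.File("dest.h5", "w") as dst:
        dst.create_dataset("my_table", data=data)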
I could rewrite my utility with h5py, but I'd like to avoid that if
possible because I leverage a lot of PyTables-specific functionality.
Thanks in advance for any insights into this quirky padding behavior.
-Ken