Discussion:
[pytables-users] trouble loading table with different shapes from h5py in pytables
Mark Mikofski
2018-08-07 22:28:16 UTC
Hi,

I apologize if this has been answered previously. I have created an h5
table using h5py:

import numpy as np
import pvlib
import h5py

# create some data
x = pvlib.pvsystem.singlediode(6.1, 1.2e-7, 0.012, 123, 1.23*60*0.026, 100)
y = pvlib.pvsystem.singlediode(5.1, 1.2e-7, 0.012, 123, 1.23*60*0.026, 100)

# set the dtypes to use as a structured array
my_dtype = np.dtype([
    ('i_l', float), ('i_0', float), ('r_s', float), ('r_sh', float), ('nNsVth', float),
    ('i_sc', float), ('v_oc', float), ('i_mp', float), ('v_mp', float),
    ('i', float, (1, 100)), ('v', float, (1, 100))
])

# store the data in a structured array; note that the IV curves are nested (1, 100) arrays
my_data = np.array([
    (6.1, 1.2e-7, 0.012, 123, 1.23*60*0.026,
     x['i_sc'], x['v_oc'], x['i_mp'], x['v_mp'], x['i'], x['v']),
    (5.1, 1.2e-7, 0.012, 123, 1.23*60*0.026,
     y['i_sc'], y['v_oc'], y['i_mp'], y['v_mp'], y['i'], y['v'])
], my_dtype)

# pretend that this is a grid of IV curves for matrix of (E, T)
your_data = np.copy(my_data)
# reshape my_data and your_data from (2,) to (1, 2),
# and concatenate them to make a fake grid
all_data = np.concatenate([my_data.reshape(1, 2), your_data.reshape(1, 2)], axis=0)

# output to a file
with h5py.File('THIS_IS_A_TEST_FILE.H5', 'w') as f:
    f['data'] = all_data  # key "data" is arbitrary, choose as many groups as you need
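
For reference, reading the file back with h5py works as expected; here is a minimal read-back sketch (it just reuses the file written above):

import h5py

with h5py.File('THIS_IS_A_TEST_FILE.H5', 'r') as f:
    dset = f['data']
    print(dset.shape)        # (2, 2) -- h5py preserves the grid shape
    print(dset.dtype.names)  # ('i_l', 'i_0', ..., 'v')
    arr = dset[...]          # load the compound dataset as a numpy structured array
    print(arr['i'].shape)    # (2, 2, 1, 100) -- the nested IV-curve field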


As the sketch above shows, I can reload the table in h5py, and hdfview and h5ls also display it fine, but if I try to read it with PyTables I get an error:
import tables
f = tables.open_file('THIS_IS_A_TEST_FILE.H5')
File(filename=THIS_IS_A_TEST_FILE.H5, title='', mode='r', root_uep='/',
filters=Filters(complevel=0, shuffle=False, bitshuffle=False,
fletcher32=False, least_significant_digit=None))
/ (RootGroup) ''
/data (Table(2,)) ''
description := {
"i_l": Float64Col(shape=(), dflt=0.0, pos=0),
"i_0": Float64Col(shape=(), dflt=0.0, pos=1),
"r_s": Float64Col(shape=(), dflt=0.0, pos=2),
"r_sh": Float64Col(shape=(), dflt=0.0, pos=3),
"nNsVth": Float64Col(shape=(), dflt=0.0, pos=4),
"i_sc": Float64Col(shape=(), dflt=0.0, pos=5),
"v_oc": Float64Col(shape=(), dflt=0.0, pos=6),
"i_mp": Float64Col(shape=(), dflt=0.0, pos=7),
"v_mp": Float64Col(shape=(), dflt=0.0, pos=8),
"i": Float64Col(shape=(1, 100), dflt=0.0, pos=9),
"v": Float64Col(shape=(1, 100), dflt=0.0, pos=10)}
byteorder := 'little'
chunkshape := (39,)
g = f.get_node('/data')
g[0]
---------------------------------------------------------------------------
HDF5ExtError Traceback (most recent call last)
<ipython-input-8-73ec0726ff88> in <module>()
----> 1 g[0]

~\AppData\Local\Continuum\miniconda3\envs\py36\lib\site-packages\tables\table.py
in __getitem__(self, key)
2077 key += self.nrows
2078 (start, stop, step) = self._process_range(key, key + 1, 1)
-> 2079 return self.read(start, stop, step)[0]
2080 elif isinstance(key, slice):
2081 (start, stop, step) = self._process_range(

~\AppData\Local\Continuum\miniconda3\envs\py36\lib\site-packages\tables\table.py
in read(self, start, stop, step, field, out)
1932 warn_negstep=False)
1933
-> 1934 arr = self._read(start, stop, step, field, out)
1935 return internal_to_flavor(arr, self.flavor)
1936

~\AppData\Local\Continuum\miniconda3\envs\py36\lib\site-packages\tables\table.py
in _read(self, start, stop, step, field, out)
1846 # This optimization works three times faster than
1847 # the row._fill_col method (up to 170 MB/s on a pentium IV @ 2GHz)
-> 1848 self._read_records(start, stop - start, result)
1849 # Warning!: _read_field_name should not be used until
1850 # H5TBread_fields_name in tableextension will be finished

tables\tableextension.pyx in tables.tableextension.Table._read_records()

HDF5ExtError: HDF5 error back trace

File "C:\ci\hdf5_1525883595717\work\src\H5Dio.c", line 216, in H5Dread
can't read data
File "C:\ci\hdf5_1525883595717\work\src\H5Dio.c", line 471, in H5D__read
src and dest data spaces have different sizes

End of HDF5 error back trace

Problems reading records.

Curiously, indexing g[1] returns that row. PyTables also doesn't report the correct
shape, which should be (2, 2), so indexing by slice doesn't work either.
g.shape
(2,)

If I open the file in ViTables 2 or 3 it gives the message below, which is
irrelevant, since the dataset is uncompressed; if I change the compression to gzip,
it says the same thing, even though zlib is installed.

Error: problems reading records. The dataset seems to be compressed with
the None library. Check that it is installed in your system, please.
The ViTables properties dialog shows the field names and their types and shapes
correctly, but not the table size, which it reports as (2,); that is not surprising,
since it uses PyTables under the hood.

I have seen a similar post here, "error selecting rows by list of indices in
multidimensional array" from 2014 with @scopatz
<https://groups.google.com/forum/#!topic/pytables-users/h9IjRLZhNEo>, but it
didn't seem to be exactly the same issue, and it was so old that I didn't tag it.

I also searched the PyTables issues on GitHub but didn't find anything similar, and
I didn't want to spam the tracker if this is not a real issue.

Has anyone come across this or a similar issue? AFAICT HDF5 handles
multidimensional datasets with compound types fine, and h5py, hdfview, and h5ls can
all read them, but PyTables and ViTables can't. Why?

thanks,
Mark
Francesc Alted
2018-08-09 06:44:13 UTC
Hi Mark,

Yes, multidimensional tables are not supported in PyTables (only
multidimensional columns are). Probably a `NotImplementedError` should be
raised in this case. Could you open a ticket, please?
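
For what it's worth, a possible workaround sketch (untested; it reuses the `all_data` array from your script and a hypothetical new file name) is to flatten the grid to a 1-D table before writing and reshape after reading:

import h5py
import tables

flat = all_data.reshape(-1)   # (2, 2) -> (4,)

with h5py.File('FLAT_TEST_FILE.H5', 'w') as f:
    f['data'] = flat          # 1-D table with multidimensional 'i' and 'v' columns

with tables.open_file('FLAT_TEST_FILE.H5') as f:
    tbl = f.get_node('/data')
    rows = tbl[:]             # a 1-D compound dataset reads fine as a Table
    grid = rows.reshape(2, 2) # restore the original grid shape in memory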

Francesc
--
Francesc Alted
Mark Mikofski
2018-08-10 17:53:59 UTC
Hi Francesc, I've posted the issue here:
https://github.com/PyTables/PyTables/issues/706

I've tried to follow the code base as far as the File class
<https://github.com/PyTables/PyTables/blob/376fc21/tables/file.py#L580>, which
open_file instantiates, but no further. It may be some time before I can
troubleshoot this more, but I will contribute what I can.

Thanks!