Discussion:
corrupt hdf5 files?
Andrew Straw
2006-03-08 17:17:47 UTC
Permalink
Hi,

It looks like I may have lost the results of last night's experiments by
doing a Ctrl-C to quit my Python program writing hdf5 files using
pytables rather than doing it nicely. :( Obviously, this is something I
won't do again in the future, but do I have any hope to rescue my hdf5
file? I presume it's only the last few rows in the tables that are
causing a problem, but even tools like h5ls don't work:

$ h5ls DATA20060307_204746.h5
DATA20060307_204746.h5: unable to open file

$ python
Python 2.3.5 (#2, Nov 3 2005, 02:44:38)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tables
>>> h5file=tables.openFile('DATA20060307_204746.h5')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "build/bdist.linux-x86_64/egg/tables/File.py", line 225, in openFile
  File "hdf5Extension.pyx", line 599, in hdf5Extension.File.__new__
IOError: file ``DATA20060307_204746.h5`` exists but it is not an HDF5 file



-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
Andrew Straw
2006-03-08 18:38:19 UTC
Permalink
Post by Andrew Straw
Hi,
It looks like I may have lost the results of last night's experiments by
doing a Ctrl-C to quit my Python program writing hdf5 files using
pytables rather than doing it nicely. :(
FWIW, this was with the snapshot 20060306.


Francesc Altet
2006-03-08 19:23:39 UTC
Permalink
On Wed, 8 Mar 2006 at 09:17 -0800, Andrew Straw wrote:
Post by Andrew Straw
Hi,
It looks like I may have lost the results of last night's experiments by
doing a Ctrl-C to quit my Python program writing hdf5 files using
pytables rather than doing it nicely. :( Obviously, this is something I
won't do again in the future, but do I have any hope to rescue my hdf5
file? I presume it's only the last few rows in the tables that are
Ooops! I'd like to be wrong, but if the file has been corrupted, I'm
afraid that you will not be able to retrieve your data anymore :-(

The situation is similar to switching off a machine without properly
unmounting its filesystem, with the additional burden that there is not,
to my knowledge, a repair tool for corrupted HDF5 files. Right now, the
best technique to avoid losing much data when working with HDF5 files is
to take a defensive approach and create backup files from time to time.
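
A minimal sketch of the periodic-backup idea, assuming the data file is closed (or at least not being written) at the moment of the copy. The helper name backup_file and the timestamped naming scheme are illustrative choices, not something PyTables provides:

```python
import shutil
import time
from pathlib import Path


def backup_file(path, backup_dir):
    """Copy a closed data file to backup_dir under a timestamped name.

    Hypothetical helper, not part of PyTables. Call it only while the
    file is closed, since a copy of a file that is still being written
    may itself be unreadable.
    """
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    dest = dest_dir / f"{src.stem}.{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 also preserves modification times
    return dest
```

Calling this between acquisition runs, after closing the file and before reopening it for appending, would limit a loss to the data written since the last copy.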

I know that somebody on this list was also worried about these kinds of
issues, and perhaps he can say more on this.

Sorry about that,
--
0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"




Andrew Straw
2006-03-08 21:25:42 UTC
Permalink
Post by Francesc Altet
On Wed, 8 Mar 2006 at 09:17 -0800, Andrew Straw wrote:
Post by Andrew Straw
Hi,
It looks like I may have lost the results of last night's experiments by
doing a Ctrl-C to quit my Python program writing hdf5 files using
pytables rather than doing it nicely. :( Obviously, this is something I
won't do again in the future, but do I have any hope to rescue my hdf5
file? I presume it's only the last few rows in the tables that are
Ooops! I'd like to be wrong, but if the file has been corrupted, I'm
afraid that you will not be able to retrieve your data anymore :-(
The situation is similar to switching off a machine without properly
unmounting its filesystem, with the additional burden that there is not,
to my knowledge, a repair tool for corrupted HDF5 files. Right now, the
best technique to avoid losing much data when working with HDF5 files is
to take a defensive approach and create backup files from time to time.
I know that somebody on this list was also worried about these kinds of
issues, and perhaps he can say more on this.
Well, I've implemented an exit function using the atexit.register()
function to properly close the file, at least where possible. I suspect
this would have saved me from my rather careless action, but I don't
plan on depending on this in the future -- I hope this will be a
last-resort feature of my code rather than a heavily-depended upon
feature. :)

Anyhow, perhaps pytables could do something similar internally --
maintain a list of open files and register its own exit function with
atexit.register() to make sure this sort of thing could be prevented in
the future? It won't protect against all unexpected program exits (e.g.
signals), but my preliminary tests show it does get called with Ctrl-C.
It should only be a few lines of code.

Cheers!
Andrew


Andrew Straw
2006-03-10 03:43:21 UTC
Permalink
On Wed, 8 Mar 2006 at 13:25 -0800, Andrew Straw wrote:
Post by Andrew Straw
Well, I've implemented an exit function using the atexit.register()
function to properly close the file, at least where possible. I suspect
this would have saved me from my rather careless action, but I don't
plan on depending on this in the future -- I hope this will be a
last-resort feature of my code rather than a heavily-depended upon
feature. :)
Anyhow, perhaps pytables could do something similar internally --
maintain a list of open files and register its own exit function with
atexit.register() to make sure this sort of thing could be prevented in
the future? It won't protect against all unexpected program exits (e.g.
signals), but my preliminary tests show it does get called with Ctrl-C.
It should only be a few lines of code.
Right, in fact I thought about something like this some time ago, but
never implemented it :-(
Well, thanks for the suggestion! We will definitely look forward to
including it in the forthcoming 1.3 release.
That sounds great -- I look forward to it.

On another note, it seems like part of the issue is that I was used to
an older version of pytables (I think from 20050715 or so) in which the
.h5 files seemed to be in an internally consistent state 99+% of the
time, even while the app had them open. For example, I could make a copy
of the still-growing .h5 file and open it with my analysis tools, and I
never had the corrupt file issue. (Or do a simple h5ls on the file.)
Now, however, with the 20060306 snapshot, I haven't once been able to do
this. Can you think of anything different in pytables that may lead to
this behavior? Is there anything I can do in my application like call
h5file.sync() or something to attempt to force the file to be internally
consistent?

BTW, I don't think it's my hdf5 implementation that has changed -- I
haven't upgraded that. I'm using the standard 1.6.2 installed with
debian sarge.

Cheers!
Andrew


Ivan Vilata i Balaguer
2006-03-10 11:13:17 UTC
Permalink
Post by Andrew Straw
[...]
On another note, it seems like part of the issue is that I was used to
an older version of pytables (I think from 20050715 or so) in which the
.h5 files seemed to be in an internally consistent state 99+% of the
time, even while the app had them open. For example, I could make a copy
of the still-growing .h5 file and open it with my analysis tools, and I
never had the corrupt file issue. (Or do a simple h5ls on the file.)
Now, however, with the 20060306 snapshot, I haven't once been able to do
this. Can you think of anything different in pytables that may lead to
this behavior? Is there anything I can do in my application like call
h5file.sync() or something to attempt to force the file to be internally
consistent?
[...]
I wouldn't dare to say that copying an open HDF5 file using an external
process should be safe, but I still think I could explain the reason why
you had better luck with the old version. Before PyTables 1.2, leaves
got closed as soon as read and write operations finished, so as to keep
memory consumption low--remember that PyTables kept *all* nodes in
memory. From 1.2 on, only a small number of nodes remain in memory
(thanks to the LRU cache), and they are kept *open* to reduce the
overhead of opening and closing them each time they are accessed. This
is more space- and time-efficient, but keeping the nodes open may mean
that the HDF5 file is in a consistent state for less of the time.

As a simple test, you could try to explicitly close each of the leaves
you have open (without closing the file itself), and then try to copy
the file and check if it is OK. If this is the case, then maybe my
suspicion is right.
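
One cheap sanity check on such a copy, sketched with only the standard library, is to look for the 8-byte HDF5 format signature, which the file format places at offset 0 or at 512, 1024, 2048, ... bytes. The helper name has_hdf5_signature is hypothetical, and passing this check only shows that the format marker is present, not that the tables are readable; opening the copy with h5ls or PyTables remains the real test.

```python
import os

# Format signature that marks the start of an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the HDF5 signature is found at a valid offset."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Valid superblock offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return False
```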

Anyway, I would still not recommend copying an open file like that. If
you have any ideas to make the situation better, they would be most welcome.

Regards,

::

Ivan Vilata i Balaguer >qo< http://www.carabos.com/
Cárabos Coop. V. V V Enjoy Data
""
Francesc Altet
2006-03-17 12:44:06 UTC
Permalink
Post by Andrew Straw
Well, I've implemented an exit function using the atexit.register()
function to properly close the file, at least where possible. I suspect
this would have saved me from my rather careless action, but I don't
plan on depending on this in the future -- I hope this will be a
last-resort feature of my code rather than a heavily-depended upon
feature. :)
Right, in fact I thought about something like this some time ago, but
never implemented it :-(
Well, thanks for the suggestion! We will definitely look forward to
including it in the forthcoming 1.3 release.
Done in SVN.

Cheers,
--
0,0< Francesc Altet     http://www.carabos.com/
V V Cárabos Coop. V.   Enjoy Data
"-"



Francesc Altet
2006-03-09 08:00:15 UTC
Permalink
On Wed, 8 Mar 2006 at 13:25 -0800, Andrew Straw wrote:
Post by Andrew Straw
Well, I've implemented an exit function using the atexit.register()
function to properly close the file, at least where possible. I suspect
this would have saved me from my rather careless action, but I don't
plan on depending on this in the future -- I hope this will be a
last-resort feature of my code rather than a heavily-depended upon
feature. :)
Anyhow, perhaps pytables could do something similar internally --
maintain a list of open files and register its own exit function with
atexit.register() to make sure this sort of thing could be prevented in
the future? It won't protect against all unexpected program exits (e.g.
signals), but my preliminary tests show it does get called with Ctrl-C.
It should only be a few lines of code.
Right, in fact I thought about something like this some time ago, but
never implemented it :-(

Well, thanks for the suggestion! We will definitely look forward to
including it in the forthcoming 1.3 release.

Cheers,
--
0,0< Francesc Altet http://www.carabos.com/
V V Cárabos Coop. V. Enjoy Data
"-"



