Discussion:
Compression and Indexing in PyTables
David Reed
2013-07-28 13:24:04 UTC
Hi there, I was wondering if there are any nice tutorials that show the
different compression options such as zlib, bzo, etc. and how to actually
use them with my tables.

There seems to be a lot of good information describing the performance
increase under the Optimization Tips section, but I don't see any clear way
of actually doing this.

Maybe I'm missing something.

Thanks for the help.

-Dave
Andreas Hilboll
2013-07-28 13:48:05 UTC
Hi David,
Post by David Reed
Hi there, I was wondering if there any nice tutorials that show the
different compression options such as zlib, bzo, etc. and how to
actually use them with my tables.
There seems to be a lot of good information describing the performance
increase under the Optimization Tips section, but I don't see any clear
way of actually doing this.
Maybe I'm missing something.
Maybe you're missing this:

http://pandas.pydata.org/pandas-docs/stable/io.html#compression

The HDFStore constructor has a "complib" kwarg which you can use to set
the compression library. Also look at "complevel" to set the compression
level (0 to 9).
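
A minimal sketch of the two keyword arguments mentioned above, assuming pandas and PyTables are both installed. The file name, the key "data", and the helper names write_compressed and read_back are made up for illustration; "zlib" is chosen only because it is normally available wherever HDF5 is.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def write_compressed(path, frame, complib="zlib", complevel=9):
    """Write `frame` to an HDF5 file through a compressed HDFStore.

    Hypothetical helper: only `complib` and `complevel` come from the
    discussion, everything else is illustrative.
    """
    with pd.HDFStore(path, mode="w", complib=complib, complevel=complevel) as store:
        store.put("data", frame)


def read_back(path):
    """Read the frame stored under the illustrative key "data"."""
    with pd.HDFStore(path, mode="r") as store:
        return store["data"]


# Write a small frame to a temporary file and read it back.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
frame = pd.DataFrame({"x": np.arange(1000), "y": np.zeros(1000)})
write_compressed(path, frame)
restored = read_back(path)
```

Decompression is transparent on read, so read_back needs no compression arguments.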

-- Andreas.
David Reed
2013-07-28 14:21:51 UTC
Maybe I wasn't aware of this, but has pandas completely wrapped PyTables,
or is PyTables something I should still be using for storing and accessing
scientific data, with pandas just offering an access point to it?
Francesc Alted
2013-07-28 15:23:46 UTC
Post by David Reed
maybe I wasn't aware of this, but has PANDAS completely wrapped
PyTables, or is PyTables something I should still be using for storing
and accessing scientific data, and PANDAS has an access point to it?
Yeah, more the latter than the former. PyTables is a standalone
library, but pandas uses it as another storage backend.
--
Francesc Alted
Francesc Alted
2013-07-28 15:21:31 UTC
Post by David Reed
Hi there, I was wondering if there any nice tutorials that show the
different compression options such as zlib, bzo, etc. and how to
actually use them with my tables.
There seems to be a lot of good information describing the performance
increase under the Optimization Tips section, but I don't see any
clear way of actually doing this.
Maybe I'm missing something.
Well, the compression options are part of the more general Filters
helper class:

http://pytables.github.io/usersguide/libref/helper_classes.html#the-filters-class

This stems from the fact that in HDF5 a compressor is just another
data filter.
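
A minimal sketch of how a Filters instance might be passed when creating a table, assuming PyTables 3.x (open_file / create_table naming). The Reading description, the file name, and the column names are invented for the example. "zlib" is used because it ships with HDF5; "lzo", "bzip2", and "blosc" are the other complib values, subject to what your build supports. The last lines also index one column, since the subject of this thread mentions indexing.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    """Illustrative row layout; not taken from the thread."""
    sensor_id = tables.Int32Col()
    value = tables.Float64Col()


# Compression settings live in a Filters object.
filters = tables.Filters(complevel=5, complib="zlib", shuffle=True)

path = os.path.join(tempfile.mkdtemp(), "readings.h5")
with tables.open_file(path, mode="w") as h5:
    # The filters are applied to this table's data chunks.
    table = h5.create_table("/", "readings", Reading, filters=filters)

    row = table.row
    for i in range(10000):
        row["sensor_id"] = i % 10
        row["value"] = float(i)
        row.append()
    table.flush()

    # Index one column so in-kernel queries on it can use the index.
    table.cols.sensor_id.create_index()
    matches = [r["value"] for r in table.where("sensor_id == 3")]

    # Read back what was actually recorded for the table.
    stored_complib = table.filters.complib
    stored_complevel = table.filters.complevel
    is_indexed = table.cols.sensor_id.is_indexed
```

Passing filters to open_file instead would make them the default for every leaf created in that file.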
--
Francesc Alted
Francesc Alted
2013-07-28 17:38:44 UTC
More input from Jeff Reback.

Hey Jeff, I see that you try to post here from time to time, but your
messages bounce because the address that you use as sender is not
subscribed. Please make sure that you post from a subscribed address.
Thanks!

Francesc
The attached message has been automatically discarded.
Re: [Pytables-users] Compression and Indexing in PyTables.eml

Subject: Re: [Pytables-users] Compression and Indexing in PyTables
From: Jeff Reback <***@yahoo.com>
Date: 7/28/13 11:35 AM
To: Discussion list for PyTables <pytables-***@lists.sourceforge.net>

pandas stores its data using PyTables and embeds extra metadata in the
HDF5 attributes to enable deserialization to the original pandas structure.
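
One way to see that extra metadata is to write a frame with pandas and then open the same file with plain PyTables. This is a rough sketch assuming both libraries are installed; the file name and the key "frame" are made up, and the attribute names pandas writes (such as pandas_type) are an implementation detail that may differ between versions.

```python
import os
import tempfile

import pandas as pd
import tables

path = os.path.join(tempfile.mkdtemp(), "frame.h5")
frame = pd.DataFrame({"a": [1, 2, 3], "b": [0.1, 0.2, 0.3]})

# Write through pandas ...
with pd.HDFStore(path, mode="w") as store:
    store.put("frame", frame)

# ... then inspect the result with PyTables alone.
with tables.open_file(path, mode="r") as h5:
    group = h5.get_node("/frame")
    # Attributes attached to the group, including those added by pandas.
    attr_names = list(group._v_attrs._v_attrnames)
    # Arrays pandas created under the group to hold the frame's contents.
    child_names = [node._v_name for node in h5.iter_nodes(group)]

print(attr_names)
print(child_names)
```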