Discussion:
Tables vs Arrays
David Reed
2013-07-29 01:38:45 UTC
I'm really trying to become more productive using PyTables, but am
struggling with what I should be using. What's the difference between a
table and an array?
Anthony Scopatz
2013-07-29 01:58:00 UTC
Hi David,

The difference between Arrays and Tables is conceptually the same as the
difference between numpy arrays and numpy structured arrays. The plain old
[Aa]rray is a contiguous block of a single data type. Tables and
structured arrays have a more complex data type that is composed of a
contiguous sequence of other data types (i.e. the fields / columns). Which
data structure you use really depends a lot on the type of problem you are
trying to solve and what kinds of questions you want to answer with that
data structure.
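
The numpy side of that analogy can be sketched as follows. This is only an
illustration; the field names "id" and "temperature" are made up for the
example and do not come from any real dataset.

```python
import numpy as np

# Plain array: every element has the same data type.
plain = np.arange(6, dtype=np.float64).reshape(3, 2)

# Structured array: every element is a record made of named fields.
# The field names below are hypothetical.
record_dtype = np.dtype([("id", np.int32), ("temperature", np.float64)])
records = np.array([(1, 20.5), (2, 21.0), (3, 19.8)], dtype=record_dtype)

print(plain.dtype)              # one type for the whole block
print(records.dtype.names)      # the field (column) names
print(records["temperature"])   # a single column, pulled out by name
```

A PyTables Array corresponds to the first object and a Table to the second.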

That said, the implementation of Tables is far more similar to EArrays than
Arrays. So a lot of the performance trade-offs that you see are similar.
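
One visible consequence is that both EArrays and Tables can grow by
appending, whereas a plain Array is written once with a fixed shape. A
minimal sketch, assuming PyTables 3.x naming (open_file, create_earray, and
so on); the file name, node names and the Reading description are
hypothetical:

```python
import os
import tempfile

import numpy as np
import tables


class Reading(tables.IsDescription):
    # Hypothetical row layout, used only for this example.
    id = tables.Int32Col(pos=0)
    temperature = tables.Float64Col(pos=1)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo.h5")
    with tables.open_file(path, mode="w") as h5:
        # Array: the shape is fixed by the data it is created from.
        fixed = h5.create_array("/", "fixed", np.arange(5, dtype=np.int64))

        # EArray: the dimension of length 0 is the one that can grow.
        growable = h5.create_earray(
            "/", "growable", atom=tables.Int64Atom(), shape=(0,)
        )
        growable.append(np.arange(5, dtype=np.int64))
        growable.append(np.arange(5, 8, dtype=np.int64))

        # Table: grows the same way, one batch of rows at a time.
        table = h5.create_table("/", "readings", Reading)
        table.append([(1, 20.5), (2, 21.0)])
        table.append([(3, 19.8)])
        table.flush()

        fixed_len = fixed.nrows
        earray_len = growable.nrows
        table_len = table.nrows

print(fixed_len, earray_len, table_len)
```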

You should watch my "HDF5 is for Lovers" talk for more generic advice [1].
I hope this helps!

Be Well
Anthony

1.

_______________________________________________
Pytables-users mailing list
https://lists.sourceforge.net/lists/listinfo/pytables-users
Francesc Alted
2013-07-29 02:19:04 UTC
Besides this, another interesting difference is that Tables allow
queries to be performed in a similar way to relational databases (but
using a more NumPy-esque syntax). Here are some examples:

http://pytables.github.io/cookbook/hints_for_sql_users.html?highlight=query#selecting-data

and you can index columns too:

http://pytables.github.io/cookbook/hints_for_sql_users.html?highlight=query#creating-an-index

so that you can accelerate queries involving indexed columns.
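
As a rough sketch of what the linked pages describe, assuming PyTables 3.x
naming (where, read_where, create_index); the file name, the Reading
description and the sample rows are hypothetical:

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    # Hypothetical row layout, used only for this example.
    id = tables.Int32Col(pos=0)
    temperature = tables.Float64Col(pos=1)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "query_demo.h5")
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "readings", Reading)
        table.append([(1, 20.5), (2, 21.0), (3, 19.8), (4, 22.3)])
        table.flush()

        # Roughly: SELECT id FROM readings WHERE temperature > 20.9
        warm_ids = sorted(
            int(row["id"]) for row in table.where("temperature > 20.9")
        )

        # Index the column, then run the same condition again;
        # read_where returns the matching rows as a structured array.
        table.cols.temperature.create_index()
        is_indexed = table.cols.temperature.is_indexed
        warm_rows = table.read_where("temperature > 20.9")
        warm_ids_indexed = sorted(warm_rows["id"].tolist())

print(warm_ids, warm_ids_indexed, is_indexed)
```

On a table this small the index makes no practical difference; it only
starts to pay off on large tables.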
--
Francesc Alted