[pytables-users] Sum over column with tables.Expr?
Dav Clark
2016-05-17 22:40:00 UTC
Scopatz already seemingly answered this question here:


So, I tried some stuff. None of it works in a way that makes sense. For
example, I have a table in `spy`:

# This works and computes what I want, but it takes a few seconds, and
# I need to do this literally gabillions of times
np.mean(list(x['Bid_Price'] for x in spy.iterrows()))

# These are errors

# So I get the column in a variable:
spy_bp = spy.cols.Bid_Price

# This returns the column - each element

# These are "ValueError: reduction axis is out of bounds"
tb.Expr('sum(spy_bp, axis=0)')
tb.Expr('sum(spy_bp, axis=2)')

# So I can do this...
# But that's bizarre and also slower than the first line above

# This is also slower (but does the same general thing)

So what was Scopatz suggesting in the above link?

You received this message because you are subscribed to the Google Groups "pytables-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pytables-users+***@googlegroups.com.
To post to this group, send an email to pytables-***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Francesc Alted
2016-05-19 09:35:43 UTC
Hi Dav,

Yes, it seems that PyTables does not implement reductions well via numexpr:

In [1]: import numpy as np

In [2]: import tables

In [3]: f = tables.open_file("test.h5", "w")

In [4]: a = f.create_array('/', 'a', np.array([1,2,3]))

In [5]: tables.Expr('sum(a)').eval()
Out[5]: array([6, 6, 6]) # nope!

I put more effort into implementing reductions in the bcolz package:

In [6]: import bcolz

In [7]: a2 = bcolz.carray([1,2,3])

In [8]: bcolz.eval('sum(a2)')
Out[8]: 6

A possible solution is to move the reduction support in bcolz to PyTables.
Could you open a ticket on this?
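
Until reductions work in tables.Expr, one possible workaround is to reduce the column in chunks with plain NumPy rather than iterating row by row. The following is only a minimal sketch and not part of PyTables: `chunked_mean` and `chunk_size` are hypothetical names, and the code assumes the column object supports `len()` and slice indexing that returns a NumPy array (as a table column such as `spy.cols.Bid_Price` should). A NumPy array stands in for the column so the example runs without an HDF5 file.

```python
import numpy as np


def chunked_mean(column, chunk_size=1_000_000):
    """Mean of a 1-D sliceable column, reading chunk_size items at a time.

    Hypothetical helper: `column` is assumed to support len() and slicing.
    """
    n = len(column)
    if n == 0:
        raise ValueError("cannot take the mean of an empty column")
    total = 0.0
    for start in range(0, n, chunk_size):
        # Only this slice is materialized in memory on each iteration.
        chunk = np.asarray(column[start:start + chunk_size], dtype=np.float64)
        total += chunk.sum()
    return total / n


# Stand-in data; with a real table one would pass spy.cols.Bid_Price instead.
prices = np.array([1.0, 2.0, 3.0, 4.0])
print(chunked_mean(prices, chunk_size=3))
```

If the whole column fits in memory, `spy.cols.Bid_Price[:].mean()` is simpler; the chunked version only matters when it does not.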

Francesc Alted
Dav Clark
2016-05-23 19:17:54 UTC
I've created an issue here:

