Evan
2016-10-16 05:00:15 UTC
It appears I am indexing on all columns instead of one or two; I believe
the latter would be far better optimized for searches.
With the following code (based on the example in the
introduction: http://www.pytables.org/usersguide/introduction.html), I am
trying to index only on columns COL1 and COL2:
    class MyTable(IsDescription):
        COL1 = Int16Col()
        COL2 = Int16Col()
        COL3 = StringCol(64)
        COL4 = StringCol(64)
        COL5 = StringCol(64)
        COL6 = StringCol(64)
        COL7 = Int32Col()

    # Open a file in write mode
    h5file = open_file("file1.h5", mode="w")
    my_key = "key"
    # Create group
    group = h5file.create_group("/", "my_table")
    table = h5file.create_table(group, my_key, MyTable, "table of values")
    row = table.row
    # user decides which indices to create
    field1 = "COL1"  # create index on column 1, COL1
    field2 = "COL2"  # create index on column 2, COL2
    # import dictionary 'dictionary1'
    for dict in dictionary1:
        row["COL1"] = dict["COL1"]
        row["COL2"] = dict["COL2"]
        row["COL3"] = dict["COL3"]
        row["COL4"] = dict["COL4"]
        row["COL5"] = dict["COL5"]
        row["COL6"] = dict["COL6"]
        row["COL7"] = dict["COL7"]
        # This injects the Record values
        table.cols.field1.create_index()
        table.cols.field2.create_index()
        row.append()
    # Flush the table buffers
    table.flush()
This raises the following error:
"ValueError: Index(6, medium, shuffle, zlib(1)).is_csi=False for
column 'COL1' already exists. If you want to re-create it, please, try
with reindex() method better"
Where above am I indexing on all columns? Surely I would have much faster
queries if I only indexed on one/two columns, and queried those, right?
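
For comparison, here is a minimal sketch of what indexing only COL1 and COL2 might look like. It rests on two assumptions about the code above: that the "already exists" error comes from calling create_index() on every pass through the loop, and that table.cols.field1 looks up a column literally named "field1" rather than the value of the variable. The helper build_indexed_table, the trimmed column set, and the sample records standing in for dictionary1 are hypothetical, chosen only for illustration.

```python
import os
import tempfile

from tables import IsDescription, Int16Col, Int32Col, StringCol, open_file


class MyTable(IsDescription):
    # Trimmed to four columns to keep the sketch short.
    COL1 = Int16Col()
    COL2 = Int16Col()
    COL3 = StringCol(64)
    COL7 = Int32Col()


def build_indexed_table(path, records, index_fields):
    """Write records to a new table, then index only the named columns."""
    with open_file(path, mode="w") as h5file:
        group = h5file.create_group("/", "my_table")
        table = h5file.create_table(group, "key", MyTable, "table of values")
        row = table.row
        for record in records:
            for name in ("COL1", "COL2", "COL3", "COL7"):
                row[name] = record[name]
            row.append()
        table.flush()
        # Create each index exactly once, after all rows are written.
        # getattr() resolves the column from the string held in `name`.
        for name in index_fields:
            getattr(table.cols, name).create_index()
        return sorted(table.indexedcolpathnames)


# Hypothetical sample data standing in for dictionary1.
sample = [
    {"COL1": i, "COL2": i * 2, "COL3": "row%d" % i, "COL7": i * 100}
    for i in range(5)
]
path = os.path.join(tempfile.mkdtemp(), "file1.h5")
indexed = build_indexed_table(path, sample, ["COL1", "COL2"])
print(indexed)
```

If this reading is right, only the columns passed in index_fields end up indexed, and the remaining columns are stored without an index.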
--
You received this message because you are subscribed to the Google Groups "pytables-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pytables-users+***@googlegroups.com.
To post to this group, send an email to pytables-***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.