2016-07-21 45 views
0

我有一個表(其中包括)下列:HDFStore:選擇是否列在陣列

>>> hdf.select('foo').columns 
Out[22]: 
Index(['bar', 'units'], 
     dtype='object') 

現在我想選擇那些bar有兩個值之一:

myBar = ['1500013010', '1500002071'] 
hdf.select('foo', 'bar in [{}]'.format(', '.join(myBar))) 

但我得到了這個異常,我暗示我不能使用「bar」作爲變量。

全部可變refrences的必須是 基準的軸(例如,「指數」或「列」),或一個data_column 當前定義的參考資料包括:索引,列

但不是專欄嗎?

Traceback (most recent call last): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4593, in generate 
    return Expr(where, queryables=q, encoding=self.table.encoding) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 516, in __init__ 
    self.terms = self.parse() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 726, in parse 
    return self._visitor.visit(self.expr) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 316, in visit_Module 
    return self.visit(expr, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 319, in visit_Expr 
    return self.visit(node.value, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 627, in visit_Compare 
    return self.visit(binop) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 400, in visit_BinOp 
    op, op_class, left, right = self._possibly_transform_eq_ne(node) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 351, in _possibly_transform_eq_ne 
    left = self.visit(node.left, side='left') 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 413, in visit_Name 
    return self.term_type(node.id, self.env, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 38, in __init__ 
    super(Term, self).__init__(name, env, side=side, encoding=encoding) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/ops.py", line 57, in __init__ 
    self._value = self._resolve_name() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 44, in _resolve_name 
    raise NameError('name {0!r} is not defined'.format(self.name)) 
NameError: name 'bar' is not defined 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code 
    exec(code_obj, self.user_global_ns, self.user_ns) 
    File "<ipython-input-21-75c9827e34f0>", line 1, in <module> 
    hdf.select('foo', 'bar in [{}]'.format(', '.join(bar))) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 680, in select 
    return it.get_result() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 1364, in get_result 
    results = self.func(self.start, self.stop, where) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 673, in func 
    columns=columns, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4021, in read 
    if not self.read_axes(where=where, **kwargs): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 3222, in read_axes 
    self.selection = Selection(self, where=where, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4580, in __init__ 
    self.terms = self.generate(where) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4605, in generate 
    .format(where, ','.join(q.keys())) 
ValueError: The passed where expression: bar in [1500013010, 1500002071] 
      contains an invalid variable reference 
      all of the variable refrences must be a reference to 
      an axis (e.g. 'index' or 'columns'), or a data_column 
      The currently defined references are: index,columns 

回答

3

您的列(或多個)不被索引,因此不能搜索的,所以你不能在where參數使用它們。

演示:

In [131]: df = pd.DataFrame(np.random.randint(0,20,size=(5, 3)), columns=list('ABC')) 

In [132]: df 
Out[132]: 
    A B C 
0 19 4 18 
1 4 14 16 
2 17 13 9 
3 19 9 13 
4 16 8 10 

In [133]: fn = 'C:/temp/test.h5' 

In [134]: store = pd.HDFStore(fn) 

In [135]: store.append('df', df) 

In [136]: store.select('df', 'B > 10') 
--------------------------------------------------------------------------- 
... 
NameError: name 'B' is not defined 

During handling of the above exception, another exception occurred: 
... 
ValueError: The passed where expression: B > 10 
      contains an invalid variable reference 
      all of the variable refrences must be a reference to 
      an axis (e.g. 'index' or 'columns'), or a data_column 
      The currently defined references are: index,columns 

現在,讓我們嘗試用一個索引列:

In [137]: store.append('df_indexed', df, data_columns=True) 

In [139]: store.select('df_indexed', 'B > 10') 
Out[139]: 
    A B C 
1 4 14 16 
2 17 13 9 

如何檢查是否列被索引與否:

In [154]: store.get_storer('df_indexed').table.colindexes 
Out[154]: 
{ 
    "C": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "B": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "A": Index(6, medium, shuffle, zlib(1)).is_csi=False} 

In [155]: store.get_storer('df').table.colindexes 
Out[155]: 
{ 
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False}