2017-05-19 61 views
1

Python - pandas: how to select by date

Why can I select by month in this case, but not by date?

dates = pd.date_range(start="01/01/1931", end="01/02/1941")
new_df_4 = new_df_3.reindex(dates)
new_df_4["1931-10"]


However, this does not work:

new_df_4["1931-10-02"] 

KeyError                                  Traceback (most recent call last)
 in ()
----> 1 new_df_4["1931-10-02"]

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key) 
    1990    return self._getitem_multilevel(key) 
    1991   else: 
-> 1992    return self._getitem_column(key) 
    1993 
    1994  def _getitem_column(self, key): 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in _getitem_column(self, key) 
    2002   result = self._constructor(self._data.get(key)) 
    2003   if result.columns.is_unique: 
-> 2004    result = result[key] 
    2005 
    2006   return result 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key) 
    1990    return self._getitem_multilevel(key) 
    1991   else: 
-> 1992    return self._getitem_column(key) 
    1993 
    1994  def _getitem_column(self, key): 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in _getitem_column(self, key) 
    1997   # get column 
    1998   if self.columns.is_unique: 
-> 1999    return self._get_item_cache(key) 
    2000 
    2001   # duplicate columns & possible reduce dimensionality 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_item_cache(self, item) 
    1343   res = cache.get(item) 
    1344   if res is None: 
-> 1345    values = self._data.get(item) 
    1346    res = self._box_item_values(item, values) 
    1347    cache[item] = res 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in get(self, item, fastpath) 
    3223 
    3224    if not isnull(item): 
-> 3225     loc = self.items.get_loc(item) 
    3226    else: 
    3227     indexer = np.arange(len(self.items))[isnull(self.items)] 

/Users/romain/anaconda/lib/python2.7/site-packages/pandas/indexes/base.pyc in get_loc(self, key, method, tolerance) 
    1876     return self._engine.get_loc(key) 
    1877    except KeyError: 
-> 1878     return self._engine.get_loc(self._maybe_cast_indexer(key)) 
    1879 
    1880   indexer = self.get_indexer([key], method=method, tolerance=tolerance) 

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4027)() 

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3891)() 

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12408)() 

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12359)() 

KeyError: '1931-10-02' 

Answers

3

To select by month, use partial string indexing:

print (new_df_4["1931-10"]) 

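This does not work when the string has the same resolution as the index (from the same docs):

Warning: However, if the string is treated as an exact match, the selection in DataFrame's [] will be column-wise and not row-wise, see Indexing Basics. For example dft_minute['2011-12-31 23:59'] will raise KeyError as '2012-12-31 23:59' has the same resolution as the index and there is no column with such name. To always have unambiguous selection, whether the row is treated as a slice or a single selection, use .loc.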

In [95]: dft_minute.loc['2011-12-31 23:59'] 
Out[95]: 
a 1 
b 4 
Name: 2011-12-31 23:59:00, dtype: int64 
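As a rough sketch of the rule quoted above, applied to a daily index like the one in the question: the frame `df` and its column `a` are made up for illustration, and only `.loc` is used for row selection so the behaviour does not depend on how plain `[]` treats month strings in a given pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical daily frame covering the same range as in the question.
dates = pd.date_range(start="1931-01-01", end="1941-01-02")
df = pd.DataFrame({"a": np.arange(len(dates))}, index=dates)

# A month string is coarser than the daily index,
# so .loc returns every row that falls in that month.
october = df.loc["1931-10"]

# A day string has the same resolution as the index,
# so .loc returns that single row as a Series.
one_day = df.loc["1931-10-02"]

# Plain [] with a day string is a column lookup,
# and no column has that name, hence the KeyError.
try:
    df["1931-10-02"]
    raised = False
except KeyError:
    raised = True

print(len(october))
print(one_day)
print(raised)
```

The point of the sketch is that `.loc` always means "rows", whatever the precision of the string.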

If you need to select by date, use loc:

new_df_4.loc["1931-10-02"] 

Sample:

import numpy as np
import pandas as pd

np.random.seed(10)
dates = pd.date_range(start = "01/01/1931" , end = "01/02/1941") 
new_df_4 = pd.DataFrame({'a':np.random.randint(10, size=len(dates))}, index=dates) 
print (new_df_4.head()) 
      a 
1931-01-01 9 
1931-01-02 4 
1931-01-03 0 
1931-01-04 1 
1931-01-05 9 

print (new_df_4["1931-10"]) 
      a 
1931-10-01 9 
1931-10-02 6 
1931-10-03 9 
1931-10-04 7 
1931-10-05 8 
1931-10-06 0 
1931-10-07 9 
1931-10-08 6 
1931-10-09 0 
1931-10-10 1 
1931-10-11 0 
... 

print (new_df_4.loc["1931-10-02"]) 
a 6 
Name: 1931-10-02 00:00:00, dtype: int32 
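
A couple of related variants, in case they are useful. This is only a sketch on a made-up frame (`df` with a column `a`), not part of the original sample: a `pd.Timestamp` key selects the same row as the string, and a string slice keeps the result as a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical daily frame covering the same range as in the question.
dates = pd.date_range(start="1931-01-01", end="1941-01-02")
df = pd.DataFrame({"a": np.arange(len(dates))}, index=dates)

# A Timestamp key selects the same single row as the string "1931-10-02".
row = df.loc[pd.Timestamp("1931-10-02")]

# A string slice includes both endpoints and returns a DataFrame,
# even when it covers a single day.
one_day_frame = df.loc["1931-10-02":"1931-10-02"]
few_days = df.loc["1931-10-02":"1931-10-05"]

print(row)
print(one_day_frame)
print(few_days)
```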
+0

This is a workaround, but it does not answer the question :-/ –

+0

Sorry, I edited the answer. – jezrael

+1

@ayhan - Thank you. – jezrael