2013-11-01

Is the following behavior expected or a bug? Slicing on a pandas temporal index when the slice start and end may be outside the bounds

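I have a process where I need rows from a DataFrame, but under boundary conditions a simple rule (all rows for the previous 5 days) produces a selection that lies partly or wholly outside the index. I would like pandas to behave like Python and always return a frame, even if it sometimes has no rows.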
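For reference, the Python behavior meant here is that slicing a built-in list clamps out-of-range bounds instead of raising. A minimal illustration, with made-up values standing in for rows:

```python
# Illustrative values only; they stand in for the rows of a frame.
rows = [100, 101, 102]

# Slice partly outside the list: the bounds are clamped, no error is raised.
partly_outside = rows[1:10]   # [101, 102]

# Slice wholly outside the list: an empty list comes back.
wholly_outside = rows[5:10]   # []

# Plain indexing, by contrast, does raise for an out-of-range position.
try:
    rows[5]
    raised = False
except IndexError:
    raised = True
```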

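The index is a PeriodIndex and the data is sorted.

Configuration is pandas 0.12, numpy 1.7, 64-bit Windows.

In testing, df.loc raises an index error if the requested slice is not entirely within the index.

df[start:end] returns a frame, but not always with the rows I expected.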

import pandas as pd
october = pd.period_range(start='20131001', end='20131010', freq='D')
oct_sales = pd.DataFrame(dict(units=[100 + i for i in range(10)]), index=october)

# returns empty frame as desired
oct_sales['2013-09-01':'2013-09-30']

# empty dataframe -- I was expecting two rows
oct_sales['2013-09-30':'2013-10-02']

# works as expected
oct_sales['2013-10-01':'2013-10-02']

# same as oct_sales['2013-10-02':] -- expected no rows
oct_sales['2013-10-02':'2013-09-30']

Answer

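This is expected. Slicing on labels (start:end) only works if the labels exist. To get what I think you are after, reindex over the entire period, select, then dropna. That said, the loc behavior of raising is correct, while [] indexing should work (maybe a bug).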

In [23]: idx = pd.period_range(start='20130901', end='20131010', freq='D')

In [24]: oct_sales.reindex(idx) 
Out[24]: 
      units 
2013-09-01 NaN 
2013-09-02 NaN 
2013-09-03 NaN 
2013-09-04 NaN 
2013-09-05 NaN 
2013-09-06 NaN 
2013-09-07 NaN 
2013-09-08 NaN 
2013-09-09 NaN 
2013-09-10 NaN 
2013-09-11 NaN 
2013-09-12 NaN 
2013-09-13 NaN 
2013-09-14 NaN 
2013-09-15 NaN 
2013-09-16 NaN 
2013-09-17 NaN 
2013-09-18 NaN 
2013-09-19 NaN 
2013-09-20 NaN 
2013-09-21 NaN 
2013-09-22 NaN 
2013-09-23 NaN 
2013-09-24 NaN 
2013-09-25 NaN 
2013-09-26 NaN 
2013-09-27 NaN 
2013-09-28 NaN 
2013-09-29 NaN 
2013-09-30 NaN 
2013-10-01 100 
2013-10-02 101 
2013-10-03 102 
2013-10-04 103 
2013-10-05 104 
2013-10-06 105 
2013-10-07 106 
2013-10-08 107 
2013-10-09 108 
2013-10-10 109 

In [25]: oct_sales.reindex(idx)['2013-09-30':'2013-10-02'] 
Out[25]: 
      units 
2013-09-30 NaN 
2013-10-01 100 
2013-10-02 101 

In [26]: oct_sales.reindex(idx)['2013-09-30':'2013-10-02'].dropna() 
Out[26]: 
      units 
2013-10-01 100 
2013-10-02 101
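
The reindex, select, dropna steps above could be wrapped in a small helper. The following is only a sketch: `rows_between` and `rows_between_mask` are hypothetical names, not pandas functions, and both assume a sorted PeriodIndex whose frequency matches the `freq` argument. The second function swaps in a different technique, a boolean mask on the index, which is not the recipe from this answer.

```python
import pandas as pd

# Same data as in the question.
october = pd.period_range(start="2013-10-01", end="2013-10-10", freq="D")
oct_sales = pd.DataFrame({"units": [100 + i for i in range(10)]}, index=october)


def rows_between(frame, start, end, freq="D"):
    """Hypothetical helper: rows labelled start..end inclusive, never raising.

    Follows the reindex -> dropna recipe. Assumes `frame` has a PeriodIndex
    of frequency `freq`.
    """
    first, last = pd.Period(start, freq=freq), pd.Period(end, freq=freq)
    if last < first:
        return frame.iloc[0:0]  # reversed bounds: empty frame, same columns
    window = pd.period_range(start=first, end=last)
    # Labels missing from `frame` become all-NaN rows, which dropna removes.
    # Caveat: rows that were already all-NaN in `frame` are removed as well.
    return frame.reindex(window).dropna(how="all")


def rows_between_mask(frame, start, end, freq="D"):
    """Hypothetical alternative: compare the index against the bounds."""
    first, last = pd.Period(start, freq=freq), pd.Period(end, freq=freq)
    keep = (frame.index >= first) & (frame.index <= last)
    return frame[keep]


straddling = rows_between(oct_sales, "2013-09-30", "2013-10-02")
before = rows_between(oct_sales, "2013-09-01", "2013-09-30")
```

Here `straddling` holds the two October rows and `before` is an empty frame. The reindex route turns integer columns into floats because of the intermediate NaN values; the mask variant keeps the original dtypes and leaves existing NaN rows alone.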