如果sort_index
無法使用,那麼它是一個有點複雜 - 需要通過split
創建幫手DataFrame
,然後sort_values
和最後reindex
:
idx = pd.Index([u'2015-03-28_p001_2', u'2015-03-29_p001_2',
u'2015-03-30_p001_2', u'2015-03-31_p001_2',
u'2015-03-31_p002_3', u'2015-04-01_p001_2',
u'2015-04-01_p002_3', u'2015-04-02_p001_2',
u'2015-04-02_p002_3', u'2015-04-03_p001_2',
u'2016-03-31_p127_1', u'2016-04-01_p127_1',
u'2016-04-01_p128_3', u'2016-04-02_p127_1',
u'2016-04-02_p128_3', u'2016-04-03_p127_1',
u'2016-04-03_p128_3', u'2016-04-04_p127_1',
u'2016-04-05_p127_1', u'2016-04-06_p127_1'])
df = pd.DataFrame({'a':range(len(idx))}, index=idx)
print (df)
a
2015-03-28_p001_2 0
2015-03-29_p001_2 1
2015-03-30_p001_2 2
2015-03-31_p001_2 3
2015-03-31_p002_3 4
2015-04-01_p001_2 5
2015-04-01_p002_3 6
2015-04-02_p001_2 7
2015-04-02_p002_3 8
2015-04-03_p001_2 9
2016-03-31_p127_1 10
2016-04-01_p127_1 11
2016-04-01_p128_3 12
2016-04-02_p127_1 13
2016-04-02_p128_3 14
2016-04-03_p127_1 15
2016-04-03_p128_3 16
2016-04-04_p127_1 17
2016-04-05_p127_1 18
2016-04-06_p127_1 19
df = df.sort_index()
print (df)
a
2015-03-28_p001_2 0
2015-03-29_p001_2 1
2015-03-30_p001_2 2
2015-03-31_p001_2 3
2015-03-31_p002_3 4
2015-04-01_p001_2 5
2015-04-01_p002_3 6
2015-04-02_p001_2 7
2015-04-02_p002_3 8
2015-04-03_p001_2 9
2016-03-31_p127_1 10
2016-04-01_p127_1 11
2016-04-01_p128_3 12
2016-04-02_p127_1 13
2016-04-02_p128_3 14
2016-04-03_p127_1 15
2016-04-03_p128_3 16
2016-04-04_p127_1 17
2016-04-05_p127_1 18
2016-04-06_p127_1 19
df1 = df.index.to_series().str.split('_', expand=True)
df1[0] = pd.to_datetime(df1[0])
#if necessary change order columns for sorting
df1 = df1.sort_values(by=[1,2,0])
print (df1)
0 1 2
2015-03-28_p001_2 2015-03-28 p001 2
2015-03-29_p001_2 2015-03-29 p001 2
2015-03-30_p001_2 2015-03-30 p001 2
2015-03-31_p001_2 2015-03-31 p001 2
2015-04-01_p001_2 2015-04-01 p001 2
2015-04-02_p001_2 2015-04-02 p001 2
2015-04-03_p001_2 2015-04-03 p001 2
2015-03-31_p002_3 2015-03-31 p002 3
2015-04-01_p002_3 2015-04-01 p002 3
2015-04-02_p002_3 2015-04-02 p002 3
2016-03-31_p127_1 2016-03-31 p127 1
2016-04-01_p127_1 2016-04-01 p127 1
2016-04-02_p127_1 2016-04-02 p127 1
2016-04-03_p127_1 2016-04-03 p127 1
2016-04-04_p127_1 2016-04-04 p127 1
2016-04-05_p127_1 2016-04-05 p127 1
2016-04-06_p127_1 2016-04-06 p127 1
2016-04-01_p128_3 2016-04-01 p128 3
2016-04-02_p128_3 2016-04-02 p128 3
2016-04-03_p128_3 2016-04-03 p128 3
df = df.reindex(df1.index)
print (df)
a
2015-03-28_p001_2 0
2015-03-29_p001_2 1
2015-03-30_p001_2 2
2015-03-31_p001_2 3
2015-04-01_p001_2 5
2015-04-02_p001_2 7
2015-04-03_p001_2 9
2015-03-31_p002_3 4
2015-04-01_p002_3 6
2015-04-02_p002_3 8
2016-03-31_p127_1 10
2016-04-01_p127_1 11
2016-04-02_p127_1 13
2016-04-03_p127_1 15
2016-04-04_p127_1 17
2016-04-05_p127_1 18
2016-04-06_p127_1 19
2016-04-01_p128_3 12
2016-04-02_p128_3 14
2016-04-03_p128_3 16
編輯:
如果重複,那麼有必要創建新列,排序和最後一滴他們:
df[[0,1,2]] = df.index.to_series().str.split('_', expand=True)
df[0] = pd.to_datetime(df[0])
df = df.sort_values(by=[1,2,0])
df = df.drop([0,1,2], axis=1)
print (df)
a
2015-03-28_p001_2 0
2015-03-29_p001_2 1
2015-03-30_p001_2 2
2015-03-31_p001_2 3
2015-04-01_p001_2 5
2015-04-02_p001_2 7
2015-04-03_p001_2 9
2015-03-31_p002_3 4
2015-04-01_p002_3 6
2015-04-02_p002_3 8
2016-03-31_p127_1 10
2016-04-01_p127_1 11
2016-04-02_p127_1 13
2016-04-03_p127_1 15
2016-04-04_p127_1 17
2016-04-05_p127_1 18
2016-04-06_p127_1 19
2016-04-01_p128_3 12
2016-04-02_p128_3 14
2016-04-03_p128_3 16
[Python,pandas:如何按索引對數據框進行排序]的可能重複(https://stackoverflow.com/questions/22211737/python-pandas-how-to-sort-dataframe-by-index) – Zero