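Compare rows and take rows out if necessary

I have a sample DataFrame, shown below. I need to compare its rows and take some of them out where necessary.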

import pandas as pd

# Both columns hold strings, exactly as in the original sample
df = pd.DataFrame({
    'Area': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'],
    'Distance': ['19626207', '20174412', '20175112', '19396352',
                 '19391124', '19851396', '19221462', '20195112',
                 '21127633', '19989793'],
})

    Area Distance 
0 1 19626207 
1 2 20174412 
2 3 20175112 
3 4 19396352 # smaller, take out 
4 5 19391124 # 
5 6 19851396 # 
6 7 19221462 # 
7 8 20195112 
8 9 21127633 
9 10 19989793 #

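The 'Distance' column needs to end up in ascending order.

However, the order of the DataFrame is fixed (the order of 'Area' must not change).

This means that if a row is smaller than the rows kept before it, that row has to be taken out. For example, this is the result I want to see: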

Area Distance 
    1 19626207 
    2 20174412 
    3 20175112 
    8 20195112 
    9 21127633

I know I could try something like for i in range(0, len(index), 1) ......

But is there an easier way to achieve this with pandas?

Any hints?

Source

2016-04-25 Sakura

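UPDATE2: here is ayhan's solution, which works correctly (shown on the DataFrame with duplicate values from the UPDATE section below):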

In [135]: df[df.Distance.astype("int64")>=df.Distance.astype("int64").cummax()] 
Out[135]: 
    Area Distance 
0 1 19626207 
1 2 20174412 
2 3 20174412 
7 8 20195112 
8 9 21127633
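
The one-liner above can be unpacked into a small, self-contained sketch. The helper name keep_running_max_rows is hypothetical (it is not part of pandas or of the original answer), and the sample data is the version with a duplicated Distance value. It converts the string column to integers once, then keeps only the rows whose value is at least the running maximum so far.

```python
import pandas as pd


def keep_running_max_rows(frame, column):
    """Hypothetical helper: keep rows whose value is >= the running maximum."""
    # Convert once so that the comparison is numeric, not string-based.
    values = frame[column].astype("int64")
    # A row survives only if nothing earlier in the column is larger than it.
    return frame[values >= values.cummax()]


# Sample data with a duplicated Distance at positions 1 and 2.
df = pd.DataFrame({
    "Area": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"],
    "Distance": ["19626207", "20174412", "20174412", "19396352",
                 "19391124", "19851396", "19221462", "20195112",
                 "21127633", "19989793"],
})

result = keep_running_max_rows(df, "Distance")
print(result)
```

Because the comparison is >= rather than >, rows that tie with the current maximum are kept, which is what preserves both duplicates.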

UPDATE：

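The solution below will not always work correctly, because it drops all duplicates. So if the original DF contains duplicate values, they will disappear from the result.

Here is an example: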

In [122]: df 
Out[122]: 
    Area Distance 
0 1 19626207 
1 2 20174412 # duplicates 
2 3 20174412 # they should BOTH be in the result set 
3 4 19396352 
4 5 19391124 
5 6 19851396 
6 7 19221462 
7 8 20195112 
8 9 21127633 
9 10 19989793 

In [123]: df.loc[df.Distance.cummax().drop_duplicates().index] 
Out[123]: 
    Area Distance 
0 1 19626207 
1 2 20174412 # one duplicate has been dropped 
7 8 20195112 
8 9 21127633
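
To make the difference concrete, here is a minimal sketch that runs both approaches on the same data. It assumes the Distance column is converted to integers first (the original session works on the string column directly); the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"],
    "Distance": ["19626207", "20174412", "20174412", "19396352",
                 "19391124", "19851396", "19221462", "20195112",
                 "21127633", "19989793"],
})

distance = df["Distance"].astype("int64")
running_max = distance.cummax()

# Old approach: keep the first row at which each running maximum appears.
# A row that merely ties with the current maximum is dropped.
via_drop_duplicates = df.loc[running_max.drop_duplicates().index]

# Comparison approach: keep every row that reaches the running maximum,
# so ties are kept as well.
via_comparison = df[distance >= running_max]

print(via_drop_duplicates.index.tolist())
print(via_comparison.index.tolist())
```

The first list lacks index 2 (the second of the two equal values), while the second list includes it.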

PS I'll try to find a working solution

OLD answer:

I don't know whether it's the most efficient way, but it works:

In [94]: df.loc[df.Distance.cummax().drop_duplicates().index] 
Out[94]: 
    Area Distance 
0 1 19626207 
1 2 20174412 
2 3 20175112 
7 8 20195112 
8 9 21127633

Explanation:

In [98]: df.Distance.cummax() 
Out[98]: 
0 19626207 
1 20174412 
2 20175112 
3 20175112 
4 20175112 
5 20175112 
6 20175112 
7 20195112 
8 21127633 
9 21127633 
Name: Distance, dtype: object
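
Note the dtype: object in the output above: the Distance values are strings. In this sample every value has eight digits, so string order and numeric order agree, but that would not hold for values of different lengths. The sketch below uses made-up values (not from the question) to show the pitfall and a numeric running maximum via pd.to_numeric.

```python
import pandas as pd

# Made-up values with different digit counts, stored as strings.
raw = pd.Series(["9", "10", "8", "11"])

# Plain Python string comparison is lexicographic: "9" sorts after "10".
strings_misorder = "9" > "10"

# Converting to numbers first gives the intended running maximum.
numbers = pd.to_numeric(raw)
running_max = numbers.cummax()

print(strings_misorder)
print(running_max.tolist())
```

This is why the accepted approach casts with astype("int64") before calling cummax.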

Source

2016-04-25 19:18:54 MaxU

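I think you can check whether the current row is greater than or equal to the cummax: 'df[df.Distance.astype("int64") >= df.Distance.astype("int64").cummax()]' – ayhan

@ayhan, that's it! Please post it as an answer - it's your solution and it's better than mine. – MaxU

I think the main idea is 'cummax', which you came up with; the duplicates are a minor detail, so I think it would be better if you edit your answer. :) – ayhan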
