2015-09-22 75 views
1

I have a simple question: in Python, how do I get the values of a pandas DataFrame at every 10th step? I have the following DataFrame:

df = 
    time          lat   lon 
    0 2014-03-26 14:46:27.457233+00:00 48.7773  11.428897 
    1 2014-03-26 14:46:28.457570+00:00 48.7773  11.428719 
    2 2014-03-26 14:46:29.457665+00:00 48.7772  11.428542 
    3 2014-03-26 14:46:30.457519+00:00 48.7771  11.428368 
    4 2014-03-26 14:46:31.457855+00:00 48.7770  11.428193 
    5 2014-03-26 14:46:32.457950+00:00 48.7770  11.428018 
    6 2014-03-26 14:46:33.457794+00:00 48.7769  11.427842 
    7 2014-03-26 14:46:34.458131+00:00 48.7768  11.427668 
    8 2014-03-26 14:46:35.458246+00:00 48.7767  11.427501 
    9 2014-03-26 14:46:36.458069+00:00 48.7766  11.427350 
    10 2014-03-26 14:46:37.458416+00:00 48.7766  11.427224 
    11 2014-03-26 14:46:38.458531+00:00 48.7765  11.427129 
    12 2014-03-26 14:46:39.458355+00:00 48.7764  11.427062 
    13 2014-03-26 14:46:40.458702+00:00 48.7764  11.427011 
    14 2014-03-26 14:46:41.458807+00:00 48.7764  11.426963 
    15 2014-03-26 14:46:42.458640+00:00 48.7763  11.426918 
    16 2014-03-26 14:46:43.458977+00:00 48.7763  11.426872 
    17 2014-03-26 14:46:44.459102+00:00 48.7762  11.426822 
    18 2014-03-26 14:46:45.458926+00:00 48.7762  11.426766 
    19 2014-03-26 14:46:46.459262+00:00 48.7761  11.426702 
    20 2014-03-26 14:46:47.459378+00:00 48.7760  11.426628 

I would like to generate a new DataFrame, df1, containing the values at every 10th time step.

df1 = 
     time          lat   lon 
     0  2014-03-26 14:46:27.457233+00:00 48.7773  11.428897 
     9  2014-03-26 14:46:46.459262+00:00  48.7761  11.426702 
     19  2014-03-26 14:46:46.459262+00:00 48.7765  11.426787 
     ...  ...   ...     ...  .... 
     len(df) 2014-03-26 14:46:46.459262+00:00 48.7765  11.426787 

I tried something like:

df1 = df.iloc[[0:10:len(df)]] 
+1

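I don't understand your desired output. You have indices 0, 9, 19, which go up by 9 first and then by 10. Why not 0, 10, 20 (increasing by 10) or 0, 9, 18 (increasing by 9)? – DSM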

+0

Your slicing approach is the right idea and almost correct: use `df.iloc[::10]` to get every tenth row. (I strongly recommend that you *not* loop over the index.) –
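
To make the comment above concrete: a slice is written start:stop:step, so the attempt in the question put the step in the middle position and wrapped the slice in an extra pair of brackets, which Python rejects as a syntax error. A minimal sketch, assuming a made-up 21-row frame named `demo` (not the asker's data):

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: 21 rows, positions 0..20.
demo = pd.DataFrame({"step": range(21)})

# Slice syntax is start:stop:step, so the step goes last.
explicit = demo.iloc[0:len(demo):10]

# Omitting start and stop falls back to the defaults (first row, end of frame).
short = demo.iloc[::10]

print(short.index.tolist())  # [0, 10, 20]
```

Both forms select the same rows; the short form is simply the more common spelling.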

Answers

0

How about `df.loc[[i for j, i in enumerate(df.index) if j % 10 == 0]]`?
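
A quick check of what this expression selects: it collects the index labels found at positions 0, 10, 20, ... and looks them up with `loc`. The sketch below assumes a hypothetical frame `demo` whose labels are unique and deliberately different from the row positions, to show that the rows are chosen by position and match a positional slice with step 10:

```python
import pandas as pd

# Hypothetical frame whose index labels (100, 101, ...) differ from row positions.
demo = pd.DataFrame({"value": range(25)}, index=range(100, 125))

# The approach above: keep the label found at every 10th position.
labels = [label for pos, label in enumerate(demo.index) if pos % 10 == 0]
by_labels = demo.loc[labels]

# Positional slice with a step of 10, for comparison.
by_slice = demo.iloc[::10]

print(by_labels.index.tolist())  # [100, 110, 120]
print(by_slice.index.tolist())   # [100, 110, 120]
```

Note that the two only agree when the index labels are unique; with duplicated labels, `loc` returns every row carrying a selected label.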

+0

Perfect, this is what I was looking for. Thank you very much. – emax

6

Just slice the df using iloc and pass a step param. Slicing behaviour is explained here, but basically the third argument is the step size:

In [67]: 
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100,2)) 
df.iloc[::10] 

Out[67]: 
      0   1 
0 0.552160 -0.910893 
10 -2.173707 -0.659227 
20 0.811937 0.675416 
30 0.533533 0.336104 
40 1.093083 -0.943157 
50 -0.559221 0.272763 
60 -0.011628 1.002561 
70 -0.114501 0.457626 
80 1.355948 0.236342 
90 -0.151979 -0.746238 
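
The desired output in the question also seems to list a final row, and its indices hint at a possible start offset. If either is wanted, one possible sketch follows; the frame `demo` and the variable names are hypothetical, and the frame is assumed to be non-empty:

```python
import pandas as pd

# Hypothetical 25-row frame; the last position (24) is not a multiple of 10.
demo = pd.DataFrame({"value": range(25)})

# Every 10th position, plus the last row if the step skipped it.
# (Assumes demo has at least one row, otherwise positions[-1] fails.)
positions = list(range(0, len(demo), 10))
if positions[-1] != len(demo) - 1:
    positions.append(len(demo) - 1)
with_last = demo.iloc[positions]

# A start offset shifts which rows are picked: positions 9, 19, ...
offset = demo.iloc[9::10]

print(with_last.index.tolist())  # [0, 10, 20, 24]
print(offset.index.tolist())     # [9, 19]
```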
+0

Thanks. It works too. – emax