Pandas: iterate over rows starting from a specified row number

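I want to read data from a pandas DataFrame by iterating over its rows, starting from a specific row number. I know there is df.iterrows(), but it doesn't let me specify where the iteration should start.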

In my particular case, I have a CSV file that might look like this:

Date, Temperature 
21/08/2017 17:00:00,5.53 
21/08/2017 18:00:00,5.58 
21/08/2017 19:00:00,4.80 
21/08/2017 20:00:00,4.59 
21/08/2017 21:00:00,3.72 
21/08/2017 22:00:00,3.95 
21/08/2017 23:00:00,3.11 
22/08/2017 00:00:00,3.07 
22/08/2017 01:00:00,2.80 
22/08/2017 02:00:00,2.75 
22/08/2017 03:00:00,2.79 
22/08/2017 04:00:00,2.76 
22/08/2017 05:00:00,2.76 
22/08/2017 06:00:00,3.06 
22/08/2017 07:00:00,3.88

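I want to iterate over every row from a specific point in time onwards (let's say midnight on 22 August), so I tried to implement it like this: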

import pandas

df = pandas.read_csv('file.csv')
start_date = '22/08/2017 00:00:00'

# since it's sorted, I figured I could use binary search
result = pandas.Series(df['Date']).searchsorted(start_date)

result[0] actually gave me the correct row number.

I figured I could just increment that number and access each row via df.iloc[[x]], but that feels like a dirty way of doing it.

for x in range(result[0], len(df)): 
    row = df.loc[[x]]

All the answers I have found so far only show how to iterate over the whole table.

2017-08-31 Felix Jassler

Just filter your DataFrame before calling iterrows():

df['Date'] = pandas.to_datetime(df['Date']) 
for idx, row in df[df['Date'] >= '2017-08-22'].iterrows(): 
    # 
    # Whatever you want to do in the loop goes here 
    #

Note that there is no need to convert the filter argument '2017-08-22' to a datetime object, because pandas can handle partial string indexing.
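As a self-contained sketch of the filtering idea above: the sample rows are copied from the question and loaded from an in-memory string rather than 'file.csv', and the explicit `format` argument assumes every date follows the day-first layout shown there. The names `csv_text`, `selected` and `temperatures` are only illustrative.

```python
import io

import pandas as pd

# A few rows from the question's CSV (header written without the stray space).
csv_text = """Date,Temperature
21/08/2017 22:00:00,3.95
21/08/2017 23:00:00,3.11
22/08/2017 00:00:00,3.07
22/08/2017 01:00:00,2.80
"""

df = pd.read_csv(io.StringIO(csv_text))

# Assumption: all values use the day-first layout shown in the question.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y %H:%M:%S")

# Build the boolean mask once, then iterate only over the rows that pass it.
start = pd.Timestamp("2017-08-22")
selected = df[df["Date"] >= start]

temperatures = []
for idx, row in selected.iterrows():
    temperatures.append(row["Temperature"])

print(temperatures)
```

The filtered frame keeps the original index labels, so `idx` still refers to the row's position in the unfiltered data as long as the default index is in use.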

2017-08-31 21:16:45 kev8484

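+1, because it works even if the exact datetime I am looking for isn't in the table. Just a heads-up: it compares the strings alphabetically unless the column has been converted to datetime, in which case it works fine.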

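I was assuming that the 'Date' column you are talking about holds datetime objects rather than strings. You're right, that was an assumption on my part. I'll update the post. – kev8484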

I should have been more specific. Thanks for your help.
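The point raised in these comments can be illustrated with a small sketch. The two sample values are hypothetical and chosen so that the later date starts with a smaller digit; the day-first format string is an assumption based on the question's data.

```python
import pandas as pd

# Hypothetical values: the second one is later in time but starts with "0".
raw = pd.Series(["21/08/2017 23:00:00", "01/09/2017 00:00:00"])
cutoff = "22/08/2017 00:00:00"

# Comparing raw strings is lexicographic, so "01/09/..." sorts before "22/08/...".
as_strings = (raw >= cutoff).tolist()

# After parsing, the comparison is chronological.
parsed = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")
as_datetimes = (parsed >= pd.Timestamp("2017-08-22")).tolist()

print(as_strings)    # [False, False]
print(as_datetimes)  # [False, True]
```

With string data the 1 September row is silently dropped, which is why the column should be converted before filtering.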

Convert Date to datetime and set Date as the index:

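import pandas as pd

# The strings in the question are day-first, so say so explicitly
df.Date = pd.to_datetime(df.Date, dayfirst=True)
df = df.set_index('Date')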

Then:

for date, row in df['22/08/2017 00:00:00':].iterrows(): 
    print(date.strftime('%c'), row.squeeze()) 

Tue Aug 22 00:00:00 2017 3.07 
Tue Aug 22 01:00:00 2017 2.8 
Tue Aug 22 02:00:00 2017 2.75 
Tue Aug 22 03:00:00 2017 2.79 
Tue Aug 22 04:00:00 2017 2.76 
Tue Aug 22 05:00:00 2017 2.76 
Tue Aug 22 06:00:00 2017 3.06 
Tue Aug 22 07:00:00 2017 3.88
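A related sketch, assuming the index is sorted in ascending order as in the question: slicing a DatetimeIndex also seems to work when the start timestamp is not itself a row, in which case iteration begins at the next later row. The sample rows come from the question; `csv_text`, `start` and `seen` are illustrative names.

```python
import io

import pandas as pd

csv_text = """Date,Temperature
21/08/2017 23:00:00,3.11
22/08/2017 00:00:00,3.07
22/08/2017 01:00:00,2.80
22/08/2017 02:00:00,2.75
"""

df = pd.read_csv(io.StringIO(csv_text))
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y %H:%M:%S")
df = df.set_index("Date")

# 00:30 is not in the index; on a sorted index the slice starts at the next row.
start = pd.Timestamp("2017-08-22 00:30:00")

seen = []
for date, row in df.loc[start:].iterrows():
    seen.append((date.strftime("%H:%M"), row["Temperature"]))

print(seen)
```

Passing a `pd.Timestamp` rather than a day-first string avoids any ambiguity about how the slice bound is parsed.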

2017-08-31 21:13:33 piRSquared

Oh, neat. I hadn't thought of simply slicing the table.
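The slicing idea also applies to the row-number approach from the question. A minimal sketch, assuming the dates are sorted in ascending order and follow the day-first layout: find the starting position with a binary search on the parsed dates, then iterate over a positional slice instead of indexing row by row. `start_pos` and `rows_from_start` are illustrative names.

```python
import io

import pandas as pd

csv_text = """Date,Temperature
21/08/2017 23:00:00,3.11
22/08/2017 00:00:00,3.07
22/08/2017 01:00:00,2.80
22/08/2017 02:00:00,2.75
"""

df = pd.read_csv(io.StringIO(csv_text))
dates = pd.to_datetime(df["Date"], format="%d/%m/%Y %H:%M:%S")

# searchsorted is a binary search, so it relies on the dates being sorted.
# int() turns the result for a single query value into a plain integer.
start_pos = int(dates.searchsorted(pd.Timestamp("2017-08-22 00:00:00")))

# Slice by position once, then iterate over the remainder.
rows_from_start = [
    row["Temperature"] for _, row in df.iloc[start_pos:].iterrows()
]

print(start_pos, rows_from_start)
```

Because `iloc` slices by position, this works regardless of what the DataFrame's index labels are.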
