Python CSV: how to extract rows from a DataFrame by condition, edit the extracted data, and put it back into the DataFrame

Sample CSV data:

ID,AC_Input_Voltage,AC_Input_Current,DC_Output_Voltage,DC_Output_Current,DC_Output_Power,Input_Active_Power,Input_Reactive_Power,Input_Apparent_Power,Line_Frequency,DC_Ref,AC_Ref,Time_Stamp 
8301,418,13.2,34.4,136,4673,1,-1,5524.5,0,49,0,22/6/2017 05:11:00 
8301,419.3,2.3,0.7,-0.9,-0.6,1,-1,946.2,0,50,0,22/6/2017 05:11:01 
8301,417.7,15.2,30.3,196.5,5962,1,-1,6355,0,49,0,22/6/2017 05:11:02 
8301,418.7,2.3,0.7,-0.9,-0.6,1,-1,944.7,0,50,0,22/6/2017 05:11:03 
8301,419.3,3.4,53.6,10.8,580.2,1,-1,1432.8,0,49,0,22/6/2017 05:11:04 
8301,417.7,13.6,30.1,170.4,5122.7,1,-1,5681.8,0,50,0,22/6/2017 05:11:05 
8301,418,11.5,41.2,105,4328.2,1,-1,4796.9,0,49,0,22/6/2017 05:11:07 
8301,419.7,2.3,0.8,-0.9,-0.7,1,-1,946.9,0,51,0,22/6/2017 05:11:08 
8301,419.7,2.3,40.6,-0.7,-27.9,1,-1,974,0,49,0,22/6/2017 05:11:09 
8301,417.4,14.9,30.4,194.4,5903.8,1,-1,6215.4,0,51,0,22/6/2017 05:11:10 
8301,417.7,14.7,30.5,186.2,5682.9,1,-1,6139.5,0,49,0,22/6/2017 05:11:11 
8301,418,12,31.5,141.5,4456.9,1,-1,5012.5,0,51,0,22/6/2017 05:11:12 
8301,419,2.3,0.7,-1.4,-0.9,1,-1,945.4,0,49,0,22/6/2017 05:11:13 
8301,419,2.3,0.7,-0.9,-0.6,1,-1,945.4,0,50,0,22/6/2017 05:11:14 
8301,419.7,2.3,0.8,-0.9,-0.7,1,-1,946.9,0,50,0,22/6/2017 05:11:15 
8301,419,2.3,0.7,-0.9,-0.6,1,-1,945.4,0,49,0,22/6/2017 05:11:16 
8301,419,2.3,32.9,-0.2,-5.7,1,-1,972.4,0,51,0,22/6/2017 05:11:17 
8301,419.3,2.3,50.3,0.3,17.3,1,-1,973.2,0,49,0,22/6/2017 05:11:18 
8301,417.4,15.2,30.5,197.4,6010.5,1,-1,6350,0,50,0,22/6/2017 05:11:19 
8301,418.7,2.3,0.9,-0.9,-0.7,1,-1,944.7,0,49,0,22/6/2017 05:11:20 
8301,419,2.3,42.9,-0.2,-7.4,1,-1,972.4,0,50,0,22/6/2017 05:11:21 
8301,417.4,13.9,30.4,180,5477.6,1,-1,5811.8,0,49,0,22/6/2017 05:11:22 
8301,419.7,2.3,0.9,-0.9,-0.8,1,-1,946.9,0,50,0,22/6/2017 05:11:23 
8301,418.7,2.3,0.7,-0.9,-0.6,1,-1,944.7,0,50,0,22/6/2017 05:11:24 
8301,418.3,2.3,0.6,-0.9,-0.5,1,-1,943.9,0,49,0,22/6/2017 05:11:25

I tried the code below and managed to edit the data and put it into a new DataFrame (df_filter2):

import numpy as np 
from datetime import date,time,datetime 
import pandas as pd 
import csv 

df = pd.read_csv('Data.csv') 
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"]) # convert to Datetime 

def getMask(start,end): 
    mask = (df['Time_Stamp'] > start) & (df['Time_Stamp'] <= end) 
    return mask

start = '2017-06-22 05:00:00' 
end = '2017-06-22 05:20:00' 
timerange = df.loc[getMask(start, end)] 

df_filter = timerange[timerange["AC_Input_Current"].le(3.0)] # new df with AC_Input_Current less than or equal to 3.0
#print(df_filter) 

where = (df_filter[df_filter["Time_Stamp"].diff().dt.total_seconds() > 1] ["Time_Stamp"] - pd.Timedelta("1s")).astype(str).tolist() # Find where diff > 1 second 
df_filter2 = timerange[timerange["Time_Stamp"].isin(where)].copy() # Copy the matching rows so the edit below acts on an independent frame
#print(df_filter2) 
df_filter2["AC_Input_Current"] = 0.0 # Set AC_Input_Current to 0.0

#display spikes (high possibility of data being a spike) 
for index, row in df_filter2.iterrows(): 
    values = row.astype(str).tolist() 
    print(','.join(values))

Output (note: the edited rows below are in the DataFrame df_filter2):

8301,418.0,0.0,34.4,136.0,4673.0,1,-1,5524.5,0,49,0,2017-06-22 05:11:00 
8301,417.7,0.0,30.3,196.5,5962.0,1,-1,6355.0,0,49,0,2017-06-22 05:11:02 
8301,418.0,0.0,41.2,105.0,4328.2,1,-1,4796.9,0,49,0,2017-06-22 05:11:07 
8301,418.0,0.0,31.5,141.5,4456.9,1,-1,5012.5,0,51,0,2017-06-22 05:11:12 
8301,417.4,0.0,30.5,197.4,6010.5,1,-1,6350.0,0,50,0,2017-06-22 05:11:19 
8301,417.4,0.0,30.4,180.0,5477.6,1,-1,5811.8,0,49,0,2017-06-22 05:11:22

What I want is to put the output from df_filter2 back into the main DataFrame df, replacing the rows in df that have the same Time_Stamp. How can I do that?

Source

2017-08-25 Sancta Ignis

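Set Time_Stamp as the index of both DataFrames, then assign the df_filter2 values to df wherever the index matches.

First, make sure both DataFrames have Time_Stamp in the same format and use the same column names. For the sample data provided, I used: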

# copy df sample data from OP 
df = pd.read_clipboard(sep=",", parse_dates=["Time_Stamp"]) 
# now copy df_filter2 sample data 
df_filter2 = pd.read_clipboard(sep=",", header=None, names=df.columns, parse_dates=[12])
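If the data lives in a CSV file rather than on the clipboard, loading it could look roughly like the sketch below. The three-row `csv_text` and the reduced column set are stand-ins invented for illustration, and `io.StringIO` takes the place of a real path such as 'Data.csv'. Because the file stores dates as day/month/year, `dayfirst=True` is passed so the parse does not depend on guessing.

```python
import io

import pandas as pd

# Stand-in for the CSV file: same timestamp layout as the question's data,
# but only a few rows and columns (assumed for illustration).
csv_text = """ID,AC_Input_Current,Time_Stamp
8301,13.2,22/6/2017 05:11:00
8301,2.3,22/6/2017 05:11:01
8301,15.2,22/6/2017 05:11:02
"""

# With a real file, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Dates are written day/month/year, so state that explicitly.
df["Time_Stamp"] = pd.to_datetime(df["Time_Stamp"], dayfirst=True)

print(df.dtypes)
```

Once `df` is loaded this way, the question's own filtering code can produce `df_filter2` from it, so nothing needs to be pasted separately.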

Now set Time_Stamp as the index and replace the matching rows:

df = df.set_index("Time_Stamp") 
df_filter2 = df_filter2.set_index("Time_Stamp") 
df.loc[df_filter2.index] = df_filter2
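To see the index-based replacement in isolation, here is a minimal sketch on a three-row frame. The data and the name `edited` are made up for illustration; `edited` plays the role of df_filter2. The last line restores Time_Stamp as an ordinary column, which is optional.

```python
import pandas as pd

# Tiny stand-in for the main DataFrame (values assumed for illustration).
df = pd.DataFrame({
    "Time_Stamp": pd.to_datetime([
        "2017-06-22 05:11:01",
        "2017-06-22 05:11:02",
        "2017-06-22 05:11:03",
    ]),
    "AC_Input_Current": [2.3, 15.2, 2.3],
    "DC_Output_Voltage": [0.7, 30.3, 0.7],
})

# Extract one row, work on an independent copy, and edit it.
target = pd.Timestamp("2017-06-22 05:11:02")
edited = df.loc[df["Time_Stamp"] == target].copy()
edited["AC_Input_Current"] = 0.0

# Index both frames by Time_Stamp, then overwrite the matching rows.
df = df.set_index("Time_Stamp")
edited = edited.set_index("Time_Stamp")
df.loc[edited.index] = edited

# Optional: turn Time_Stamp back into a regular column.
df = df.reset_index()
print(df)
```

Note that `df` has to be reassigned after `set_index`, since that call returns a new frame rather than modifying the original.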

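UPDATE (per comments)
To be clear, here is a complete working example that starts from a data dictionary, builds df, and uses OP's code to generate df_filter2. Only minor modifications were made (e.g. defining Time_Stamp as pd.Timestamp in the original data, and adding .loc in places).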

# sample data 
import pandas as pd 
from pandas import Timestamp 

data = {'AC_Input_Current': {0: 13.199999999999999, 1: 2.2999999999999998,2: 15.199999999999999,3: 2.2999999999999998,4: 3.3999999999999999,5: 13.6,6: 11.5,7: 2.2999999999999998,8: 2.2999999999999998,9: 14.9,10: 14.699999999999999,11: 12.0,12: 2.2999999999999998,13: 2.2999999999999998,14: 2.2999999999999998,15: 2.2999999999999998,16: 2.2999999999999998,17: 2.2999999999999998,18: 15.199999999999999,19: 2.2999999999999998,20: 2.2999999999999998,21: 13.9,22: 2.2999999999999998,23: 2.2999999999999998,24: 2.2999999999999998}, 
'AC_Input_Voltage': {0: 418.0,1: 419.30000000000001,2: 417.69999999999999,3: 418.69999999999999,4: 419.30000000000001,5: 417.69999999999999,6: 418.0,7: 419.69999999999999,8: 419.69999999999999,9: 417.39999999999998,10: 417.69999999999999,11: 418.0,12: 419.0,13: 419.0,14: 419.69999999999999,15: 419.0,16: 419.0,17: 419.30000000000001,18: 417.39999999999998,19: 418.69999999999999,20: 419.0,21: 417.39999999999998,22: 419.69999999999999,23: 418.69999999999999,24: 418.30000000000001}, 
'DC_Output_Current': {0: 136.0,1: -0.90000000000000002,2: 196.5,3: -0.90000000000000002,4: 10.800000000000001,5: 170.40000000000001,6: 105.0,7: -0.90000000000000002,8: -0.69999999999999996,9: 194.40000000000001,10: 186.19999999999999,11: 141.5,12: -1.3999999999999999,13: -0.90000000000000002,14: -0.90000000000000002,15: -0.90000000000000002,16: -0.20000000000000001,17: 0.29999999999999999,18: 197.40000000000001,19: -0.90000000000000002,20: -0.20000000000000001,21: 180.0,22: -0.90000000000000002,23: -0.90000000000000002,24: -0.90000000000000002}, 
'DC_Output_Power': {0: 4673.0,1: -0.59999999999999998,2: 5962.0,3: -0.59999999999999998,4: 580.20000000000005,5: 5122.6999999999998,6: 4328.1999999999998,7: -0.69999999999999996,8: -27.899999999999999,9: 5903.8000000000002,10: 5682.8999999999996,11: 4456.8999999999996,12: -0.90000000000000002,13: -0.59999999999999998,14: -0.69999999999999996,15: -0.59999999999999998,16: -5.7000000000000002,17: 17.300000000000001,18: 6010.5,19: -0.69999999999999996,20: -7.4000000000000004,21: 5477.6000000000004,22: -0.80000000000000004,23: -0.59999999999999998,24: -0.5}, 
'DC_Output_Voltage': {0: 34.399999999999999,1: 0.69999999999999996,2: 30.300000000000001,3: 0.69999999999999996,4: 53.600000000000001,5: 30.100000000000001,6: 41.200000000000003,7: 0.80000000000000004,8: 40.600000000000001,9: 30.399999999999999,10: 30.5,11: 31.5,12: 0.69999999999999996,13: 0.69999999999999996,14: 0.80000000000000004,15: 0.69999999999999996,16: 32.899999999999999,17: 50.299999999999997,18: 30.5,19: 0.90000000000000002,20: 42.899999999999999,21: 30.399999999999999,22: 0.90000000000000002,23: 0.69999999999999996,24: 0.59999999999999998}, 
'DC_Ref': {0: 49,1: 50,2: 49,3: 50,4: 49,5: 50,6: 49,7: 51,8: 49,9: 51,10: 49,11: 51,12: 49,13: 50,14: 50,15: 49,16: 51,17: 49,18: 50,19: 49,20: 50,21: 49,22: 50,23: 50,24: 49}, 
'Input_Apparent_Power': {0: 5524.5,1: 946.20000000000005,2: 6355.0,3: 944.70000000000005,4: 1432.8,5: 5681.8000000000002,6: 4796.8999999999996,7: 946.89999999999998,8: 974.0,9: 6215.3999999999996,10: 6139.5,11: 5012.5,12: 945.39999999999998,13: 945.39999999999998,14: 946.89999999999998,15: 945.39999999999998,16: 972.39999999999998,17: 973.20000000000005,18: 6350.0,19: 944.70000000000005,20: 972.39999999999998,21: 5811.8000000000002,22: 946.89999999999998,23: 944.70000000000005,24: 943.89999999999998}, 
'Time_Stamp': {0: Timestamp('2017-06-22 05:11:00'),1: Timestamp('2017-06-22 05:11:01'),2: Timestamp('2017-06-22 05:11:02'),3: Timestamp('2017-06-22 05:11:03'),4: Timestamp('2017-06-22 05:11:04'),5: Timestamp('2017-06-22 05:11:05'),6: Timestamp('2017-06-22 05:11:07'),7: Timestamp('2017-06-22 05:11:08'),8: Timestamp('2017-06-22 05:11:09'),9: Timestamp('2017-06-22 05:11:10'),10: Timestamp('2017-06-22 05:11:11'),11: Timestamp('2017-06-22 05:11:12'),12: Timestamp('2017-06-22 05:11:13'),13: Timestamp('2017-06-22 05:11:14'),14: Timestamp('2017-06-22 05:11:15'),15: Timestamp('2017-06-22 05:11:16'),16: Timestamp('2017-06-22 05:11:17'),17: Timestamp('2017-06-22 05:11:18'),18: Timestamp('2017-06-22 05:11:19'),19: Timestamp('2017-06-22 05:11:20'),20: Timestamp('2017-06-22 05:11:21'),21: Timestamp('2017-06-22 05:11:22'),22: Timestamp('2017-06-22 05:11:23'),23: Timestamp('2017-06-22 05:11:24'),24: Timestamp('2017-06-22 05:11:25')}} 
df = pd.DataFrame(data)

Several columns have constant values:

df["AC_Ref"] = 0 
df["ID"] = 8301 
df["Input_Active_Power"] = 1 
df["Input_Reactive_Power"] = -1 
df["Line_Frequency"] = 0

Now construct df_filter2:

def getMask(start,end): 
    mask = (df['Time_Stamp'] > start) & (df['Time_Stamp'] <= end) 
    return mask
start = '2017-06-22 05:00:00' 
end = '2017-06-22 05:20:00' 
timerange = df.loc[getMask(start, end)] 
df_filter = timerange.loc[timerange["AC_Input_Current"].le(3.0)] 
where = (df_filter.loc[df_filter["Time_Stamp"].diff().dt.total_seconds() > 1, "Time_Stamp"] - pd.Timedelta("1s")).astype(str).tolist() 
df_filter2 = timerange.loc[timerange["Time_Stamp"].isin(where)].copy() 
df_filter2["AC_Input_Current"] = 0.0
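The spike detection above converts the timestamps to strings before calling `isin`. A possible variation, sketched here on made-up data, keeps them as timestamps throughout, which avoids relying on string-to-datetime matching. The names `low`, `gap` and `spike_times` are hypothetical, and the five-row frame is only a stand-in for the real data.

```python
import pandas as pd

# Stand-in data: the 05:11:02 reading sits alone between low-current readings.
df = pd.DataFrame({
    "Time_Stamp": pd.to_datetime([
        "2017-06-22 05:11:01",
        "2017-06-22 05:11:02",
        "2017-06-22 05:11:03",
        "2017-06-22 05:11:04",
        "2017-06-22 05:11:05",
    ]),
    "AC_Input_Current": [2.3, 15.2, 2.3, 2.3, 2.3],
})

# Rows at or below the 3.0 threshold used in the question.
low = df.loc[df["AC_Input_Current"].le(3.0)]

# A gap of more than one second between consecutive low-current rows means
# something else was recorded in between (the first row's NaT compares False).
gap = low["Time_Stamp"].diff() > pd.Timedelta("1s")

# As in the original code, take the timestamp one second before each row
# that follows a gap.
spike_times = low.loc[gap, "Time_Stamp"] - pd.Timedelta("1s")

# Copy the matching rows so the edit does not touch df.
df_filter2 = df.loc[df["Time_Stamp"].isin(spike_times)].copy()
df_filter2["AC_Input_Current"] = 0.0
print(df_filter2)
```

Like the original, this only looks one second before the end of each gap, so it is a sketch of the same heuristic rather than a more general spike detector.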

Finally, replace the rows in df with the matching rows from df_filter2 (matched by Time_Stamp):

df = df.set_index("Time_Stamp") 
df_filter2 = df_filter2.set_index("Time_Stamp") 
df.loc[df_filter2.index] = df_filter2
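An alternative to the `.loc` assignment, not part of the original answer, is `DataFrame.update`, which overwrites values in place using the non-NA values of another frame, aligned on index and columns. A minimal sketch, with made-up data and `edited` standing in for df_filter2:

```python
import pandas as pd

# Stand-in for the main DataFrame (values assumed for illustration).
df = pd.DataFrame({
    "Time_Stamp": pd.to_datetime([
        "2017-06-22 05:11:01",
        "2017-06-22 05:11:02",
        "2017-06-22 05:11:03",
    ]),
    "AC_Input_Current": [2.3, 15.2, 2.3],
    "DC_Output_Voltage": [0.7, 30.3, 0.7],
})

# Edited copy of the middle row.
edited = df.loc[[1]].copy()
edited["AC_Input_Current"] = 0.0

# update() aligns on the index, so both frames are indexed by Time_Stamp.
df = df.set_index("Time_Stamp")
df.update(edited.set_index("Time_Stamp"))  # modifies df in place, returns None

df = df.reset_index()
print(df)
```

One difference to keep in mind: `update` skips NA values in the edited frame, so it cannot be used to write missing values back, whereas the `.loc` assignment replaces the rows as they are.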

We can check to make sure the replacement happened:

assert(all(df.AC_Input_Current.sort_values()[:5].values == df_filter2.AC_Input_Current.values))
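Another way to check, sketched here under the assumption that a copy of the frame from before the replacement is still available, is `DataFrame.compare`, which lists only the cells that differ. The frames `before` and `after` and their values are made up for illustration.

```python
import pandas as pd

index = pd.DatetimeIndex(
    ["2017-06-22 05:11:01", "2017-06-22 05:11:02", "2017-06-22 05:11:03"],
    name="Time_Stamp",
)

# State of the data before the replacement (assumed values).
before = pd.DataFrame(
    {"AC_Input_Current": [2.3, 15.2, 2.3], "DC_Output_Voltage": [0.7, 30.3, 0.7]},
    index=index,
)

# State after the replacement: one current reading was set to 0.0.
after = before.copy()
after.loc[pd.Timestamp("2017-06-22 05:11:02"), "AC_Input_Current"] = 0.0

# compare() keeps only rows and columns with differences; "self" holds the
# value from `before` and "other" the value from `after`.
changed = before.compare(after)
print(changed)
```

If the rows with AC_Input_Current equal to 2.3 do not show up here, that is expected: in the sample data the edited rows are the high-current readings, not the 2.3 ones.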

Source

2017-08-25 05:02:02

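Instead of using 'pd.read_clipboard', if I were to use 'read_csv' as 'df = df.read_csv('MainD2.csv', sep=',', parse_dates=["Time_Stamp"])', how would I change 'df_filter2'? Sorry, I'm still learning Python, so yeah..

Your code already generates 'df_filter2' once you have 'df'. Just use that.

I tried the code you gave and checked the data in 'df' to see whether the rows from 'df_filter2' whose 'AC_Input_Current' value was set to 0 are in 'df'. Apparently they are not, because the 'AC_Input_Current' value is still '2.3'.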
