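I have the code below, which should run as-is for you. I want to first add up row values and then drop duplicate rows based on a subset of labels.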
import pandas as pd
import urllib.request
import numpy as np
url="https://www.misoenergy.org/Library/Repository/Market%20Reports/20170811_da_bc.xls"
cnstxls = urllib.request.urlopen(url)
xl = pd.ExcelFile(cnstxls)
df = xl.parse("Sheet1",skiprows=3)
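# .copy() avoids a SettingWithCopyWarning when the 'Class' column is added below
constr = df.iloc[:, 1:7].copy()
# Hours 1-6 are labelled Offpeak; every other hour is Onpeak
constr['Class'] = np.where(constr['Hour of Occurrence'].isin([1, 2, 3, 4, 5, 6]), 'Offpeak', 'Onpeak')
summary = pd.pivot_table(constr, index=['Constraint_ID', 'Constraint Name'], values='Shadow Price', columns=['Class'], aggfunc='sum')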
Here is part of the output produced by the code above:
Class                                             Offpeak  Onpeak
Constraint_ID Constraint Name
1049          EAU CLA TR9 FLO EAU CLR XF10         -46.52 -364.68
1607          OTTUMWA-WAPELLO FLO HILLS-MONTEZUM    -2.60 -237.36
285770        DKSN-MATTHSON FLO BELFLD-CHRLIE CK      NaN  -59.53
              MATTHSON MATTHDKSN_11_1 1 WAU34028    43.66     NaN
              MATTHSON_MATTHDKSN_11_1_1_LN           6.55     NaN
287090        BAKER2_TR11_TR11_XF                   11.78    1.63
289484        BAKER2 TR12 TR12 WAUMDU13               NaN   -4.52
              BAKER2_TR12_TR12_XF                  -10.41     NaN
I want to achieve the following:
1) Add the values in the Offpeak and Onpeak columns where the Constraint_ID is the same. For example, Constraint_ID=285770 has three different Constraint Names with corresponding values.
2) Drop duplicate Constraint_IDs, keeping the first Constraint Name.
3) Create a third column that adds Offpeak and Onpeak.
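The three steps above can be sketched roughly as follows. This is only a minimal sketch, not a confirmed solution: it rebuilds a small stand-in for `summary` from a few rows of the output shown earlier, it assumes `NaN` should count as zero when summing (which is what pandas' `sum` does by default), and the names `flat`, `result` and `Total` are illustrative and do not appear in the original code.

```python
import numpy as np
import pandas as pd

# Stand-in for `summary`, built from rows copied out of the output above.
index = pd.MultiIndex.from_tuples(
    [
        (1049, "EAU CLA TR9 FLO EAU CLR XF10"),
        (285770, "DKSN-MATTHSON FLO BELFLD-CHRLIE CK"),
        (285770, "MATTHSON MATTHDKSN_11_1 1 WAU34028"),
        (285770, "MATTHSON_MATTHDKSN_11_1_1_LN"),
        (287090, "BAKER2_TR11_TR11_XF"),
    ],
    names=["Constraint_ID", "Constraint Name"],
)
summary = pd.DataFrame(
    {
        "Offpeak": [-46.52, np.nan, 43.66, 6.55, 11.78],
        "Onpeak": [-364.68, -59.53, np.nan, np.nan, 1.63],
    },
    index=index,
)

# Move both index levels into ordinary columns so they can be aggregated.
flat = summary.reset_index()

# Steps 1 and 2: one row per Constraint_ID, summing the prices
# (NaN is skipped, i.e. treated as zero) and keeping the first name.
result = flat.groupby("Constraint_ID", sort=False).agg(
    {"Constraint Name": "first", "Offpeak": "sum", "Onpeak": "sum"}
)

# Step 3: hypothetical 'Total' column adding Offpeak and Onpeak.
result["Total"] = result["Offpeak"] + result["Onpeak"]

print(result)
```

If an ID whose values are all `NaN` should stay `NaN` rather than become 0, the sums would need `min_count=1`, for instance via a lambda passed to `agg`.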
Any help is much appreciated.
What is the desired output? Does '1)' account for the 'NaN's? – jezrael