2017-03-29 223 views

Pandas: groupby on some data

I have a DataFrame:

datetime city state country shape duration (seconds) duration (hours/min) comments date posted latitude longitude 
10/10/1949 20:30 san marcos tx us cylinder 2700 45 minutes This event took place in early fall around 1949-50. It occurred after a Boy Scout meeting in the Baptist Church. The Baptist Church sit 4/27/2004 29.8830556 -97.9411111 
10/10/1949 21:00 lackland afb tx  light 7200 1-2 hrs 1949 Lackland AFB&#44 TX. Lights racing across the sky & making 90 degree turns on a dime. 12/16/2005 29.38421 -98.581082 
10/10/1955 17:00 chester (uk/england)  gb circle 20 20 seconds Green/Orange circular disc over Chester&#44 England 1/21/2008 53.2 -2.916667 
10/10/1956 21:00 edna tx us circle 20 1/2 hour My older brother and twin sister were leaving the only Edna theater at about 9 PM&#44...we had our bikes and I took a different route home 1/17/2004 28.9783333 -96.6458333 
10/10/1960 20:00 kaneohe hi us light 900 15 minutes AS a Marine 1st Lt. flying an FJ4B fighter/attack aircraft on a solo night exercise&#44 I was at 50&#44000&#39 in a "clean" aircraft (no ordinan 1/22/2004 21.4180556 -157.8036111 

I am trying to group by state, using:

result = (
    df.groupby("state")
    .agg({"state": pd.Series.nunique, "duration (seconds)": np.sum})
    .rename(columns={"state": "frequency", "duration (seconds)": "whole time"})
    .reset_index()
)

But it raises TypeError: must be str, not float. I tried to convert duration (seconds), but it returns duration (seconds). How can I track down this problem?


What actually raises the error? Does df.groupby("state").agg({"state": pd.Series.nunique}) work? (i.e., half of your groupby) – Stael


We don't know where the error comes from. Post the whole error, bro. –

Answer


Do something like this:

# Group df by state, then sum the "duration (seconds)" column within each group
df.groupby('state')['duration (seconds)'].apply(lambda x: x.sum())
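
The TypeError: must be str, not float in the question is consistent with duration (seconds) being stored as an object column that mixes strings and floats, although the question does not confirm this. Below is a minimal sketch under that assumption: it inspects the Python types in the column, coerces the values to numbers, lists the rows that could not be parsed, and then aggregates. The sample frame and the bad value are made up for illustration. Treating "frequency" as the number of rows per state is also an assumption; counting unique values of the grouping key inside its own group could only ever give 1.

```python
import pandas as pd

# Hypothetical sample data; the value "2`" stands in for whatever
# non-numeric entry the real column may contain.
df = pd.DataFrame({
    "state": ["tx", "tx", "hi", "tx"],
    "duration (seconds)": ["2700", 7200.0, "900", "2`"],
})

col = "duration (seconds)"

# 1. Which Python types does the column actually hold?
type_counts = df[col].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# 2. Coerce to numbers; entries that cannot be parsed become NaN.
numeric = pd.to_numeric(df[col], errors="coerce")

# Rows that had a value but could not be converted.
bad_rows = df[numeric.isna() & df[col].notna()]
print(bad_rows)

# 3. Aggregate on the cleaned column: row count and total duration per state.
#    The sum skips NaN values by default.
result = (
    df.assign(**{col: numeric})
      .groupby("state")[col]
      .agg(**{"frequency": "size", "whole time": "sum"})
      .reset_index()
)
print(result)
```

If bad_rows comes back empty on the real data, the column is already clean and the error has a different cause, which is where the full traceback requested in the comments would help.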

Or, if you want a rolling sum:

df.groupby('state')['duration (seconds)'].apply(lambda x:x.rolling(center=False,window=2).sum())
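
To make the shape of the rolling result concrete, here is a small sketch on made-up data, assuming the column is already numeric. It uses groupby(...).rolling(...), which should compute the same per-group windows as the lambda version; the result is indexed by state and by the original row label.

```python
import pandas as pd

# Hypothetical sample data, already numeric.
df = pd.DataFrame({
    "state": ["tx", "tx", "tx", "hi", "hi"],
    "duration (seconds)": [2700.0, 7200.0, 20.0, 900.0, 100.0],
})

# Sum of each row and the row before it, computed separately per state.
rolled = (
    df.groupby("state")["duration (seconds)"]
      .rolling(window=2)
      .sum()
)
print(rolled)
```

The first row of each state is NaN because its window holds only one value; passing min_periods=1 to rolling would return the single value there instead.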