Python: combine rows in a DataFrame and add up the values
I have a DataFrame:
Type: Volume:
Q 10
Q 20
T 10
Q 10
T 20
T 20
Q 10
I want to combine the T rows into a single row and add up their volumes, but only where two (or more) Ts are consecutive,
i.e.:
Q 10
Q 20
T 10
Q 10
T 20+20=40
Q 10
Is there any way to achieve this? Would DataFrame.groupby work?
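I think this will help. This code handles any number of consecutive 'T's, and you can even change the character to be combined. I have added comments to the code to explain what it does.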
import pandas as pd

def combine(df):
    combined = []  # Init empty list
    length = len(df.iloc[:, 0])  # Get the number of rows in the DataFrame
    i = 0
    while i < length:
        num_elements = num_elements_equal(df, i, 0, 'T')  # Get the number of consecutive 'T's
        if num_elements <= 1:  # At most one 'T': append only that row to combined, with the same type
            combined.append([df.iloc[i, 0], df.iloc[i, 1]])
        else:  # Otherwise, append the sum of all the elements to combined, with 'T' type
            combined.append(['T', sum_elements(df, i, i + num_elements, 1)])
        i += max(num_elements, 1)  # Increment i by the number of rows combined, with a min increment of 1
    return pd.DataFrame(combined, columns=df.columns)  # Return as DataFrame

def num_elements_equal(df, start, column, value):  # Counts the consecutive elements equal to value
    i = start
    num = 0
    while i < len(df.iloc[:, column]):
        if df.iloc[i, column] == value:
            num += 1
            i += 1
        else:
            return num
    return num

def sum_elements(df, start, end, column):  # Sums the elements from start (inclusive) to end (exclusive)
    return sum(df.iloc[start:end, column])

frame = pd.DataFrame({"Type": ["Q", "Q", "T", "Q", "T", "T", "Q"],
                      "Volume": [10, 20, 10, 10, 20, 20, 10]})
print(combine(frame))
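Thank you very much for your reply. If my DataFrame has more than two columns, how can I change this code? I only want to add up the values of one column and keep the rest unchanged, i.e. instead of 'Type' and 'Volume' I have 'Type', 'Time', 'Volume', etc., and I only want to sum the 'Volume' values. – bing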
Wherever elements are appended to the combined list, add `df.iloc[i, col]`, where col is the column index of the 'Time' column. `combined.append([df.iloc[i, 0], df.iloc[i, 1]])` becomes `combined.append([df.iloc[i, 0], df.iloc[i, 1], df.iloc[i, 2]])`, and `combined.append(['T', sum_elements(df, i, i + num_elements, 1)])` becomes `combined.append(['T', df.iloc[i, 1], sum_elements(df, i, i + num_elements, 2)])`. – coolioasjulio
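The change described in the comment above can be sketched as follows. This is only an illustration under assumptions: the 'Time' column and its values are made up for the example, the column order is assumed to be 'Type', 'Time', 'Volume', and `count_consecutive` is a renamed, shortened stand-in for `num_elements_equal`. A merged row keeps the 'Time' of the first row in its run.

```python
import pandas as pd

def count_consecutive(df, start, column, value):
    """Count rows equal to `value` in `column`, starting at row `start`."""
    num = 0
    while start + num < len(df) and df.iloc[start + num, column] == value:
        num += 1
    return num

def combine(df):
    combined = []
    i = 0
    while i < len(df):
        n = count_consecutive(df, i, 0, "T")
        if n <= 1:
            # Single row: copy all three columns unchanged
            combined.append([df.iloc[i, 0], df.iloc[i, 1], df.iloc[i, 2]])
        else:
            # Run of T rows: keep the first 'Time', sum only 'Volume' (column 2)
            combined.append(["T", df.iloc[i, 1], df.iloc[i:i + n, 2].sum()])
        i += max(n, 1)
    return pd.DataFrame(combined, columns=df.columns)

# Hypothetical data: the 'Time' values are invented for illustration
frame = pd.DataFrame({
    "Type": ["Q", "Q", "T", "Q", "T", "T", "Q"],
    "Time": ["09:00", "09:01", "09:02", "09:03", "09:04", "09:05", "09:06"],
    "Volume": [10, 20, 10, 10, 20, 20, 10],
})
result = combine(frame)
print(result)
```

With more columns, the same pattern applies: copy every column except the summed one from the first row of the run.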
https://stackoverflow.com/questions/46099924/how-to-combine-consecutive-data-in-a-dataframe-and-add-up-value – bing
If you only need the partial sums, here is a little trick to do it:
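import numpy as np
import pandas as pd

df = pd.DataFrame({"Type": ["Q", "Q", "T", "Q", "T", "T", "Q"],
                   "Volume": [10, 20, 10, 10, 20, 20, 10]})
# +1 where a run of T starts, -1 where it ends, 0 elsewhere
s = np.diff(np.r_[0, df.Type == "T"])
s[s < 0] = 0  # keep only the run starts
# cumsum numbers the runs; the keys are passed as a list because newer pandas versions treat a tuple as a single key
res = df.groupby(["Type", np.cumsum(s) - 1]).sum().loc["T"]
print(res)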
Output:
   Volume
0      10
1      40
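The trick above returns only the sums for the T runs. If the full frame from the question is wanted, one possible approach (a sketch, not part of the original answer) is to give every row its own run label, except that a T row directly following another T row reuses the previous label, and then group on that label. The names `is_t`, `run` and `result` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Type": ["Q", "Q", "T", "Q", "T", "T", "Q"],
                   "Volume": [10, 20, 10, 10, 20, 20, 10]})

is_t = df["Type"].eq("T")
# A row continues the previous run only if it is T and the row before it is T
continues_run = is_t & is_t.shift(fill_value=False)
# Every row that does not continue a run starts a new one
run = (~continues_run).cumsum().rename("run")

result = (
    df.groupby(run, sort=False)
      .agg({"Type": "first", "Volume": "sum"})
      .reset_index(drop=True)
)
print(result)
```

Because Q rows never continue a run, each of them ends up in a group of its own and passes through unchanged, so `DataFrame.groupby` does work here once the run labels are built.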
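This looks like it might be a start on solving your problem: https://stackoverflow.com/a/45679091/4365003 – RagingRoosevelt
I think this is a bit different... I want to merge the rows rather than count them – bing
~~Wouldn't you just use a different aggregation function, then?~~ – RagingRoosevelt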