Python: getting a column of values by comparing different DataFrame columns

I need some help: I am trying to get a value by comparing columns from different DataFrames.

At first I tried a for loop, but I have millions of rows, so it takes far too long. Now I would like to use numpy.where, as follows.

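I have two DataFrames:
- df1, where every row is distinct (the ID column is a unique primary key) -> df1['ID', 'status', 'boolean']
- df2, which contains several rows, each distinct from the others -> df2['code', 'segment', 'value']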

Now I need to create a new column named 'weight' in df1.

I want to build the 'weight' column like this:

df1['weight'] = numpy.where(df1['boolean'] == 1, df2[ (df2['code']==df1['ID']) & (df2['segment']==df1['status'])] ['value'], 0)

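The columns 'code' + 'segment' together form a unique key, so the lookup returns one and only one value.

Running the program raises this error: "ValueError: Can only compare identically-labeled Series objects"

Can anyone help me understand what is wrong?

Thanks.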

Source

2017-08-03 josè

One moment, I'll give you some examples...

You can do this with a left join.

Something like this might work; without sample data I can't check the details.

df_merged = df1.join(df2.set_index(['code', 'segment']), how='left', on=['ID', 'status']) 
df1['weight'] = df_merged['value'].reindex(df1.index).fillna(0)

I could not test it, but the set_index() call is needed because of how the join documentation describes the on parameter:

on : column name, tuple/list of column names, or array-like 
Column(s) in the caller to join on the index in other, otherwise joins index-on-index. If multiples columns given, the passed DataFrame must have a MultiIndex. Can pass an array as the join key if not already contained in the calling DataFrame. Like an Excel VLOOKUP operation
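For reference, here is a minimal runnable sketch of the left-join idea. The sample rows are invented for illustration and are not from the question. It also applies the boolean == 1 condition from the question with numpy.where, which the two lines above leave out. The original expression fails because df2['code'] == df1['ID'] compares two Series whose indexes differ, and pandas refuses to compare those element-wise; the join sidesteps that comparison entirely.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "status": ["a", "b", "a", "c"],
    "boolean": [1, 0, 1, 1],
})
df2 = pd.DataFrame({
    "code": [1, 2, 3],
    "segment": ["a", "b", "a"],
    "value": [10.0, 20.0, 30.0],
})

# Left join: each (ID, status) pair in df1 is looked up in df2's
# (code, segment) MultiIndex. Pairs with no match get NaN in 'value'.
lookup = df2.set_index(["code", "segment"])
df_merged = df1.join(lookup, on=["ID", "status"], how="left")

# Keep the looked-up value only where boolean == 1; everything else,
# including unmatched pairs, becomes 0.
df1["weight"] = np.where(df1["boolean"] == 1, df_merged["value"].fillna(0), 0)
print(df1)
```

Because (code, segment) is assumed to be unique, the join returns exactly one row per row of df1, so the merged column lines up with df1 positionally. If that key were not unique, the join would duplicate rows and the assignment would no longer be safe.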

Source

2017-08-03 14:55:36
