Pandas: compare dataframes based on a condition and return the rows in the set

I have two dataframes:

[in] print(testing_df.head(n=5)) 
print(product_combos1.head(n=5)) 

[out]
                     product_id  length
transaction_id
001                      (P01,)       1
002                  (P01, P02)       2
003             (P01, P02, P09)       3
004                  (P01, P03)       2
005             (P01, P03, P05)       3

             product_id  count  length
0            (P06, P09)  36340       2
1  (P01, P05, P06, P09)  10085       4
2            (P01, P06)  36337       2
3            (P01, P09)  49897       2
4            (P02, P09)  11573       2

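For each row of testing_df, I want to return the product_combos row that is one item longer than that row (length + 1), contains the row's products, and has the highest frequency. For example, for transaction_id 001 I want to return product_combos[3] (only the P09 part, though).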

For the first part (making the comparison based purely on length) I tried:

# Return the product combos values that are of the appropriate length and the strings match
for i in testing_df['length']:
    for k in product_combos1['length']:
        if (i)+1 == (k):
            matches = list(k)

However, this returns the error:

TypeError: 'numpy.int64' object is not iterable

2017-08-05 zsad512

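You can't build a list from a non-iterable like this: k is a single numpy.int64, and list() expects something it can iterate over. Try replacing matches = list(k) with matches = [k]. Also, the parentheses are redundant; you can replace if (i)+1 == (k): with if i + 1 == k:.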
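To make the cause of the error concrete, here is a minimal sketch. The Series named lengths is a made-up stand-in for product_combos1['length']; it is not data from the question.

```python
import pandas as pd

# Hypothetical stand-in for product_combos1['length']
lengths = pd.Series([2, 4, 2])

# Iterating over (or indexing into) an integer column yields scalars,
# here a single numpy.int64 rather than a sequence
k = lengths.iloc[0]

try:
    list(k)  # list() needs an iterable, so a scalar raises TypeError
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping the scalar in brackets builds a one-item list instead
matches = [k]

print(error_message)
print(matches)
```

Note that matches = [k] still overwrites matches on every pass through the loop, so only the last match survives.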

2017-08-05 16:46:59 vahndi

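Just use the .append() method. I'd also suggest initializing matches as an empty list at the top, so that you don't get duplicates when you re-run the cell.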

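# Setup

import pandas as pd

testing_df = pd.DataFrame(columns=['product_id', 'length'])
testing_df['product_id'] = [('P01',), ('P01', 'P02')]
testing_df['length'] = [1, 2]
product_combos1 = pd.DataFrame(columns=['product_id', 'count', 'length'])
product_combos1['length'] = [3, 1]
product_combos1['product_id'] = [('P01',), ('P01', 'P02')]
# Bracket assignment is required here: product_combos1.count is the
# DataFrame.count method, so attribute assignment would not set the column
product_combos1['count'] = [100, 5000]

# Matching

matches = []  # reset on every run so re-running the cell adds no duplicates
for i in testing_df['length']:
    for k in product_combos1['length']:
        if i + 1 == k:
            matches.append(k)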

Let me know if this works, or if there's anything else! Good luck!
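If the goal is to get the matching rows rather than the matching lengths, the same length check can also be written without explicit loops. This is only a sketch: the column names follow the question, but the data is a small illustrative sample, and the third combo row with count 500 is made up.

```python
import pandas as pd

testing_df = pd.DataFrame({
    'product_id': [('P01',), ('P01', 'P02')],
    'length': [1, 2],
})
product_combos1 = pd.DataFrame({
    'product_id': [('P01', 'P09'),
                   ('P01', 'P05', 'P06', 'P09'),
                   ('P01', 'P02', 'P09')],  # last row is invented for the example
    'count': [49897, 10085, 500],
    'length': [2, 4, 3],
})

# Lengths a combo must have to be one item longer than some transaction
wanted_lengths = testing_df['length'] + 1

# Boolean indexing keeps only the combo rows whose length is wanted
length_matches = product_combos1[product_combos1['length'].isin(wanted_lengths)]

print(length_matches)
```

This filters on length only; it does not yet check that a combo contains the transaction's products.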

2017-08-05 16:49:48 CalendarJ

Thanks, but unfortunately this didn't work; I was able to solve the problem with another approach, though. – zsad512

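Sorry to hear that! It ran fine in my notebook with the example setup I gave. Glad to hear you were able to solve it! When you get a chance, please remember to post it as an answer so that others who come to this post can refer to it. – CalendarJ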
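The asker's own solution was never posted in this thread. For readers with the same problem, the following is one possible sketch of the full task as described in the question: for each transaction, pick the most frequent combo that is exactly one item longer and contains all of the transaction's products, then return the extra product. The helper name best_extension is hypothetical, and the data is the first two transactions and the five combos shown in the question.

```python
import pandas as pd

testing_df = pd.DataFrame(
    {'product_id': [('P01',), ('P01', 'P02')], 'length': [1, 2]},
    index=pd.Index(['001', '002'], name='transaction_id'),
)
product_combos1 = pd.DataFrame({
    'product_id': [('P06', 'P09'), ('P01', 'P05', 'P06', 'P09'),
                   ('P01', 'P06'), ('P01', 'P09'), ('P02', 'P09')],
    'count': [36340, 10085, 36337, 49897, 11573],
    'length': [2, 4, 2, 2, 2],
})


def best_extension(basket, combos):
    """Hypothetical helper: return the extra product(s) from the most
    frequent combo that contains `basket` and is one item longer,
    or None if no such combo exists."""
    basket_set = set(basket)
    is_candidate = (
        (combos['length'] == len(basket) + 1)
        & combos['product_id'].apply(lambda combo: basket_set.issubset(combo))
    )
    candidates = combos[is_candidate]
    if candidates.empty:
        return None
    # Row with the highest frequency among the candidates
    best = candidates.loc[candidates['count'].idxmax()]
    return tuple(p for p in best['product_id'] if p not in basket_set)


# Map each transaction_id to its suggested extra product(s)
suggestions = {
    transaction_id: best_extension(basket, product_combos1)
    for transaction_id, basket in testing_df['product_id'].items()
}
print(suggestions)
```

For transaction 001 this selects the (P01, P09) combo and returns only P09, which is the result the question asks for. Applying a Python function row by row is simple but slow on large frames; for bigger data a merge-based approach may be worth considering.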
