2017-01-26

User-by-item matrix in pandas

I am working on a recommender system. I built a user-by-item matrix by following this. However, I ran into the error IndexError: index 8928358160 is out of bounds for axis 0 with size 5.

Here is a sample of the dataset.

import pandas as pd 
import numpy as np 

df = pd.read_csv('APRIL.csv') 
df = df.drop(columns=['BASKETID'])
df = df.head(10) 
df 
Out[89]: 
     MEMBERID       SKU  QTY
0  8928358161  37101163    2
1  8928358161  36618858    1
2  8928358161  40855129    1
3  8933444371  35010078    1
4  8932505053  36335949    1
5  8932505053  92100668    1
6  8932505053  36529730    2
7  8921161362  61814893    1
8  8915688100  34732853    1
9  8915688100  35122457    1


n_users = df.MEMBERID.unique().shape[0] 
n_items = df.SKU.unique().shape[0] 
print(str(n_users) + ' users')
print(str(n_items) + ' items')
5 users 
10 items 

ratings = np.zeros((n_users, n_items)) 
for row in df.itertuples(): 
    ratings[row[1]-1, row[2]-1] = row[3] 
ratings 
--------------------------------------------------------------------------- 
IndexError        Traceback (most recent call last) 
<ipython-input-92-0a393963bf4c> in <module>() 
     1 ratings = np.zeros((n_users, n_items)) 
     2 for row in df.itertuples(): 
----> 3  ratings[row[1]-1, row[2]-1] = row[3] 
     4 ratings 

IndexError: index 8928358160 is out of bounds for axis 0 with size 5 

I still don't understand where index 8928358160 comes from.

Answers

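Why not convert the values to strings? Even though an ID is an integer, it may be read in scientific notation and end up stored as a float.

Try this:

Convert cust_id and item_number from float values to strings: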

mergedfinal['cust_id'] = mergedfinal['cust_id'].astype(str) 
mergedfinal['item_number'] = mergedfinal['item_number'].astype(str) 
mergedfinal['SKU'] = mergedfinal['SKU'].astype(str) 

mergedfinal is my DataFrame.
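
For reference, the index in the error message is the first MEMBERID minus one. In df.itertuples(), row[0] is the DataFrame index and row[1] is MEMBERID, so row[1]-1 evaluates to 8928358161 - 1 = 8928358160, which is then used as a row position in a 5-row array. Whatever dtype the IDs have, they still need to be mapped to positions 0..n-1 before they can index the matrix. Below is a minimal sketch of one way to do that with pd.factorize, using the ten sample rows from the question. The names user_idx, item_idx, user_ids, and item_ids are my own, and this is not necessarily what the linked tutorial does.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (BASKETID already dropped).
df = pd.DataFrame({
    "MEMBERID": [8928358161, 8928358161, 8928358161, 8933444371, 8932505053,
                 8932505053, 8932505053, 8921161362, 8915688100, 8915688100],
    "SKU": [37101163, 36618858, 40855129, 35010078, 36335949,
            92100668, 36529730, 61814893, 34732853, 35122457],
    "QTY": [2, 1, 1, 1, 1, 1, 2, 1, 1, 1],
})

# pd.factorize gives each distinct ID a position 0..n-1 (in order of first
# appearance) and also returns the distinct IDs themselves.
user_idx, user_ids = pd.factorize(df["MEMBERID"])
item_idx, item_ids = pd.factorize(df["SKU"])

# Rows are users, columns are items, values are QTY.
ratings = np.zeros((len(user_ids), len(item_ids)))
# Plain assignment mirrors the loop in the question: if a (user, item) pair
# repeats, a later row overwrites an earlier one instead of being summed.
ratings[user_idx, item_idx] = df["QTY"].to_numpy()

print(ratings.shape)
```

user_ids[i] and item_ids[j] give back the original MEMBERID and SKU for row i and column j, so the mapping can be reversed when reading recommendations out of the matrix.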