2016-12-24 51 views
0

Why is my DataFrame looking up columns by position instead of by column name?

My DataFrame matrix looks like this:

            rating
id        10153337 10183250 10220967 ... 99808270 99816554 99821259
user_id                              ...
10003869       NaN      8.0      NaN ...      NaN      NaN      NaN
10022889       NaN      NaN      3.0 ...      NaN      1.0      NaN

I can't get the column I need, because selecting it raises an "indices are out-of-bounds" error:

specificID = ratings_matrix[[99816554]] 
... 
    raise IndexError("indices are out-of-bounds") 
IndexError: indices are out-of-bounds 

Why doesn't it look the column up by the value I give it?

Some runnable code:

from io import StringIO

import pandas as pd

# Wrapping the JSON text in StringIO keeps read_json working on newer pandas,
# which no longer accepts a literal JSON string.
ratings = pd.read_json(
    StringIO(''.join(
        ['{"columns":["id","rating","user_id"],"index":[0,1,2],"data":[[',
         '67728134,4,10003869],[57495823,9,10060085],[99816554,1,10022889]]}']
    )), orient='split')

ratings
ratings.dtypes

ratings_matrix = ratings.pivot_table(index=['user_id'], columns=['id'], values=['rating'])
ratings_matrix.columns.map(type)
ratings_matrix[[67728134]]  # here: the key is not matched against the column labels

It helps if you provide a self-contained, runnable example that demonstrates the problem. – BrenBarn

Answer

4

Note that when you create the pivot, you pass a list to the values parameter:

ratings_matrix = ratings.pivot_table(# |<--- here --->| 
    index=['user_id'], columns=['id'], values=['rating']) 

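This tells pandas to create a pd.MultiIndex for the columns. That is why the result has a rating level sitting above the id columns, and why a bare id such as 99816554 is not a valid column label on its own.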
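A minimal sketch of how to check this yourself, assuming the same three sample rows as in the question (built here directly as a DataFrame rather than from JSON). The names list_matrix and scalar_matrix are only illustrative:

```python
import pandas as pd

# Same sample rows as the question, constructed directly for brevity.
ratings = pd.DataFrame({
    "id": [67728134, 57495823, 99816554],
    "rating": [4, 9, 1],
    "user_id": [10003869, 10060085, 10022889],
})

# values passed as a list: the columns get an extra outer level.
list_matrix = ratings.pivot_table(
    index=["user_id"], columns=["id"], values=["rating"])

# values passed as a plain string: the columns stay flat.
scalar_matrix = ratings.pivot_table(
    index=["user_id"], columns=["id"], values="rating")

print(list_matrix.columns.nlevels)    # number of column levels with a list
print(scalar_matrix.columns.nlevels)  # number of column levels with a string
print(list_matrix.columns.tolist())   # each label is a ('rating', id) tuple
```

Inspecting columns.nlevels or columns.tolist() before indexing is a quick way to see which kind of key the frame expects.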


Option 1
Use the MultiIndex

specificID = ratings_matrix[[('rating', 99816554)]] 
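For comparison, here is a small sketch of a few equivalent ways to address a column through the MultiIndex, assuming the question's sample data. The variable names as_frame, as_series, and by_level are illustrative only:

```python
import pandas as pd

ratings = pd.DataFrame({
    "id": [67728134, 57495823, 99816554],
    "rating": [4, 9, 1],
    "user_id": [10003869, 10060085, 10022889],
})
ratings_matrix = ratings.pivot_table(
    index=["user_id"], columns=["id"], values=["rating"])

# A list holding the full tuple key returns a one-column DataFrame.
as_frame = ratings_matrix[[("rating", 99816554)]]

# The bare tuple key returns that column as a Series.
as_series = ratings_matrix[("rating", 99816554)]

# Selecting the outer level first leaves flat id columns to index into.
by_level = ratings_matrix["rating"][99816554]

print(as_frame.shape)
print(as_series.loc[10022889], by_level.loc[10022889])
```

Which form is preferable depends on whether a DataFrame or a Series is wanted downstream.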

Option 2
Don't create the MultiIndex

ratings_matrix = ratings.pivot_table(# see what I did? 
    index=['user_id'], columns=['id'], values='rating') 

Then

specificID = ratings_matrix[[99816554]] 
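If the matrix has already been built with values as a list, one further possibility (not part of the answer above, just a sketch under the same sample data) is to drop the outer column level afterwards so that plain ids work as labels. The name flat is hypothetical:

```python
import pandas as pd

ratings = pd.DataFrame({
    "id": [67728134, 57495823, 99816554],
    "rating": [4, 9, 1],
    "user_id": [10003869, 10060085, 10022889],
})
ratings_matrix = ratings.pivot_table(
    index=["user_id"], columns=["id"], values=["rating"])

# Work on a copy so the original two-level frame is left untouched.
flat = ratings_matrix.copy()
# Remove the outer 'rating' level, keeping only the id level.
flat.columns = flat.columns.droplevel(0)

specificID = flat[[99816554]]
print(specificID)
```

This only makes sense when the outer level holds a single value, as it does here; with several value columns, dropping it would produce duplicate labels.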

Setup

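from io import StringIO

import pandas as pd

# Named ratings (not df) so the pivot_table calls below can use it directly.
ratings = pd.read_json(
    StringIO(''.join(
        ['{"columns":["id","rating","user_id"],"index":[0,1,2],"data":[[',
         '67728134,4,10003869],[57495823,9,10060085],[99816554,1,10022889]]}']
    )), orient='split'
)

ratings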


ratings_matrix = ratings.pivot_table(# |<--- here --->| 
    index=['user_id'], columns=['id'], values=['rating']) 
ratings_matrix[[('rating', 67728134)]] 


ratings_matrix = ratings.pivot_table(# see what I did? 
    index=['user_id'], columns=['id'], values='rating') 
ratings_matrix[[67728134]] 



Thank you! This is what I couldn't figure out. Exactly what I needed. – canada11
