2017-08-14 105 views
1

Manipulating a pandas DataFrame into JSON format

I have a pandas DataFrame like this:

User  Category  Rating
1     [1,2,3]   [5,1,3]
2     [3,2,1]   [3,1,1]
3     [1,3,1]   [2,1,4]

I want to write an endpoint that takes a user and returns the list of categories and ratings for that user.

www.endpoint.com/user/1

should return

[{Category: 1, Rating: 5}, {Category: 2, Rating: 1}, {Category: 3, Rating: 3}]

Is there a simple way to do this in pandas?

Answers

1

I would use the following generic function, which explodes lists in columns into rows:

import numpy as np
import pandas as pd


def explode(df, lst_cols, fill_value=''):
    # make sure `lst_cols` is a list
    if lst_cols and not isinstance(lst_cols, list):
        lst_cols = [lst_cols]
    # all columns except `lst_cols`
    idx_cols = df.columns.difference(lst_cols)

    # calculate lengths of lists (taken from the first list column)
    lens = df[lst_cols[0]].str.len()

    # repeat each non-list value once per list element,
    # then flatten the list columns alongside them
    res = pd.DataFrame({
        col: np.repeat(df[col].values, lens)
        for col in idx_cols
    }).assign(**{col: np.concatenate(df[col].values) for col in lst_cols})

    if not (lens > 0).all():
        # at least one list in cells is empty: add those rows back
        # (pd.concat replaces DataFrame.append, removed in pandas 2.0)
        res = pd.concat([res, df.loc[lens == 0, idx_cols]]).fillna(fill_value)

    return res.loc[:, df.columns]

Demo:

In [88]: df 
Out[88]: 
   User   Category     Rating
0     1  [1, 2, 3]  [5, 1, 3]
1     2  [3, 2, 1]  [3, 1, 1]
2     3  [1, 3, 1]  [2, 1, 4]

In [89]: cols = ['Category','Rating'] 

In [90]: x = explode(df, cols) 

In [91]: x 
Out[91]: 
   User  Category  Rating
0     1         1       5
1     1         2       1
2     1         3       3
3     2         3       3
4     2         2       1
5     2         1       1
6     3         1       2
7     3         3       1
8     3         1       4

Now we can easily do what you need:

In [92]: x.loc[x.User == 1, cols].to_dict('records')
Out[92]: 
[{'Category': '1', 'Rating': '5'}, 
{'Category': '2', 'Rating': '1'}, 
{'Category': '3', 'Rating': '3'}] 
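
As an alternative to the custom function above, newer pandas versions ship a built-in `DataFrame.explode` that accepts a list of columns (pandas 1.3 or later). The sketch below is not the answer's original method; it assumes every row's `Category` and `Rating` lists have the same length, and the helper name `records_for_user` is hypothetical.

```python
import pandas as pd

# Sample data mirroring the frame in the question.
df = pd.DataFrame({
    "User": [1, 2, 3],
    "Category": [[1, 2, 3], [3, 2, 1], [1, 3, 1]],
    "Rating": [[5, 1, 3], [3, 1, 1], [2, 1, 4]],
})


def records_for_user(frame, user):
    """Hypothetical helper: list of {Category, Rating} dicts for one user."""
    cols = ["Category", "Rating"]
    # Multi-column explode requires pandas >= 1.3 and equal-length lists per row.
    flat = frame.explode(cols, ignore_index=True)
    # explode leaves the columns as object dtype; cast back to integers.
    flat = flat.astype({"Category": int, "Rating": int})
    return flat.loc[flat["User"] == user, cols].to_dict("records")


print(records_for_user(df, 1))
```

An unknown user simply yields an empty list, which an endpoint could return as-is or turn into a 404.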
0

Here is one way:

In [599]: func = lambda x: [{'Category':v, 'Rating': x.Rating[i]} 
          for i, v in enumerate(x.Category)] 

In [600]: func(df.loc[0]) 
Out[600]: 
[{'Category': 1, 'Rating': 5}, 
{'Category': 2, 'Rating': 1}, 
{'Category': 3, 'Rating': 3}] 

Or, apply it to all rows:

In [598]: df.apply(func, 1).values 
Out[598]: 
array([[{'Category': 1, 'Rating': 5}, {'Category': 2, 'Rating': 1}, 
     {'Category': 3, 'Rating': 3}], 
     [{'Category': 3, 'Rating': 3}, {'Category': 2, 'Rating': 1}, 
     {'Category': 1, 'Rating': 1}], 
     [{'Category': 1, 'Rating': 2}, {'Category': 3, 'Rating': 1}, 
     {'Category': 1, 'Rating': 4}]], dtype=object)
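
Since the endpoint only needs one user at a time, the same pairing idea could be applied to a single row and serialized directly. The following is a minimal sketch, assuming `User` values are unique; the function name `user_payload` is hypothetical and no particular web framework is implied.

```python
import json

import pandas as pd

# Sample data mirroring the frame in the question.
df = pd.DataFrame({
    "User": [1, 2, 3],
    "Category": [[1, 2, 3], [3, 2, 1], [1, 3, 1]],
    "Rating": [[5, 1, 3], [3, 1, 1], [2, 1, 4]],
})


def user_payload(frame, user):
    """Hypothetical helper: JSON string of Category/Rating pairs for one user."""
    match = frame.loc[frame["User"] == user]
    if match.empty:
        # Unknown user: return an empty JSON array.
        return "[]"
    row = match.iloc[0]
    # Pair the two lists element by element; int() keeps values JSON-serializable.
    pairs = zip(row["Category"], row["Rating"])
    return json.dumps([{"Category": int(c), "Rating": int(r)} for c, r in pairs])


print(user_payload(df, 1))
```

This avoids flattening the whole frame on every request, at the cost of assuming one row per user.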