2014-04-03

Group a pandas DataFrame by row names

I have a simple pandas DataFrame with row names and 2 columns, something like the following.

from pandas import DataFrame, Series 
row_names = ['row1', 'row2', 'row2', 'row4'] 
df = DataFrame({'col1': Series([1, 2, 3, 4], index=row_names), 
       'col2': Series([0, 1, 0, 1], index=row_names)}) 

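As in the example above, some row names are repeated. I would like to group the DataFrame by row name so that I can run aggregation operations per group (e.g. count, mean).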

For example, I might want to find out that row1 and row4 each appear once in my df, while row2 appears twice.

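I know about the groupby method, but in the examples I have seen online it only groups by column values, not by row names. Is that right? Should I turn my row names into a column of the DataFrame?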

Answer

Check the docstring (if you are using IPython, that is just df.groupby?<enter>):

Group series using mapper (dict or key function, apply given function 
to group, return result as series) or by a series of columns 

Parameters 
---------- 
by : mapping function/list of functions, dict, Series, or tuple/
    list of column names. 
    Called on each element of the object index to determine the groups. 
    If a dict or Series is passed, the Series or dict VALUES will be 
    used to determine the groups 
axis : int, default 0 
level : int, level name, or sequence of such, default None 
    If the axis is a MultiIndex (hierarchical), group by a particular 
    level or levels 
... 

You want the level argument:

In [20]: df.groupby(level=0).count()
Out[20]:
      col1  col2
row1     1     1
row2     2     2
row4     1     1

[3 rows x 2 columns]
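The same approach should carry over to the other aggregations mentioned in the question. Below is a minimal sketch that reuses the question's df and shows mean and size per row name. It also shows the alternative the question asks about, turning the row names into a column first. The variable names means, sizes and as_column are only illustrative, and the column name 'index' assumes the index is unnamed, as it is here.

```python
from pandas import DataFrame, Series

row_names = ['row1', 'row2', 'row2', 'row4']
df = DataFrame({'col1': Series([1, 2, 3, 4], index=row_names),
                'col2': Series([0, 1, 0, 1], index=row_names)})

# Group by the single-level index and average each column per row name.
means = df.groupby(level=0).mean()

# Number of rows per row name (counts rows, regardless of missing values).
sizes = df.groupby(level=0).size()

# Alternative from the question: move the row names into a column first.
# reset_index() calls the new column 'index' because the index has no name.
as_column = df.reset_index().groupby('index').mean()

print(means)
print(sizes)
```

Both routes give the same per-group values here, so converting the row names to a column is optional rather than required.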