How do I aggregate pandas Series values based on a predicate statement?

In R, it is easy to aggregate values and apply a function (in this case, sum) based on a predicate statement:

> example <- c(a1=1,a2=2,b1=3,b2=4) 
> example # this is the vector (equivalent to Series) 
a1 a2 b1 b2 
1 2 3 4 
> grepl("^a",names(example)) #predicate statement 
[1] TRUE TRUE FALSE FALSE 
> sum(example[grep("^a",names(example))]) #combined into one statement 
[1] 3

The only way I can think of to do this in pandas is with a list comprehension, rather than with any vectorized pandas function:

In [55]: example = pd.Series({'a1':1,'a2':2,'b1':3,'b2':4}) 

In [56]: example 
Out[56]: 
a1 1 
a2 2 
b1 3 
b2 4 
dtype: int64 

In [63]: sum([example[x] for x in example.index if re.search('^a',x)]) 
Out[63]: 3

Is there an equivalent vectorized approach in pandas?

Source

2013-09-16 hatmatrix

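In pandas v0.12.0 you can convert the Index to a Series and use str.contains to search the labels (s here is the same Series as example in the question).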

In [12]: s[s.index.to_series().str.contains('^a')].sum() 
Out[12]: 3
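
For readers on a more recent pandas, here is a minimal sketch of the same idea. It assumes a version in which the index exposes string methods directly through its .str accessor, so the to_series() step is not needed. The names mask and total are only illustrative.

```python
import pandas as pd

example = pd.Series({'a1': 1, 'a2': 2, 'b1': 3, 'b2': 4})

# Boolean mask over the index labels: True where the label matches the regex ^a.
mask = example.index.str.contains('^a')

# Boolean indexing keeps only the matching entries; sum then aggregates them.
total = example[mask].sum()
print(total)
```

The mask plays the same role as grepl in the R example: it is the predicate, evaluated once over all labels rather than label by label in a Python loop.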

In v0.13.0, use the Series.filter method:

In [6]: s = pd.Series([1,2,3,4], index=['a1','a2','b1','b2']) 

In [7]: s.filter(regex='^a') 
Out[7]: 
a1 1 
a2 2 
dtype: int64 

In [8]: s.filter(regex='^a').sum() 
Out[8]: 3

Note: the behavior of filter is untested in pandas git master, so I would use it with caution for now. There is an open issue to address this.
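
A small self-contained sketch of the filter approach follows. It assumes a pandas version in which Series.filter accepts the regex and like keywords, so check it against your installed version. The names a_sum and b_sum are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a1', 'a2', 'b1', 'b2'])

# filter selects entries by index label (not by value).
# regex: keep labels for which re.search(regex, label) matches.
a_sum = s.filter(regex='^a').sum()

# like: keep labels that contain the given substring.
b_sum = s.filter(like='b').sum()

print(a_sum, b_sum)
```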

Source

2013-09-16 18:32:40

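I wonder whether the str methods should be available directly on Index. –

I think there is an issue for that somewhere... although I can't remember exactly where. –

@AndyHayden I forgot that the 'NDFrame.filter' method will work in 0.13! –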

You can use groupby, which can apply a function to the index values (in this case, looking at the first character of each label):

In [11]: example.groupby(lambda x: x[0]).sum() 
Out[11]: 
a 3 
b 7 
dtype: int64 

In [12]: example.groupby(lambda x: x[0]).sum()['a'] 
Out[12]: 3
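
The grouping function does not have to return the first character. As a sketch, it can encode an arbitrary predicate on the label. The function label_group and the group names 'starts_with_a' and 'other' are made up for this illustration.

```python
import re

import pandas as pd

example = pd.Series({'a1': 1, 'a2': 2, 'b1': 3, 'b2': 4})

def label_group(label):
    # Called once per index label; the return value becomes the group key.
    return 'starts_with_a' if re.search('^a', label) else 'other'

# One aggregated value per group key.
sums = example.groupby(label_group).sum()
total = sums['starts_with_a']
print(total)
```

This computes the sum for every group, including the ones you do not need, which is the price for getting all the aggregates in one pass.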

Source

2013-09-16 18:31:16

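+1 I always forget that you can pass a callable to 'groupby'. –

Very elegant, but I would guess that 'groupby' does more computation than necessary... – hatmatrix

@crippledlambda Rather than guessing, you should always test (e.g. with '%timeit'); you may be surprised. Although in this case filter is more elegant! –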
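
Outside IPython, the same kind of measurement can be sketched with the standard timeit module. The Series big, its size, and the helper names with_filter and with_groupby are assumptions chosen for illustration. The timings depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import pandas as pd

# Hypothetical larger Series, so that fixed call overhead does not dominate.
labels = ['a%d' % i for i in range(5000)] + ['b%d' % i for i in range(5000)]
big = pd.Series(range(10000), index=labels)

def with_filter():
    # Select the labels matching ^a, then sum only those values.
    return big.filter(regex='^a').sum()

def with_groupby():
    # Sum every first-letter group, then pick out the 'a' group.
    return big.groupby(lambda x: x[0]).sum()['a']

# Total time in seconds for 20 calls of each approach.
filter_time = timeit.timeit(with_filter, number=20)
groupby_time = timeit.timeit(with_groupby, number=20)
print('filter: %.4fs  groupby: %.4fs' % (filter_time, groupby_time))
```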
