I found a nice solution using pandas:
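import numpy as np
import pandas as pd

x = 50 * np.random.randn(50, 5)
dfx = pd.DataFrame(x)
# Bin column 0 into 9 equal-width intervals (10 edges)
bins = np.linspace(dfx[0].min(), dfx[0].max(), 10)
first_binning = pd.cut(dfx[0], bins)
# Bin column 1 into 4 equal-width intervals (5 edges)
bins = np.linspace(dfx[1].min(), dfx[1].max(), 5)
second_binning = pd.cut(dfx[1], bins)
# Group the rows by the pair of bins they fall into
groups = dfx.groupby([first_binning, second_binning])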
Now you can run the following (the exact output depends on your data):
In [160]: groups.size()
Out[160]:
0                    1
(-101.273, -71.403]  (50.481, 109.902]      2
(-71.403, -41.532]   (-68.362, -8.94]       4
                     (-8.94, 50.481]        3
                     (50.481, 109.902]      1
(-41.532, -11.661]   (-68.362, -8.94]       4
                     (-8.94, 50.481]        3
                     (50.481, 109.902]      2
(-11.661, 18.21]     (-127.783, -68.362]    2
                     (-8.94, 50.481]        6
                     (50.481, 109.902]      1
(18.21, 48.0806]     (-127.783, -68.362]    2
                     (-68.362, -8.94]       5
                     (-8.94, 50.481]        3
                     (50.481, 109.902]      3
(48.0806, 77.951]    (-68.362, -8.94]       2
                     (-8.94, 50.481]        4
(77.951, 107.822]    (-68.362, -8.94]       1
dtype: int64
That shows the count for each pair of bins. And:
In [163]: groups.indices
Out[163]:
{('(-101.273, -71.403]', '(50.481, 109.902]'): array([20, 37]),
('(-11.661, 18.21]', '(-127.783, -68.362]'): array([26, 39]),
('(-11.661, 18.21]', '(-8.94, 50.481]'): array([ 4, 14, 18, 34, 35, 45]),
('(-11.661, 18.21]', '(50.481, 109.902]'): array([17]),
('(-41.532, -11.661]', '(-68.362, -8.94]'): array([ 3, 13, 16, 30]),
('(-41.532, -11.661]', '(-8.94, 50.481]'): array([25, 38, 48]),
('(-41.532, -11.661]', '(50.481, 109.902]'): array([0, 5]),
('(-71.403, -41.532]', '(-68.362, -8.94]'): array([ 1, 24, 32, 47]),
('(-71.403, -41.532]', '(-8.94, 50.481]'): array([ 6, 19, 31]),
('(-71.403, -41.532]', '(50.481, 109.902]'): array([12]),
('(18.21, 48.0806]', '(-127.783, -68.362]'): array([21, 46]),
('(18.21, 48.0806]', '(-68.362, -8.94]'): array([ 2, 15, 22, 33, 40]),
('(18.21, 48.0806]', '(-8.94, 50.481]'): array([ 7, 28, 36]),
('(18.21, 48.0806]', '(50.481, 109.902]'): array([ 9, 23, 49]),
('(48.0806, 77.951]', '(-68.362, -8.94]'): array([41, 42]),
('(48.0806, 77.951]', '(-8.94, 50.481]'): array([27, 29, 43, 44]),
('(77.951, 107.822]', '(-68.362, -8.94]'): array([11])}
That shows, of course, the row indices of the dataset records in each group.
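One thing to watch for: `pd.cut` builds right-closed intervals by default, so when the lowest edge is exactly the column minimum, that minimum lands outside the first bin and is dropped. Here is a minimal sketch of the same approach that keeps every row and turns the counts into a 2-D table. The fixed seed, `include_lowest=True`, `observed=True`, and the `table` name are my own additions, not part of the code above.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption, used only to make the sketch reproducible
rng = np.random.default_rng(0)
dfx = pd.DataFrame(50 * rng.standard_normal((50, 5)))

# include_lowest=True keeps each column's minimum inside the first bin
first_binning = pd.cut(
    dfx[0], np.linspace(dfx[0].min(), dfx[0].max(), 10), include_lowest=True
)
second_binning = pd.cut(
    dfx[1], np.linspace(dfx[1].min(), dfx[1].max(), 5), include_lowest=True
)

# observed=True reports only the bin pairs that actually contain rows
groups = dfx.groupby([first_binning, second_binning], observed=True)
counts = groups.size()

# Rows are bins of column 0, columns are bins of column 1
table = counts.unstack(fill_value=0)
print(table)
```

With this change the counts add up to all 50 rows instead of losing the minimum of each column.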
Does [numpy.digitize](https://docs.scipy.org/doc/numpy/reference/generated/numpy.digitize.html) help?
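For what it's worth, here is a rough sketch of how the `numpy.digitize` route might look with the same bin edges. The seed and the names `edges0`, `edges1`, `i0`, `i1` are assumptions for illustration. Note that `np.digitize` uses left-closed bins by default, so the interval boundaries are treated slightly differently than with `pd.cut`.

```python
import numpy as np

# Fixed seed is an assumption, used only to make the sketch reproducible
rng = np.random.default_rng(0)
x = 50 * rng.standard_normal((50, 5))

edges0 = np.linspace(x[:, 0].min(), x[:, 0].max(), 10)  # 9 bins for column 0
edges1 = np.linspace(x[:, 1].min(), x[:, 1].max(), 5)   # 4 bins for column 1

# np.digitize returns i such that edges[i-1] <= v < edges[i].
# The column maximum equals the last edge and would get index len(edges),
# so clip it back into the last bin, then shift to 0-based bin numbers.
i0 = np.clip(np.digitize(x[:, 0], edges0), 1, len(edges0) - 1) - 1
i1 = np.clip(np.digitize(x[:, 1], edges1), 1, len(edges1) - 1) - 1

# Accumulate a 2-D table of counts, one cell per pair of bins
counts = np.zeros((len(edges0) - 1, len(edges1) - 1), dtype=int)
np.add.at(counts, (i0, i1), 1)
print(counts)
```

This gives the counts directly as an array, but you lose the labelled intervals and the per-group row indices that the pandas `groupby` gives you.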