2015-12-06 52 views

Answers

1

Use dot:

import numpy as np 
import pandas as pd 

np.random.seed(0) 

# Numpy 
m1 = np.random.randn(5, 5) 
m2 = np.random.randn(5, 5) 

>>> m1.dot(m2) 
array([[ -5.51837355,  -4.08559942,  -1.88020209,   2.88961281,
          0.61755013],
       [  1.4732264 ,  -0.2394676 ,  -0.34717755,  -4.18527913,
         -1.75550855],
       [ -0.1871964 ,   0.76399007,  -0.26550057,  -3.43359244,
         -0.68081106],
       [ -0.23996774,   0.95331428,  -2.833788  ,  -0.37940614,
          0.05464387],
       [  3.73328914,  -0.59578959,   3.96803224, -10.65362381,
         -4.34460348]])

# Pandas 
df1 = pd.DataFrame(m1) 
df2 = pd.DataFrame(m2) 

>>> df1.dot(df2) 
          0         1         2          3         4
0 -5.518374 -4.085599 -1.880202   2.889613  0.617550
1  1.473226 -0.239468 -0.347178  -4.185279 -1.755509
2 -0.187196  0.763990 -0.265501  -3.433592 -0.680811
3 -0.239968  0.953314 -2.833788  -0.379406  0.054644
4  3.733289 -0.595790  3.968032 -10.653624 -4.344603

df3 = pd.DataFrame(np.random.randn(5, 3)) 
df4 = pd.DataFrame(np.random.randn(3, 5)) 

>>> df3.dot(df4) 
          0         1         2         3         4
0  0.991673  1.954500  0.322110  0.493841  0.080462
1  0.160482  1.548039 -0.826426  0.972538 -0.048610
2  0.628194  0.482943  0.742597 -0.236226  0.089525
3 -0.098316  0.817702 -0.725945  1.271506 -0.309596
4 -1.053413  0.948427 -2.445940  2.814147 -0.726829
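
One detail worth keeping in mind with DataFrame.dot is that it matches the column labels of the left frame against the index labels of the right frame, rather than multiplying purely by position. The following is a minimal sketch of that behaviour; the frames `left` and `right` and their labels are made up for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: the columns of `left` and the index of `right`
# carry the same labels, but in a different order.
left = pd.DataFrame(np.arange(6.0).reshape(2, 3), columns=["a", "b", "c"])
right = pd.DataFrame(np.arange(6.0).reshape(3, 2), index=["c", "b", "a"])

# DataFrame.dot aligns left.columns with right.index before multiplying.
by_label = left.dot(right)

# Multiplying the raw arrays ignores the labels and works by position only.
by_position = left.values.dot(right.values)

print(by_label)
print(by_position)
```

The two results differ here because the labels are shuffled. If the labels do not match at all, DataFrame.dot raises a ValueError instead of multiplying.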

1

As an alternative to the well-known dot function, you can use numpy.matmul if you have NumPy version >= 1.10.0:

import numpy as np 
import pandas as pd 

np.random.seed(632) 
df1 = pd.DataFrame(np.random.randn(7, 7)) 
df2 = pd.DataFrame(np.random.randn(7, 7)) 

In [68]: np.matmul(df1, df2) 
Out[68]: 
array([[ 0.08535756, -3.05102895,  3.26148284, -6.27736384, -1.52042691,
         2.40667207, -0.6385153 ],
       [ 5.29731049, -0.94033606, -0.12675555,  1.10453597, -1.70722837,
         2.57797682,  2.37629556],
       [ 0.31841755, -1.46897738, -0.22734008, -4.37852181, -0.98948844,
         3.49939092, -1.36656608],
       [ 0.90757446, -4.6364365 ,  1.86254589, -4.89078986,  0.31928714,
         2.3442364 , -2.29896007],
       [-1.14428758,  6.69735827, -3.8776982 ,  6.87574565,  1.38854952,
        -2.88767356,  1.46302112],
       [ 0.8771236 , -2.01941938,  1.03461007,  0.30331467,  2.39161032,
         0.07345672, -1.30557339],
       [ 0.94310211, -0.54294898,  2.46147932, -3.21588748, -2.98369364,
         3.73941015,  1.31782966]])

The performance of np.dot and np.matmul is almost the same:

In [71]: %timeit np.dot(df1, df2) 
10000 loops, best of 3: 63.7 µs per loop 

In [73]: %timeit np.matmul(df1, df2) 
10000 loops, best of 3: 64.2 µs per loop 

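Still, df1.dot(df2) is the better choice when you want a DataFrame back, because it keeps the index and column labels. Note that it is slower than the NumPy calls in this benchmark: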

In [82]: %timeit df1.dot(df2) 
1000 loops, best of 3: 217 µs per loop
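
To check that these approaches agree numerically, something like the sketch below can be used. It assumes NumPy >= 1.17 (for `default_rng`), pandas >= 0.24 (for `to_numpy`), and Python >= 3.5 (for the `@` operator). The seed and variable names are illustrative, and the random values will not match the output shown above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(632)  # arbitrary seed, for reproducibility only
df1 = pd.DataFrame(rng.standard_normal((7, 7)))
df2 = pd.DataFrame(rng.standard_normal((7, 7)))

# Work on the underlying arrays for the pure NumPy variants.
a, b = df1.to_numpy(), df2.to_numpy()
via_dot = np.dot(a, b)
via_matmul = np.matmul(a, b)

# The pandas variants return a labelled DataFrame.
via_frame = df1.dot(df2)
via_operator = df1 @ df2  # same operation as df1.dot(df2)

print(np.allclose(via_dot, via_matmul))
print(np.allclose(via_dot, via_frame.to_numpy()))
print(np.allclose(via_dot, via_operator.to_numpy()))
```

All three checks should print True. The practical difference is the return type: a plain ndarray from NumPy, a labelled DataFrame from pandas.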