2013-05-19 61 views
14

int.__mul__ executes 2x slower than operator.mul

If you look at the following timings:

C:\Users\Henry>python -m timeit -s "mul = int.__mul__" "reduce(mul,range(10000))" 
1000 loops, best of 3: 908 usec per loop 

C:\Users\Henry>python -m timeit -s "from operator import mul" "reduce(mul,range(10000))" 
1000 loops, best of 3: 410 usec per loop 

There is a significant difference in execution speed between reduce(int.__mul__, range(10000)) and reduce(mul, range(10000)), with the latter being faster.

I used the dis module to look at what was happening.

Using the int.__mul__ method:

C:\Users\Henry>python 
Python 2.7.4 (default, Apr 6 2013, 19:55:15) [MSC v.1500 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information. 
>>> mul = int.__mul__ 
>>> def test(): 
...  mul(1,2) 
... 
>>> import dis 
>>> dis.dis(test) 
    2   0 LOAD_GLOBAL    0 (mul) 
       3 LOAD_CONST    1 (1) 
       6 LOAD_CONST    2 (2) 
       9 CALL_FUNCTION   2 
      12 POP_TOP 
      13 LOAD_CONST    0 (None) 
      16 RETURN_VALUE 
>>> 

And using the operator.mul method:

C:\Users\Henry>python 
Python 2.7.4 (default, Apr 6 2013, 19:55:15) [MSC v.1500 64 bit (AMD64)] on win32 
Type "help", "copyright", "credits" or "license" for more information. 
>>> from operator import mul 
>>> def test(): 
...  mul(1,2) 
... 
>>> import dis 
>>> dis.dis(test) 
    2   0 LOAD_GLOBAL    0 (mul) 
       3 LOAD_CONST    1 (1) 
       6 LOAD_CONST    2 (2) 
       9 CALL_FUNCTION   2 
      12 POP_TOP 
      13 LOAD_CONST    0 (None) 
      16 RETURN_VALUE 
>>> 

They appear to be the same, so why is there a difference in execution speed? I am referring to the CPython implementation of Python.


The same happens on Python 3:

$ python3 -m timeit -s 'mul=int.__mul__;from functools import reduce' 'reduce(mul, range(10000))' 
1000 loops, best of 3: 1.18 msec per loop 
$ python3 -m timeit -s 'from operator import mul;from functools import reduce' 'reduce(mul, range(10000))' 
1000 loops, best of 3: 643 usec per loop 
$ python3 -m timeit -s 'mul=lambda x,y:x*y;from functools import reduce' 'reduce(mul, range(10000))' 
1000 loops, best of 3: 1.26 msec per loop 
+5

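You are looking at the bytecode disassembly of 'test()', which simply calls 'mul', so it is the same in both cases. It is the two implementations of 'mul' that probably differ. –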

+0

@HristoIliev Thanks, I didn't realise it was only disassembling test. That makes more sense. I'll go and look further into how these are implemented. – HennyH

+0

Are you using Python 2? The problem might be that int's mul overflows and calls long's mul, while operator avoids these extra calls. – Bakuriu

Answer

14

int.__mul__ is a slot wrapper, namely a PyWrapperDescrObject, while operator.mul is a built-in function. I think the difference in execution speed is caused by this difference.

>>> int.__mul__ 
<slot wrapper '__mul__' of 'int' objects> 
>>> operator.mul 
<built-in function mul> 
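
The same distinction can be checked programmatically. The following is a minimal sketch, assuming CPython (the type names are implementation details and other interpreters may report something else); the variable names slot_type and builtin_type are only illustrative.

```python
import operator

# Names of the C-level types behind each callable (CPython-specific).
slot_type = type(int.__mul__).__name__
builtin_type = type(operator.mul).__name__

print(slot_type)     # slot wrapper type
print(builtin_type)  # built-in function type

# Both callables compute the same product; only the call path differs.
assert int.__mul__(6, 7) == operator.mul(6, 7) == 42
```

So the two objects are interchangeable in terms of results, and any speed difference has to come from how each one is dispatched.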

When we call a PyWrapperDescrObject, wrapperdescr_call is called.


static PyObject * 
wrapperdescr_call(PyWrapperDescrObject *descr, PyObject *args, PyObject *kwds) 
{ 
    Py_ssize_t argc; 
    PyObject *self, *func, *result; 

    /* Make sure that the first argument is acceptable as 'self' */ 
    assert(PyTuple_Check(args)); 
    argc = PyTuple_GET_SIZE(args); 
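    if (argc < 1) { 
     PyErr_Format(PyExc_TypeError, 
        "descriptor '%.300s' of '%.100s' " 
        "object needs an argument", 
        descr_name((PyDescrObject *)descr), 
        descr->d_type->tp_name); 
     return NULL; 
    } 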
    self = PyTuple_GET_ITEM(args, 0); 
    if (!_PyObject_RealIsSubclass((PyObject *)Py_TYPE(self), 
            (PyObject *)(descr->d_type))) { 
     PyErr_Format(PyExc_TypeError, 
        "descriptor '%.200s' " 
        "requires a '%.100s' object " 
        "but received a '%.100s'", 
        descr_name((PyDescrObject *)descr), 
        descr->d_type->tp_name, 
        self->ob_type->tp_name); 
     return NULL; 
    } 

    func = PyWrapper_New((PyObject *)descr, self); 
    if (func == NULL) 
     return NULL; 
    args = PyTuple_GetSlice(args, 1, argc); 
    if (args == NULL) { 
     Py_DECREF(func); 
     return NULL; 
    } 
    result = PyEval_CallObjectWithKeywords(func, args, kwds); 
    Py_DECREF(args); 
    Py_DECREF(func); 
    return result; 
} 

Let us look at what we found!

func = PyWrapper_New((PyObject *)descr, self); 

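A new PyWrapper object is constructed on every call. That slows execution down significantly: sometimes creating a new object takes more time than running a simple function.
So it is not surprising that int.__mul__ is slower than operator.mul.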
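
To see the intermediate object from Python, and to repeat the measurement without the command line, here is a rough sketch assuming CPython 3. The helper time_reduce and its parameters are hypothetical names introduced for illustration. The size of the gap (or whether there is one at all) depends on the interpreter version, since the C call path quoted above is from Python 2.7 and may have changed since.

```python
import operator
import timeit
from functools import reduce

# Binding the slot wrapper to an instance yields a bound 'method-wrapper'
# object; this is the Python-level counterpart of what PyWrapper_New builds.
bound = int.__mul__.__get__(6)
bound_type = type(bound).__name__
product = bound(7)


def time_reduce(func, n=10000, repeat=5, number=100):
    """Return the best per-call time in seconds for reduce(func, range(n))."""
    timer = timeit.Timer(lambda: reduce(func, range(n)))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    print(bound_type, product)
    print("int.__mul__ :", time_reduce(int.__mul__))
    print("operator.mul:", time_reduce(operator.mul))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that python -m timeit reports, which makes the numbers less sensitive to background load.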