2016-02-04

Slow ffi.cast in LuaJIT

Could you please explain the poor performance of ffi.cast in the snippet below?

 
prof = require 'profile' 

local ffi = require("ffi") 

ffi.cdef[[ 
struct message { 
    int field_a; 
}; 

]] 

function cast_test1() 
    bytes = ffi.new("char[100000000]") 

    sum = 0 
    t1 = prof.rdtsc() 
    for i=1,1000000 do 
     sum = sum + i 
    end 
    t2 = prof.rdtsc() 

    print("test1", tonumber(t2-t1)) 
end 

function cast_test2() 
    bytes = ffi.new("char[100000000]") 

    sum = 0 
    t1 = prof.rdtsc() 
    for i=1,1000000 do 
     sum = sum + i 
     msg = ffi.cast("struct message *", bytes+ i * 16) 
--  msg.field_a = i 
    end 
    t2 = prof.rdtsc() 

    print("test2", tonumber(t2-t1)) 
end 

cast_test1() 
cast_test2() 

It looks like the loop with the cast runs about 30 times slower. Any ideas on how to overcome this?

 
% luajit -v cast_tests.lua 
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall. http://luajit.org/ 
test1 3227528 
test2 94474000 
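Comments

Did you try comparing the output of luajit -jv and -jdump, with the different functions placed in separate files? The second loop allocates objects and involves the GC ...

Yes, I can see the allocations. But the question is how to get rid of them and use the cast efficiently.

Use 'local'! Lua has no automatic scoping; all variables in this example (except 'ffi') are globals. This hurts performance.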
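
The scoping point in the comment above can be checked directly. The following is a minimal sketch (the function and variable names are hypothetical, chosen only for illustration): an assignment without 'local' inside a function lands in the global table, while a 'local' one does not.

```lua
-- Hypothetical names, for illustration only.
local function assigns_global()
    leaked = 42          -- no 'local': this writes to the global table _G
end

local function assigns_local()
    local kept = 42      -- 'local': visible only inside this function
    return kept
end

assigns_global()
local kept_value = assigns_local()

-- rawget on _G shows which names ended up as globals.
local leaked_is_global = rawget(_G, "leaked") ~= nil
local kept_is_global = rawget(_G, "kept") ~= nil
print(leaked_is_global, kept_is_global)
```

In the question's code, 'bytes', 'sum', 't1', 't2' and 'msg' are all assigned the same way as 'leaked' here.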

Answers

It looks like the global msg variable is the main culprit. Replacing it with a local gives a 20x speedup :)

This holds for both luajit-2.0.3 and luajit-2.1.

 
function cast_test3() 
    local bytes = ffi.new("char[100000000]") 
    local sum = 0 
    local t1 = prof.rdtsc() 
    for i=1,1000000 do 
     sum = sum + i 
     local msg = ffi.cast("struct message *", bytes+ i * 4) 
     msg.field_a = i 
    end 
    local t2 = prof.rdtsc() 
    local sum2 = 0 
    for i=1,1000000 do 
     local msg = ffi.cast("struct message *", bytes+ i * 4) 
     sum2 = sum2 + msg.field_a 
    end 

    local t3 = prof.rdtsc() 
    print(sum, sum2) 
    print("test3", tonumber(t2-t1), tonumber(t3-t2)) 
end 

cast_test3() 

The results:

 
% /usr/bin/luajit -v cast_tests.lua
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall. http://luajit.org/ 
500000500000 500000500000 
test3 4502508 4850884
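
When the messages are tightly packed, as with the 4-byte stride in cast_test3, the per-iteration cast can also be avoided altogether. The following is a sketch under that assumption, not a measured result: it casts the buffer once and indexes the resulting pointer. The element count N and the names 'msgs' and 'N' are illustrative and do not come from the original code.

```lua
local ffi = require("ffi")

ffi.cdef[[
struct message {
    int field_a;
};
]]

-- Illustrative size; one extra slot because the loops below use 1-based i.
local N = 1000
local bytes = ffi.new("char[?]", (N + 1) * ffi.sizeof("struct message"))

-- Cast once, outside the loops. 'bytes' must stay referenced while
-- 'msgs' is in use, since the pointer does not keep the buffer alive.
local msgs = ffi.cast("struct message *", bytes)

for i = 1, N do
    msgs[i].field_a = i      -- same address as bytes + i * sizeof(struct message)
end

local sum = 0
for i = 1, N do
    sum = sum + msgs[i].field_a
end
print(sum)
```

This only applies when the stride equals the struct size; with a different stride (such as the 16 bytes used in cast_test2) the cast per iteration, kept in a local, is still needed.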