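One thing I like about F# is that it has a real inline keyword. However, while it lets you write first-order functions that perform the same as pasted code blocks, things are not as rosy for higher-order functions. Why can't the F# compiler fully inline the function arguments of higher-order functions? Consider: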
let inline add i = i+1
let inline check i = if (add i) = 0 then printfn ""
let inline iter runs f = for i = 0 to runs-1 do f i
let runs = 100000000
time(fun()->iter runs check) 1
time(fun()->for i = 0 to runs-1 do check i) 1
The results are 244 ms for iter and 61 ms for the manual loop. Let's dig in with ILSpy. The relevant function for the direct call is:
internal static void [email protected](Microsoft.FSharp.Core.Unit unitVar0)
{
    for (int i = 0; i < 100000000; i++)
    {
        if (i + 1 == 0)
        {
            Microsoft.FSharp.Core.PrintfFormat<Microsoft.FSharp.Core.Unit, System.IO.TextWriter, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit> format = new Microsoft.FSharp.Core.PrintfFormat<Microsoft.FSharp.Core.Unit, System.IO.TextWriter, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit>("");
            Microsoft.FSharp.Core.PrintfModule.PrintFormatLineToTextWriter<Microsoft.FSharp.Core.Unit>(System.Console.Out, format);
        }
    }
}
Here add has been inlined. For iter, the relevant functions are:
internal static void [email protected](Microsoft.FSharp.Core.Unit unitVar0)
{
    for (int i = 0; i < 100000000; i++)
    {
        [email protected](i);
    }
}
internal static void [email protected](int i)
{
    if (i + 1 == 0)
    {
        Microsoft.FSharp.Core.PrintfFormat<Microsoft.FSharp.Core.Unit, System.IO.TextWriter, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit> format = new Microsoft.FSharp.Core.PrintfFormat<Microsoft.FSharp.Core.Unit, System.IO.TextWriter, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit, Microsoft.FSharp.Core.Unit>("");
        Microsoft.FSharp.Core.PrintfModule.PrintFormatLineToTextWriter<Microsoft.FSharp.Core.Unit>(System.Console.Out, format);
        return;
    }
}
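Here we can see that the performance loss comes from one extra level of indirection: the loop body calls a separate function instead of containing the check itself. As the performance test shows, the JIT compiler does not remove this indirection either. Is there a reason why higher-order functions cannot be fully inlined? This is quite a pain when writing computational kernels.
My time combinator (although not really relevant here) is: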
open System
open System.Diagnostics

let inline time func n =
    func() |> ignore                     // warm-up call so JIT compilation is not measured
    GC.Collect()
    GC.WaitForPendingFinalizers()
    let stopwatch = Stopwatch.StartNew()
    for i = 0 to n-1 do func() |> ignore
    stopwatch.Stop()
    printfn "Took %A ms" stopwatch.Elapsed.TotalMilliseconds
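For readers on a newer toolchain, one variant that may be worth measuring is sketched below. It is not part of the measurements above and rests on two assumptions: F# 6 or later (where the InlineIfLambda attribute exists in FSharp.Core), and an explicit lambda at the call site instead of passing check as a function value. The name iterLambda is hypothetical. Whether this actually removes the extra call depends on the compiler version, so it should be checked by timing it and inspecting the IL.

```fsharp
open System.Diagnostics

// Same helpers as in the question.
let inline add i = i + 1
let inline check i = if (add i) = 0 then printfn ""

// Hypothetical variant of iter. Assumption: F# 6+, where
// InlineIfLambda is available. The attribute asks the compiler to
// inline the argument when it is a lambda expression.
let inline iterLambda runs ([<InlineIfLambda>] f: int -> unit) =
    for i = 0 to runs - 1 do f i

let runs = 100000000

// Time a single pass, passing an explicit lambda at the call site.
let stopwatch = Stopwatch.StartNew()
iterLambda runs (fun i -> check i)
stopwatch.Stop()
printfn "Took %A ms" stopwatch.Elapsed.TotalMilliseconds
```

Comparing this timing against the two numbers above, and looking at the decompiled output again, would show whether the indirection is still present.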
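Please confirm that you ran this in release mode without a debugger attached. Other than that, the benchmark seems valid. You could eliminate the influence of one-time costs by increasing the amount of work 10x. – usr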
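@usr Yes, I ran it without a debugger and compiled in release mode. There is no doubt that the performance difference is real, since it can be deduced from the IL code (barring JIT optimizations). – Arbil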
@Arbil I have linked this question in one of the F# Language Design UserVoice topics about inlining analysis: https://fslang.uservoice.com/forums/245727-f-language/suggestions/6137978-better-inlining-analysis -and-啓發式算法