2013-10-30 38 views
12

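Bad RegEx performance in Firefox

I use the JavaScript parser generator JISON to create a parser for some scripts that my users create. Recently I noticed that parsing on Firefox is much slower than on any other browser my page supports (IE10, latest Chrome & Opera).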

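After digging a bit into the source code of the generated parser, I narrowed the problem down to a single line of code, which executes a regular expression to tokenize the code being parsed. Naturally, this line is executed very often.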

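I created a small test case with a random string (~1300 characters long) and a very generic regular expression. The test case measures the average time it takes to execute the regex 10000 times (Working example on JSFiddle):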

$(document).ready(function() { 
    var str = 'asdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj', 
        regex = new RegExp('^([0-9])+'),
        durations = [],
        resHtml = 'Durations:',
        totalDuration = 0,
        matches, start;

    // Perform "timing test" 10 times to get some average duration
    for (var i = 0; i < 10; i++) {
        // Execute regex 10000 times and see how long it takes
        start = window.performance.now();
        for (var j = 0; j < 10000; j++) {
            regex.exec(str);
        }
        durations.push(window.performance.now() - start);
    }

    // Create output string and update DIV
    for (var i = 0; i < durations.length; i++) {
        totalDuration += durations[i];
        resHtml += '<br>' + i + ': ' + (parseInt(durations[i] * 100, 10) / 100) + ' ms';
    }
    resHtml += '<br>==========';
    resHtml += '<br>Avg: ' + (parseInt((totalDuration / durations.length) * 100, 10) / 100) + ' ms';

    $('#result').html(resHtml);
});

Here are the test results on my machine:

Firefox 24: average time is between 370 & 450 ms for 10000 regex executions
Chrome 30, Opera 17, IE 10: average time in ms is between 0.3 & 0.

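The difference grows as the string to test gets longer. A 6000-character string increases the average time in Firefox to ~1.5 seconds (!), while the other browsers still take ~0.5 ms (!) (Working example on JSFiddle with 6000 characters).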

Why is there such a big performance difference between Firefox and all the other browsers, and can I improve it?

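Note that I can't adjust the executed regexes myself, since most of them are generated by the parser generator, and I don't want to manually change the generated parser code.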

Answers

2

It's the RegExp capture grouping that's doing you in:

/^[0-9]+/ and/or /^(?:[0-9])+/ and/or /^([0-9]+)/ are orders of magnitude faster than /^([0-9])+/. They should be viable alternatives.

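I'd expect it to be somewhat slower when capturing groups, but a difference this large surprises me. However, the slow version potentially creates lots and lots of captures, while the other versions don't, so that seems to be an important difference.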
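To make the difference between the four patterns concrete, here is a minimal sketch that runs each of them against a short input and records the full match and the first capture group. The input string and the property names (`quantifiedCapture`, `noGroup`, and so on) are made up for illustration and do not come from the generated parser.

```javascript
// Illustrative input (an assumption): five digits followed by letters.
var input = '12345abc';

// The slow pattern and the three alternatives named above.
var patterns = {
    quantifiedCapture: /^([0-9])+/,       // capture group repeated by "+"
    noGroup: /^[0-9]+/,                   // no group at all
    nonCapturing: /^(?:[0-9])+/,          // group that does not capture
    captureAroundQuantifier: /^([0-9]+)/  // one capture around the whole run
};

// Collect the full match and the first capture group for each pattern.
var results = {};
Object.keys(patterns).forEach(function (name) {
    var m = patterns[name].exec(input);
    results[name] = { match: m[0], group1: m[1] };
});

console.log(results);
// quantifiedCapture:       match '12345', group1 '5' (last iteration only)
// noGroup, nonCapturing:   match '12345', group1 undefined
// captureAroundQuantifier: match '12345', group1 '12345'
```

All four patterns consume the same text, so they are interchangeable only when the calling code reads the full match (`m[0]`) and ignores the capture groups. Whether that holds for a particular generated lexer has to be checked against its source.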

Unscientific jsperf. You may want to file a bug.
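A comparison of this kind can be sketched roughly as follows. This is not the linked jsperf test, only an equally unscientific stand-in: `timeRegex` is a hypothetical helper, the test string is generated filler of about 1300 characters rather than the string from the question, and the numbers it prints depend entirely on the engine it runs in.

```javascript
// Use the high-resolution timer where available, else fall back to Date.now.
var now = (typeof performance !== 'undefined' && typeof performance.now === 'function')
    ? function () { return performance.now(); }
    : function () { return Date.now(); };

// Hypothetical helper: time `iterations` executions of `regex` against `str`.
function timeRegex(regex, str, iterations) {
    var start = now();
    for (var i = 0; i < iterations; i++) {
        regex.exec(str);
    }
    return now() - start;
}

// Assumed test string: 130 copies of a 10-character chunk, i.e. 1300
// characters containing no digits, so none of the patterns matches.
var str = new Array(131).join('asdf asdf ');

// The slow pattern first, then the three suggested alternatives.
var candidates = [/^([0-9])+/, /^[0-9]+/, /^(?:[0-9])+/, /^([0-9]+)/];

var timings = candidates.map(function (regex) {
    return { pattern: regex.source, ms: timeRegex(regex, str, 10000) };
});

timings.forEach(function (t) {
    console.log(t.pattern + ': ' + t.ms.toFixed(2) + ' ms');
});
```

Running the same script in each browser of interest gives a quick impression of whether the quantified capture is the outlier there.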

+4

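It is much slower because the regex JIT that Firefox and Safari use (Yarr) currently cannot compile regexes with quantified captures (in the example above, the '+' after the capturing parens). See https://bugs.webkit.org/show_bug.cgi?id=122891 for the bug tracking this issue. As a result, the regex is executed in the Yarr regexp interpreter, which is of course much slower than running JITted code.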
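If the capture groups in the generated rules are never read, one possible workaround that avoids hand-editing the parser is to post-process the regex sources and turn capturing groups into non-capturing ones. The sketch below is only an illustration under that assumption: `toNonCapturing` is a hypothetical helper, not part of JISON, and it would break patterns that rely on backreferences such as `\1`.

```javascript
// Hypothetical helper: rewrite every plain capturing group "(" as "(?:".
// Escaped characters and character classes are copied through unchanged,
// and groups that already start with "(?" are left alone.
function toNonCapturing(source) {
    var out = '';
    var inClass = false; // true while inside a [...] character class
    for (var i = 0; i < source.length; i++) {
        var ch = source[i];
        if (ch === '\\') {
            // Copy the backslash and the escaped character untouched.
            out += ch + (i + 1 < source.length ? source[i + 1] : '');
            i++;
        } else if (inClass) {
            if (ch === ']') {
                inClass = false;
            }
            out += ch;
        } else if (ch === '[') {
            inClass = true;
            out += ch;
        } else if (ch === '(' && source[i + 1] !== '?') {
            out += '(?:';
        } else {
            out += ch;
        }
    }
    return out;
}

// Usage: rebuild the slow regex from the question without the capture.
var slow = new RegExp('^([0-9])+');
var rewritten = new RegExp(toNonCapturing(slow.source));
console.log(rewritten.source); // ^(?:[0-9])+
```

The rewritten pattern matches the same text as the original and differs only in that `exec` no longer reports a first capture group.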