Why is the "s" flag missing from JavaScript RegExp?


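I love regular expressions. However, I just found that I can't use the s flag when running a JavaScript RegExp in the browser. I'm curious why this flag isn't included; it would be really helpful.

I've seen that there is an external library, XRegExp, which enables the s flag (and a few others), but I'm also curious why these extra (and useful) flags don't exist in standard JavaScript. I'm also reluctant to include yet another external library...

Here is an example where I'm trying to detect the open/closing tags of WordPress shortcodes, which may contain newlines (or where I have to insert newlines to improve detection).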

//
// Let's take some input text, e.g. WordPress shortcodes
//
var exampleText = '[buttongroup class="vertical"][button content="Example 1" class="btn-default"][/button][button class="btn-primary"]Example 2[/button][/buttongroup]'

//
// Now let's say I want to extract the shortcodes and their attributes,
// keeping in mind shortcodes may or may not have closing tags
//
// Shortcodes which have content between the open/closing tags can contain
// newlines. One of the issues with the flags is that I can't use `s` to make
// the dot character match newlines.
//
// When I run this on regex101.com they support the `s` flag (probably with the
// XRegExp library) and everything seems to work well. However when running this
// in the browser I get the "Uncaught SyntaxError: Invalid regular expression
// flags" error.
//
var reGetButtons = /\[button(?:\s+([^\]]+))?\](?:(.*)\[\/button\])?/gims
var reGetButtonGroups = /\[buttongroup(?:\s+([^\]]+))?\](?:(.*)\[\/buttongroup\])?/gims

//
// Some utility methods to extract attributes:
//

// Get an attribute's value
//
// @param string input
// @param string attrName
// @returns string
function getAttrValue (input, attrName) {
    var attrValue = new RegExp(attrName + '=\"([^\"]+)\"', 'g').exec(input)
    return (attrValue ? window.decodeURIComponent(attrValue[1]) : '')
}

// Get all named shortcode attribute values as an object
//
// @param string input
// @param array shortcodeAttrs
// @returns object
function getAttrsFromString (input, shortcodeAttrs) {
    var output = {}
    for (var index = 0; index < shortcodeAttrs.length; index++) {
        output[shortcodeAttrs[index]] = getAttrValue(input, shortcodeAttrs[index])
    }
    return output
}

//
// Extract all the buttons and get all their attributes and values
//
function replaceButtonShortcodes (input) {
    return input
    //
    // Need this to avoid some tomfoolery.
    // By splitting into newlines I can better detect between open/closing tags,
    // however it goes out the window when newlines are within the
    // open/closing tags.
    //
    // It's possible my RegExps above need some adjustments, but I'm unsure how,
    // or maybe I just need to replace newlines with a special character that I
    // can then swap back with newlines...
    //
    .replace(/\]\[/g, ']\n[')
    // Find and replace the [button] shortcodes
    .replace(reGetButtons, function (all, attr, content) {
        console.log('Detected [button] shortcode!')
        console.log('-- Extracted shortcode components', { all: all, attr: attr, content: content })

        // Build the output button's HTML attributes
        var attrs = getAttrsFromString(attr, ['class','content'])
        console.log('-- Extracted attributes', { attrs: attrs })

        // Return the button's HTML
        return '<button class="btn ' + (typeof attrs.class !== 'undefined' ? attrs.class : '') + '">' + (content ? content : attrs.content) + '</button>'
    })
}

//
// Extract all the button groups like above
//
function replaceButtonGroupShortcodes (input) {
    return input
    // Same as above...
    .replace(/\]\[/g, ']\n[')
    // Find and replace the [buttongroup] shortcodes
    .replace(reGetButtonGroups, function (all, attr, content) {
        console.log('Detected [buttongroup] shortcode!')
        console.log('-- Extracted shortcode components', { all: all, attr: attr, content: content })

        // Build the output button group's HTML attributes
        var attrs = getAttrsFromString(attr, ['class'])
        console.log('-- Extracted attributes', { attrs: attrs })

        // Return the button group's HTML
        return '<div class="btn-group ' + (typeof attrs.class !== 'undefined' ? attrs.class : '') + '">' + (typeof content !== 'undefined' ? content : '') + '</div>'
    })
}

//
// Do all the extraction on our example text and set within the document's HTML
//
var outputText = replaceButtonShortcodes(exampleText)
outputText = replaceButtonGroupShortcodes(outputText)
document.write(outputText)

Using the s flag would let me do this easily, but since it isn't supported, I can't take advantage of it.

Source

2017-09-25 Matt Scheurich

A warning to anyone tempted to look at 'regexp101.com': __Don't__. It looks like a malware site. – Andy

https://meta.stackoverflow.com/a/293819/47589 – Amy

@Andy: Whereas http://regex101.com (without the p) isn't (as far as I know), and I and others have been using it for years. Matt, was that just a typo in your code comment? –

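There's no grand logic behind it. It just wasn't included, like other regular expression features that other environments have and JavaScript doesn't (so far).

It is in the process of being added now. The proposal was at Stage 3 when this answer was first written; it has since reached Stage 4 (as of 2017), so it will be in ES2018, and you'll see support being added to cutting-edge browsers as soon as possible.

(Look-behind and Unicode property escapes are also on the cards...)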
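Until the flag is available everywhere you need it, one way to cope is to probe for it at runtime. The following is only a minimal sketch: `supportsDotAll` and `makeButtonPattern` are hypothetical helper names, not part of the question's code, and the simplified `[button]` pattern is an assumption made for illustration. The probe goes through the `RegExp` constructor because an unsupported flag in a regex literal fails when the script is parsed, where it cannot be caught.

```javascript
// Hypothetical helper: reports whether this engine accepts the `s` (dotAll) flag.
function supportsDotAll() {
  try {
    // The RegExp constructor throws a SyntaxError for unknown flags,
    // so the failure can be caught at runtime.
    new RegExp('.', 's');
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: builds a simplified [button]...[/button] pattern whose
// "any character" part matches newlines either way.
// With `s`, `.` matches line terminators; otherwise fall back to `[\s\S]`.
function makeButtonPattern() {
  return supportsDotAll()
    ? new RegExp('\\[button\\](.*?)\\[/button\\]', 'gs')
    : new RegExp('\\[button\\]([\\s\\S]*?)\\[/button\\]', 'g');
}

var multiLine = '[button]Example\n2[/button]';
var match = makeButtonPattern().exec(multiLine);
console.log(match && match[1]); // the content, including the newline
```

Both branches capture the same text, so calling code does not need to know which one was taken.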

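Side note:

"When I run this on regex101.com they support the s flag..."

Not if you set the regex type to JavaScript via the menu. Click the menu button in the top left:

...and change the "flavor" to JavaScript:

You probably left it at its default, which is PCRE, and PCRE does support the s flag.

They used to make this more obvious. Now that they've hidden it away in a menu, you're not remotely the first person I've seen who didn't have it set correctly...

Source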

2017-09-25 15:40:21

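Thanks for the link! I also managed to solve my problem using the '[^]' hack. –

@Matt Actually '[^]' is not a hack: it is an empty negated character class, so it matches "not nothing", i.e. literally any character, newlines included. –

@MattScheurich You can also use '[\s\S]', which also works with other regex engines. –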
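To make the workarounds from these comments concrete, here is a small sketch applied to the question's `[button]` pattern. The variable names and the sample input are assumptions for illustration. It also swaps the greedy `*` for a lazy `*?`, which is not in the original pattern; without it, a class that matches newlines would run on to the last `[/button]` in the input.

```javascript
var input = '[button class="btn-primary"]Example\n2[/button]';

// Original approach without `s`: `.` stops at the newline, so the optional
// content/closing-tag group fails and is skipped.
var withDot = /\[button(?:\s+([^\]]+))?\](?:(.*)\[\/button\])?/gim;

// `[\s\S]` matches any character, including line terminators,
// and is understood by most regex engines.
var withClass = /\[button(?:\s+([^\]]+))?\](?:([\s\S]*?)\[\/button\])?/gi;

// `[^]` (empty negated class) also matches any character,
// but it is specific to JavaScript.
var withEmptyNegation = /\[button(?:\s+([^\]]+))?\](?:([^]*?)\[\/button\])?/gi;

var dotMatch = withDot.exec(input);
var classMatch = withClass.exec(input);
var emptyNegationMatch = withEmptyNegation.exec(input);

console.log(dotMatch[2]);           // undefined
console.log(classMatch[2]);         // "Example\n2"
console.log(emptyNegationMatch[2]); // "Example\n2"
```

If the pattern ever needs to be shared with PCRE or another engine, `[\s\S]` is the safer choice of the two.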
