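I love regular expressions. However, I just found that I can't use the `s` flag when running a JavaScript RegExp in the browser. I'm curious why this flag isn't included; it would be really helpful. Why is the `s` flag missing from JavaScript's RegExp?
I've seen that there is an external library, XRegExp, which enables the `s` flag (and a few others), but I'm also curious why these extra (useful) flags don't exist in standard JavaScript too. I'm also reluctant to include yet another external library...
Here is an example where I'm trying to detect the open/closing tags of WordPress shortcodes, which may contain newlines (or where I have to insert newlines to improve detection).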
//
// Let's take some input text, e.g. WordPress shortcodes
//
var exampleText = '[buttongroup class="vertical"][button content="Example 1" class="btn-default"][/button][button class="btn-primary"]Example 2[/button][/buttongroup]'
//
// Now let's say I want to extract the shortcodes and its attributes
// keeping in mind shortcodes can or cannot have closing tags too
//
// Shortcodes which have content between the open/closing tags can contain
// newlines. One of the issues with the flags is that I can't use `s` to make
// the dot character match newlines.
//
// When I run this on regex101.com they support the `s` flag (probably with the
// XRegExp library) and everything seems to work well. However when running this
// in the browser I get the "Uncaught SyntaxError: Invalid regular expression
// flags" error.
//
var reGetButtons = /\[button(?:\s+([^\]]+))?\](?:(.*)\[\/button\])?/gims
var reGetButtonGroups = /\[buttongroup(?:\s+([^\]]+))?\](?:(.*)\[\/buttongroup\])?/gims
//
// Some utility methods to extract attributes:
//
// Get an attribute's value
//
// @param string input
// @param string attrName
// @returns string
function getAttrValue (input, attrName) {
  var attrValue = new RegExp(attrName + '=\"([^\"]+)\"', 'g').exec(input)
  return (attrValue ? window.decodeURIComponent(attrValue[1]) : '')
}
// Get all named shortcode attribute values as an object
//
// @param string input
// @param array shortcodeAttrs
// @returns object
function getAttrsFromString (input, shortcodeAttrs) {
  var output = {}
  for (var index = 0; index < shortcodeAttrs.length; index++) {
    output[shortcodeAttrs[index]] = getAttrValue(input, shortcodeAttrs[index])
  }
  return output
}
//
// Extract all the buttons and get all their attributes and values
//
function replaceButtonShortcodes (input) {
  return input
    //
    // Need this to avoid some tomfoolery.
    // By splitting into newlines I can better detect between open/closing tags,
    // however it goes out the window when newlines are within the
    // open/closing tags.
    //
    // It's possible my RegExps above need some adjustments, but I'm unsure how,
    // or maybe I just need to replace newlines with a special character that I
    // can then swap back with newlines...
    //
    .replace(/\]\[/g, ']\n[')
    // Find and replace the [button] shortcodes
    .replace(reGetButtons, function (all, attr, content) {
      console.log('Detected [button] shortcode!')
      console.log('-- Extracted shortcode components', { all: all, attr: attr, content: content })
      // Build the output button's HTML attributes
      var attrs = getAttrsFromString(attr, ['class','content'])
      console.log('-- Extracted attributes', { attrs: attrs })
      // Return the button's HTML
      return '<button class="btn ' + (typeof attrs.class !== 'undefined' ? attrs.class : '') + '">' + (content ? content : attrs.content) + '</button>'
    })
}
//
// Extract all the button groups like above
//
function replaceButtonGroupShortcodes (input) {
  return input
    // Same as above...
    .replace(/\]\[/g, ']\n[')
    // Find and replace the [buttongroup] shortcodes
    .replace(reGetButtonGroups, function (all, attr, content) {
      console.log('Detected [buttongroup] shortcode!')
      console.log('-- Extracted shortcode components', { all: all, attr: attr, content: content })
      // Build the output button group's HTML attributes
      var attrs = getAttrsFromString(attr, ['class'])
      console.log('-- Extracted attributes', { attrs: attrs })
      // Return the button group's HTML
      return '<div class="btn-group ' + (typeof attrs.class !== 'undefined' ? attrs.class : '') + '">' + (typeof content !== 'undefined' ? content : '') + '</div>'
    })
}
//
// Do all the extraction on our example text and set within the document's HTML
//
var outputText = replaceButtonShortcodes(exampleText)
outputText = replaceButtonGroupShortcodes(outputText)
document.write(outputText)
Using the `s` flag would let me do this easily, but because it isn't supported, I can't take advantage of it.
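For illustration, here is a minimal sketch of the kind of workaround that is commonly used when the `s` flag is rejected: the character class `[\s\S]` matches any character, including newlines, so it can stand in for `.`. The names `reButtonAnyChar`, `multiLineText` and `found` are made up for this sketch and are not part of the code above. The lazy `*?` is an extra change of mine so that each match stops at the first closing tag. I have not checked this against real WordPress content.

```javascript
// Assumption: the `s` flag is unavailable, so `.` cannot match newlines.
// `[\s\S]` matches any character at all, line terminators included.
// The lazy `*?` (my addition) stops each match at the first [/button].
var reButtonAnyChar = /\[button(?:\s+([^\]]+))?\](?:([\s\S]*?)\[\/button\])?/gi

// Hypothetical input with a newline between the opening and closing tags
var multiLineText = '[button class="btn-primary"]Example\n2[/button][button content="Example 1"][/button]'

// Collect the attribute string and inner content of every [button] shortcode
var found = []
var match
while ((match = reButtonAnyChar.exec(multiLineText)) !== null) {
  found.push({ attr: match[1], content: match[2] })
}

console.log(found)
```

With this pattern the `]\n[` splitting step should no longer be needed for the `[button]` case, since the content group is allowed to span lines.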
A warning to anyone tempted to look at 'regexp101.com': __Don't__. It looks like a malware site. – Andy
https://meta.stackoverflow.com/a/293819/47589 – Amy
@Andy: Whereas http://regex101.com (without the p) is not (as far as I know), and I and others have used it for years. Matt, is that just a typo in your code comment? –