Simple attributes with escaped double quotes

I have strings to parse that look like this: myAttribute="some text", and I am currently parsing them like this:

Pattern attributePattern = Pattern.compile("[a-z0-9]*=\"[^\"]*\"");

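However, I realized that people may want to use double quotes inside their attribute values.

For example: myAttribute="some text with a double quote \" here"

How can I adjust my regular expression to handle this?

Here is my code for parsing the attributes: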

private HashMap<String, String> findAttributes(String macroAttributes) { 
    Matcher matcher = attributePattern.matcher(macroAttributes); 
    HashMap<String, String> map = new HashMap<String, String>(); 
    while (matcher.find()) { 
     String attribute = macroAttributes.substring(matcher.start(), matcher.end()); 
     int equalsIndex = attribute.indexOf("="); 
     String attrName = attribute.substring(0, equalsIndex); 
     String attrValue = attribute.substring(equalsIndex+2, attribute.length()-1); 
     map.put(attrName, attrValue); 
    } 
    return map; 
} 

findAttributes("my=\"some text with a double quote \\\" here\"");

This should return a map of size 1 whose value is: some text with a double quote \" here


2013-03-04 Bruce Lowe

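You can use alternation and a positive lookbehind assertion.

(?:[^\"]*|(?<=\\\\)\")* is an alternation that matches either [^\"]* or (?<=\\\\)\"

(?<=\\\\)\" matches a ", but only when it is preceded by a backslash.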
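To see the lookbehind on its own, here is a minimal sketch. The class name LookbehindDemo, the helper escapedQuotePositions, and the sample input are illustrative assumptions, not part of the original answer. It reports the index of every quote that (?<=\\\\)\" matches in a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {

    // Returns the start index of every '"' that is directly preceded by a backslash.
    static List<Integer> escapedQuotePositions(String input) {
        Matcher m = Pattern.compile("(?<=\\\\)\"").matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Raw input: "a \" b"  (three quotes, only the middle one is escaped)
        String input = "\"a \\\" b\"";
        System.out.println(escapedQuotePositions(input)); // prints [4]
    }
}
```

Only the quote at index 4 is reported. The opening and closing quotes are skipped because no backslash comes right before them.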


2013-03-04 10:10:03 stema

Your solution works perfectly! I see that I now have to escape normal slashes as well, but that is fine. Many thanks – 2013-03-04 14:30:43

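You could use a negative lookbehind to check whether there is a backslash before the quote, but that fails if the backslash itself can also be escaped: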

myAttribute="some text with a trailing backslash \\"

If possible, try something like this:

Pattern.compile("[a-zA-Z0-9]+=\"([^\"\\\\]|\\\\[\"\\\\])*\"")

A quick explanation:

[a-zA-Z0-9]+  # the key 
=    # a literal '=' 
\"    # a literal '"' 
(    # start group 
    [^\"\\\\]  # any char except '\' and '"' 
    |    # OR 
    \\\\[\"\\\\] # either '\\' or '\"' 
)*    # end group and repeat zero or more times 
\"    # a literal '"'
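The difference shows up on the trailing-backslash input mentioned above. The following is a rough sketch: the LOOKBEHIND pattern is a hypothetical negative-lookbehind variant written for comparison (it does not appear in the original posts), and the class name, the parse helper, and the sample input are likewise assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingBackslashDemo {

    // Hypothetical lookbehind variant: the closing quote is any '"' not preceded by '\'.
    static final Pattern LOOKBEHIND =
            Pattern.compile("([a-zA-Z0-9]+)=\"(.*?)(?<!\\\\)\"");

    // The pattern explained above, with capturing groups added for key and value.
    static final Pattern ESCAPE_PAIRS =
            Pattern.compile("([a-zA-Z0-9]+)=\"((?:[^\"\\\\]|\\\\[\"\\\\])*)\"");

    // Collects every key/value pair the pattern finds, in order of appearance.
    static Map<String, String> parse(Pattern p, String input) {
        Map<String, String> map = new LinkedHashMap<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            map.put(m.group(1), m.group(2));
        }
        return map;
    }

    public static void main(String[] args) {
        // Raw input: a="x \\" b="y"   (the value of a ends in an escaped backslash)
        String input = "a=\"x \\\\\" b=\"y\"";
        System.out.println(parse(LOOKBEHIND, input));
        System.out.println(parse(ESCAPE_PAIRS, input));
    }
}
```

On this input the lookbehind variant treats the quote after the two backslashes as escaped, so it produces a single entry whose value runs on into the second attribute. The escape-pair pattern consumes \\ as one unit and therefore yields a and b as separate entries.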

A quick demo:

import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    // Group 1 of the pattern is the key, group 2 is the value.
    private static HashMap<String, String> findAttributes(Pattern p, String macroAttributes) {
        Matcher matcher = p.matcher(macroAttributes);
        HashMap<String, String> map = new HashMap<String, String>();
        while (matcher.find()) {
            map.put(matcher.group(1), matcher.group(2));
        }
        return map;
    }

    public static void main(String[] args) {
        final String text = "my=\"some text with a double quote \\\" here\"";
        System.out.println(findAttributes(Pattern.compile("([a-z0-9]+)=\"((?:[^\"\\\\]|\\\\[\"\\\\])*)\""), text));
        System.out.println(findAttributes(Pattern.compile("([a-z0-9]*)=\"((?:[^\"]*|(?<=\\\\)\")*)\""), text));
    }
}

This will print:

{my=some text with a double quote \" here} 
{my=some text with a double quote \}
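The value captured by the first pattern still contains the escaping backslashes. If the unescaped text is wanted instead, one possible approach is a small post-processing step. This is only a sketch: the unescape helper and the class name UnescapeDemo are hypothetical and not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapeDemo {

    private static final Pattern ATTRIBUTE =
            Pattern.compile("([a-z0-9]+)=\"((?:[^\"\\\\]|\\\\[\"\\\\])*)\"");

    // Hypothetical helper: turns \" into " and \\ into \ in a captured value.
    static String unescape(String value) {
        return value.replaceAll("\\\\([\"\\\\])", "$1");
    }

    public static void main(String[] args) {
        String text = "my=\"some text with a double quote \\\" here\"";
        Matcher m = ATTRIBUTE.matcher(text);
        while (m.find()) {
            // Prints the key and the value with the escape backslashes removed.
            System.out.println(m.group(1) + " -> " + unescape(m.group(2)));
        }
    }
}
```

Because replaceAll scans left to right without overlapping matches, a sequence such as \\\" is handled as \\ followed by \", giving a single backslash and a quote.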


2013-03-04 10:47:28

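Thanks, I tested this pattern but it does not seem to work. I have added some sample code to the question to show what I am currently doing. With your pattern, the match still seems to end at the \". Stema seems to have come up with a workable solution, so I have marked that one as correct (if you correct yours, I will gladly give you an upvote for your time and effort) – 2013-03-04 14:28:00

@BruceLowe, I just tested it too, and it works like a charm. Check out the demo I posted. – 2013-03-04 14:38:38

@BruceLowe, and note that you can use [match groups](http://www.regular-expressions.info/brackets.html) to extract the key and value (you do not need to do any 'substring' yourself) – 2013-03-04 14:50:21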
