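I have come up with a partial solution that uses an XML tool (XOM, http://www.xom.nu) to hold the tree. First the code, then an example parse. The escaped characters (\, ( and )) are first swapped for placeholders (here I use BS, LB and RB), then the remaining brackets are translated into XML tags, then the XML is parsed and the characters are re-escaped. What is still needed is a BNF for the Java 1.6 regex quantifiers such as ?:, {d,d} and so on.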
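import java.io.StringReader;
import nu.xom.Attribute;
import nu.xom.Builder;
import nu.xom.Element;
import nu.xom.Node;
import nu.xom.Text;

public static Element parseRegex(String regex) throws Exception {
    // Strings are immutable, so every replaceAll result must be reassigned.
    // Swap escaped characters for placeholders: \ -> BS, \( -> LB, \) -> RB.
    regex = regex.replaceAll("\\\\", "BS");
    regex = regex.replaceAll("BS\\(", "LB");
    regex = regex.replaceAll("BS\\)", "RB");
    // The remaining parentheses are real groups; turn them into XML tags.
    regex = regex.replaceAll("\\(", "<bracket>");
    regex = regex.replaceAll("\\)", "</bracket>");
    // Parse with XOM so that the element tree mirrors the group nesting.
    Element regexX = new Builder().build(new StringReader(
            "<regex>" + regex + "</regex>")).getRootElement();
    extractCaptureGroupContent(regexX);
    return regexX;
}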
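private static String extractCaptureGroupContent(Element regexX) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < regexX.getChildCount(); i++) {
        Node childNode = regexX.getChild(i);
        if (childNode instanceof Text) {
            // Text node: restore the escaped characters from the placeholders.
            Text t = (Text) childNode;
            String s = t.getValue();
            // Literal replace() is used here: with replaceAll(), a replacement
            // of "\\(" is read as an escaped "(" and the backslash is lost.
            s = s.replace("BS", "\\").replace("LB", "\\(").replace("RB", "\\)");
            t.setValue(s);
            sb.append(s);
        } else {
            // Nested <bracket>: recurse and wrap its content in parentheses again.
            sb.append("(" + extractCaptureGroupContent((Element) childNode) + ")");
        }
    }
    // Store the reconstructed regex text of this element as its "capture" attribute.
    String capture = sb.toString();
    regexX.addAttribute(new Attribute("capture", capture));
    return capture;
}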
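To make the substitution steps easier to follow, here is a minimal sketch that traces only the string transformations, without XOM. The class name PlaceholderTrace and the method names toXmlText and reescape are my own, made up for illustration; they are not part of the code above. The sketch assumes the regex contains no literal "BS", "LB" or "RB" and no XML-special characters such as < or &, which would need extra handling.

```java
class PlaceholderTrace {

    // Swap escaped characters for placeholders, then turn the remaining
    // (unescaped) parentheses into XML tags. Uses literal replace(), so no
    // regex escaping is needed in the arguments.
    public static String toXmlText(String regex) {
        String s = regex.replace("\\", "BS");   // every backslash -> BS
        s = s.replace("BS(", "LB");             // \( -> LB
        s = s.replace("BS)", "RB");             // \) -> RB
        s = s.replace("(", "<bracket>");        // group start
        s = s.replace(")", "</bracket>");       // group end
        return s;
    }

    // Undo the placeholders in a piece of text taken from the parsed tree.
    public static String reescape(String text) {
        return text.replace("BS", "\\")
                   .replace("LB", "\\(")
                   .replace("RB", "\\)");
    }

    public static void main(String[] args) {
        // Java source for the regex (.*(\(b\))c(d(e)))
        String regex = "(.*(\\(b\\))c(d(e)))";
        System.out.println(toXmlText(regex));
        // prints <bracket>.*<bracket>LBbRB</bracket>c<bracket>d<bracket>e</bracket></bracket></bracket>
        System.out.println(reescape("LBbRB"));
        // prints \(b\)
    }
}
```

The intermediate string is well-formed XML once it is wrapped in a root element, which is why the nesting of the capture groups can be read straight off the element tree.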
Example:
@Test
public void testParseRegex2() throws Exception {
    String regex = "(.*(\\(b\\))c(d(e)))";
    Element regexElement = ParserUtil.parseRegex(regex);
    CMLUtil.debug(regexElement, "x");
}
gives:
See @anthony. I have clarified the question – 2009-09-15 22:56:45