2017-01-24

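Velocity - searching in a string containing XML

I am currently working on a Velocity template that is supposed to be configured through an XML file.

I am able to read the file into a variable. To check the configuration, I now need to find a certain string in that variable. In some cases the search string may be a regular expression.

From this thread I know that I can use .matches() to search with a regex. But whatever I try (see the "test code" below), I only ever get "false" back, even when I merely try to search for one of the tags.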

<html> 
    <body> 
## this example is intended to test searching regular expressions 

## let's start with a simple example: 

#set($simpleText = "This is the string where I will try to find a substring.") 
#set($searchStr = "string") 

1 $simpleText.matches($searchStr)<BR> ## this returns false as .matches() only returns true if the parameter $searchStr (could be regular expression) matches the ENTIRE string ($simpleText) 

#set($searchStr = ".*string.*")  ## .* at the beginning and the end of the search string means any character can be before and after the 'real' search string 

2 $simpleText.matches($searchStr)<BR> ## this returns true, so adding .* at the beginning and the end of the search string seems to work. 

## let's now move on to strings containing XML (as this is the real use case) 

#set($xmlText = '<?xml version="1.0"?> 
<ItemTypes> 
    <ItemType> 
     <Display>L1 Items</Display> 
     <Fields> 
      <FieldLabel>Project ID</FieldLabel> 
      <FieldLabel>Name</FieldLabel> 
      <FieldLabel>Description</FieldLabel> 
      <FieldLabel>Assigned</FieldLabel> 
     </Fields> 
    </ItemType> 
</ItemTypes>') 

3 $xmlText<BR>       ## when printing a string containing XML tags those tags will not be visible in the printout (probably because they are interpreted as kind of html tags...) 

#set($escapedXmlText = $escapeTool.xml($xmlText)) ## escapeTool will ensure that the tags will also be printed (visible) 

4 $escapedXmlText<BR>     ## this printout will also display the tags 

## let's now try to find the string 'Display' in xmlText the same way as we did in the simple example at the beginning: 

#set($searchStr = '.*Display.*') 

5 $xmlText.matches($searchStr)<BR>   ## returns false but WHY? 
6 $escapedXmlText.matches($searchStr)<BR> ## returns false but WHY? 

    </body> 
</html> 

Does anyone have an idea why printouts 5 and 6 at the end both return false?


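Your mistake starts with trying to parse XML with regular expressions. Don't do that, ever. It doesn't work. Use an XML parser. There is [XmlTool](https://velocity.apache.org/tools/devel/apidocs/org/apache/velocity/tools/generic/XmlTool.html), which looks promising, and there is also a [section on using XML in the Developer's Guide](http://velocity.apache.org/engine/1.7/developer-guide.html#velocity-and-xml). – Tomalak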


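First of all, thanks for your input. I also came across XmlTool, but if I understand correctly, it would require access to the environment that provides the context. – Andreas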


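In my case the Velocity engine is embedded in an application, and I doubt that I can influence the context in any way. However, after further investigation I found this: "." may not match line terminators. To make "." match line terminators as well, you need to use the "embedded flag expression (?s)" to enable "DOTALL mode". As I have never come across these things before, I am still not entirely sure, but it seems to work. – Andreas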
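For readers who can change the host application, the parser-based approach from the first comment could look roughly like the sketch below on the Java side. It uses only the JDK's built-in DOM and XPath APIs. The class name `XmlLookupDemo`, the helper `firstText`, and the shortened sample document are made up for illustration. Note that a parser only accepts well-formed XML, so every opening tag needs its matching closing tag.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Velocity or of the original post.
class XmlLookupDemo {

    // Shortened, well-formed version of the configuration XML (assumption).
    static final String XML = String.join("\n",
        "<?xml version=\"1.0\"?>",
        "<ItemTypes>",
        "    <ItemType>",
        "        <Display>L1 Items</Display>",
        "    </ItemType>",
        "</ItemTypes>");

    // Returns the text of the first node selected by the XPath expression,
    // or an empty string if the expression selects nothing.
    static String firstText(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            // Malformed XML or an invalid XPath expression ends up here.
            throw new IllegalStateException("XML lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstText(XML, "/ItemTypes/ItemType/Display"));
    }
}
```

The result of such a lookup could then be put into the Velocity context by the application, which is exactly the step that is not possible when the context cannot be influenced.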

Answer

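I think I found the answer myself, although I am not entirely sure (happy to get any feedback). Below is my test example, extended by a few lines (with comments describing my findings):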

<html> 
    <body> 
## this example is intended to test searching regular expressions 

## let's start with a simple example: 

#set($simpleText = "This is the string where I will try to find a substring.") 
#set($searchStr = "string") 

1 $simpleText.matches($searchStr)<BR> ## this returns false as .matches() only returns true if the parameter $searchStr (could be regular expression) matches the ENTIRE string ($simpleText) 

#set($searchStr = ".*string.*")  ## .* at the beginning and the end of the search string means any character can be before and after the 'real' search string 

2 $simpleText.matches($searchStr)<BR> ## this returns true, so adding .* at the beginning and the end of the search string seems to work. 

## let's now move on to strings containing XML (as this is the real use case) 

#set($xmlText = '<?xml version="1.0"?> 
<ItemTypes> 
    <ItemType> 
     <Display>L1 Items</Display> 
     <Fields> 
      <FieldLabel>Project ID</FieldLabel> 
      <FieldLabel>Name</FieldLabel> 
      <FieldLabel>Description</FieldLabel> 
      <FieldLabel>Assigned</FieldLabel> 
     </Fields> 
    </ItemType> 
</ItemTypes>') 

3 $xmlText<BR>       ## when printing a string containing XML tags those tags will not be visible in the printout (probably because they are interpreted as kind of html tags...) 

#set($escapedXmlText = $escapeTool.xml($xmlText)) ## escapeTool will ensure that the tags will also be printed (visible) 

4 $escapedXmlText<BR>     ## this printout will also display the tags 

## let's now try to find the string 'Display' in xmlText the same way as we did in the simple example at the beginning: 

#set($searchStr = '.*Display.*') 

5 $xmlText.matches($searchStr)<BR>   ## returns false, obviously because . does not match "line terminators" (cf. https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) 
6 $escapedXmlText.matches($searchStr)<BR> ## also returns false 

## on https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#lt you can find the following info: 
## "The regular expression . matches any character except a line terminator unless the DOTALL flag is specified." 
## https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#DOTALL says: 
## "Dotall mode can also be enabled via the embedded flag expression (?s)." 
## and https://kodejava.org/how-do-i-write-embedded-flag-expression/ finally says that embedded flag expression are to be provided at the beginning of the regex. 
## So, let's now try (this time also including some special characters like '<', '>', '/'): 

#set($searchStr = '(?s).*<Display>L1 Items</Display>.*') 

7 $xmlText.matches($searchStr)<BR>   ## FINALLY RETURNS TRUE!! 
8 $escapedXmlText.matches($searchStr)<BR> ## still returns false, because in the escaped XML special characters like '<' have been replaced/escaped 

    </body> 
</html> 

So, starting the regular expression with (?s) seems to do the trick!
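To double-check this outside Velocity, here is a minimal Java sketch of the same behaviour. It assumes that $xmlText is a plain java.lang.String, so that the template call ends up in String.matches(). The class name `DotAllDemo`, the helpers `wholeMatch` and `containsRegex`, and the shortened sample XML are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original template.
class DotAllDemo {

    // Multi-line sample modelled on the XML used in the template (shortened).
    static final String XML = String.join("\n",
        "<?xml version=\"1.0\"?>",
        "<ItemTypes>",
        "    <ItemType>",
        "        <Display>L1 Items</Display>",
        "    </ItemType>",
        "</ItemTypes>");

    // String.matches() succeeds only if the regex matches the ENTIRE string.
    static boolean wholeMatch(String text, String regex) {
        return text.matches(regex);
    }

    // Matcher.find() looks for the regex anywhere inside the string,
    // so no leading/trailing .* and no (?s) flag are needed.
    static boolean containsRegex(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // "." stops at line terminators, so the whole-string match fails.
        System.out.println(wholeMatch(XML, ".*Display.*"));      // false
        // (?s) enables DOTALL mode: "." now also matches line terminators.
        System.out.println(wholeMatch(XML, "(?s).*Display.*"));  // true
        // For a literal search string no regex is needed at all.
        System.out.println(XML.contains("<Display>"));           // true
        // Regex search for a substring without wrapping it in .*
        System.out.println(containsRegex(XML, "<Display>L\\d Items</Display>")); // true
    }
}
```

Inside a template only the String methods are directly reachable, so for literal search strings $xmlText.contains($searchStr) is probably the simplest option, while real regular expressions need the (?s) prefix as shown above.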
