2016-07-11

Error parsing a C# regular expression

I am trying to write a C# regular expression that matches URLs of the following forms.

  • https://www.test.com/help/about/index.aspx?at=eng&st=png...
  • http://www.test.com/help/about/index.aspx?at=eng&st=png...
  • www.test.com/help/about/index.aspx?at=eng&st=png...
  • test.com/help/about/index.aspx?at=eng&st=png...

My regular expression is:

^(http(s)?(:\/\/))?(www\.)?[a-zA-Z0-9-_\.]+/([-a-zA-Z0-9:%_\+.~#?&//=]*) 

This works fine when I test it in an online C# regex tester, but when I put it into my code I get a parsing error.

Code:

public SSLUrl(XElement configurationEntry) 
{ 
    XAttribute xSsl = configurationEntry.Attribute("ssl"); 
    XAttribute xIgnore = configurationEntry.Attribute("ignore"); 

    mUseSSL = false; 

    if (xSsl != null) 
     bool.TryParse(xSsl.Value, out mUseSSL); 

    mIgnore = false; 

    if (xIgnore != null) 
     bool.TryParse(xIgnore.Value, out mIgnore); 

    mRegex = new Regex(HandleRootOperator(configurationEntry.Value), 
     RegexOptions.Compiled | RegexOptions.IgnoreCase); 
} 

Sample XML file:

<?xml version="1.0"?> 
<SSLSwitch> 
<!-- Redirect status code for HTTP and HTTPs--> 
    <http>301</http> 
    <https>301</https> 

    <!-- Do not change HTTP or HTTPS for anything under /system/ --> 
    <url ignore="true">^~/system/</url> 

    <!-- Do not change HTTP or HTTPS for anything in the root folder --> 
    <url ignore="true">^~/[^/]*\.</url> 

<url ignore="true">^(http(s)?(:\/\/))?(www\.)?[a-zA-Z0-9-_\.]+/([-a-zA-Z0-9:%_\+.?&//=]*)</url> 
</SSLSwitch> 

Error:

An error occurred while parsing EntityName. Line 45, position 85.

Description:

An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception details:

System.Xml.XmlException: An error occurred while parsing EntityName. Line 45, position 85.

Source error:

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

**Stack trace:**

[XmlException: An error occurred while parsing EntityName. Line 45, position 85.]
System.Xml.XmlTextReaderImpl.Throw(String res, Int32 lineNo, Int32 linePos) +189
System.Xml.XmlTextReaderImpl.HandleEntityReference(Boolean isInAttributeValue, EntityExpandType expandType, Int32& charRefEndPos) +7432563
System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars) +1042
System.Xml.XmlTextReaderImpl.FinishPartialValue() +79
System.Xml.XmlTextReaderImpl.get_Value() +72
System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r) +225
System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r, LoadOptions o) +75
System.Xml.Linq.XElement.ReadElementFrom(XmlReader r, LoadOptions o) +722
System.Xml.Linq.XElement.Load(XmlReader reader, LoadOptions options) +79
System.Xml.Linq.XElement.Load(String uri, LoadOptions options) +137
Handlers.SSLSwitch..cctor() +102

+3

Sharing the error you get would be a good starting point. Even better, show the code that raises the error.

+1

Please show your code and the error you are getting. – DeanOC

+0

Updated the question with the C# code sample and the sample XML file that the regular expression is read from. – ram

Answer

1

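The & in the regular expression is treated as the start of an XML entity reference, and the text that follows it cannot be parsed as an entity name, hence the error.

I suggest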

<url ignore="true"><![CDATA[^(https?://)?(www\.)?[\w.-]+/([-\w:%+.?&/=]*)]]></url> 
        ^-------------------------------------------------------^ 

Inside a CDATA section, characters such as & are treated as literal text instead of being parsed as entity references.
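To see the difference in isolation, here is a minimal sketch that parses three variants of a `<url>` element with `XElement.Parse`. The class name `CDataDemo`, the helper `TryReadUrl`, and the shortened pattern are made up for illustration and are not part of the original handler. The third variant escapes the ampersand as `&amp;`, which is the standard XML alternative to CDATA.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class CDataDemo
{
    // Hypothetical helper: returns true and the element text when the XML
    // is well-formed, false when the parser throws an XmlException.
    public static bool TryReadUrl(string xml, out string value)
    {
        try
        {
            value = XElement.Parse(xml).Value;
            return true;
        }
        catch (XmlException)
        {
            value = "";
            return false;
        }
    }

    public static void Main()
    {
        // Shortened, illustrative pattern that still contains an ampersand.
        const string raw = "<url ignore=\"true\">^[\\w.-]+/[?&=]*</url>";
        const string cdata = "<url ignore=\"true\"><![CDATA[^[\\w.-]+/[?&=]*]]></url>";
        const string escaped = "<url ignore=\"true\">^[\\w.-]+/[?&amp;=]*</url>";

        string value;

        // A raw '&' starts an entity reference, so parsing fails.
        Console.WriteLine(TryReadUrl(raw, out value));

        // CDATA and &amp; both parse, and both yield the pattern with a plain '&'.
        Console.WriteLine(TryReadUrl(cdata, out value) + " " + value);
        Console.WriteLine(TryReadUrl(escaped, out value) + " " + value);
    }
}
```

Either way, the string that reaches `new Regex(...)` contains a plain `&`, so the pattern itself does not need to change.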

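Note that \w is almost the same as [a-zA-Z0-9_]; by default it also matches Unicode letters and digits. If you add the RegexOptions.ECMAScript flag when compiling the regex object, it becomes equal to that character class.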
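A small sketch of that difference, using an accented letter as the test input. The class and method names are placeholders chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordClassDemo
{
    // Hypothetical helper: does the whole string consist of one \w character?
    public static bool IsWordChar(string s, RegexOptions options)
    {
        return Regex.IsMatch(s, @"^\w$", options);
    }

    public static void Main()
    {
        string accented = "\u00E9"; // the letter e with an acute accent

        // Default behavior: \w includes Unicode letters.
        Console.WriteLine(IsWordChar(accented, RegexOptions.None));       // True

        // ECMAScript behavior: \w is limited to [a-zA-Z0-9_].
        Console.WriteLine(IsWordChar(accented, RegexOptions.ECMAScript)); // False
        Console.WriteLine(IsWordChar("a", RegexOptions.ECMAScript));      // True
    }
}
```

For host names restricted to ASCII this distinction rarely matters, but it is worth knowing if the pattern must reject non-ASCII letters.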

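Also, the forward slash / does not have to be escaped, and at times should not be, because it has no special meaning in a .NET regex. In PHP or Perl it is often used as a regex delimiter that separates the action/pattern/modifiers. In .NET you use inline modifiers or RegexOptions flags to change the behavior of certain regex metacharacters, so / is not needed to delimit those parts. I also removed the unnecessary groups. I do not see why // was used in the last character class, so I replaced it with a single / (inside a character class, // still matches only one /). If you need to match a literal \, use \\ in the character class.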
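The points about slashes can be checked with a few one-line matches. This is only an illustrative sketch; `SlashDemo` and `Matches` are names invented here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlashDemo
{
    // Thin wrapper around Regex.IsMatch, kept only to make the calls short.
    public static bool Matches(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // An unescaped and an escaped slash behave the same in .NET.
        Console.WriteLine(Matches("a/b", @"^a/b$"));   // True
        Console.WriteLine(Matches("a/b", @"^a\/b$"));  // True

        // [//] is a character class containing '/', so it matches one character.
        Console.WriteLine(Matches("/", "^[//]$"));     // True
        Console.WriteLine(Matches("//", "^[//]$"));    // False

        // A literal backslash inside a character class is written as \\.
        Console.WriteLine(Matches(@"\", @"^[\\]$"));   // True
    }
}
```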