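XDocument.Validate does not catch all errors against the XSD

2017-06-02

I have a very strange problem validating an XML document against a valid XSD, using either C# XDocument.Validate or XmlReaderSettings with the required configuration. The problem: when the XML document contains errors, under certain conditions the validation process does not catch all of them, and I cannot find a pattern to this behavior.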

Here is my XSD:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
           targetNamespace="http://www.somesite.com/somefolder/messages"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Header">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="MessageId" type="xs:string" />
              <xs:element name="MessageSource" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="Body">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Abc001">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Abc002" type="xs:string" />
                    <xs:element name="Abc003" type="xs:string" minOccurs="0" />
                    <!--<xs:element name="Abc004" type="xs:string" />-->
                    <xs:element name="Abc004">
                      <xs:simpleType>
                        <xs:restriction base="xs:string">
                          <xs:maxLength value="200"/>
                        </xs:restriction>
                      </xs:simpleType>
                    </xs:element>
                    <xs:element name="Abc005">
                      <xs:complexType>
                        <xs:sequence>
                          <xs:element name="Abc006" type="xs:unsignedShort" />
                          <xs:element name="Abc007">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="Abc008" type="xs:string"/>
                                <xs:element name="Abc009" type="xs:string" minOccurs="0"/>
                                <xs:element name="Abc010" type="xs:string"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <xs:element name="Abc011" type="xs:date" />
                          <xs:element name="Abc012">
                            <xs:complexType>
                              <xs:sequence>
                                <xs:element name="Abc013" type="xs:string" />
                                <xs:element name="Abc014" type="xs:string" />
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                        </xs:sequence>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Here is the XML document being validated against this XSD:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <Abc002>dolor</Abc002>
      <Abc003>sit amet</Abc003>
      <Abc004>consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <Abc009>ad</Abc009>
          <Abc010>minim</Abc010>
        </Abc007>
        <Abc011>1982-10-17</Abc011>
        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>
    </Abc001>
  </Body>
</Message>

Now, when I introduce some validation errors into the XML and validate it against the XSD, it does find all of them. Here is the erroneous XML (I have marked where the errors were introduced):

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <Abc002>dolor</Abc002>
      <Abc003>sit amet</Abc003>

      <!--The value for Abc004 is increased beyond the allowed 200 characters-->
      <Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <ABC009>AD</ABC009>
          <!--<Abc010>minim</Abc010> Required element removed-->
        </Abc007>

        <!--Date format below is wrong-->
        <Abc011>1982-10-37</Abc011>

        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>

      <!--the element below is not allowed-->
      <Abc15>Not allowed</Abc15>
    </Abc001>
  </Body>
</Message>

Here is the XML I get back, showing all the errors:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
  <Result>false</Result>
  <Status>Failed</Status>
  <FaultCount>4</FaultCount>
  <Faults>
    <Fault>
      <FaultCode>ERR01</FaultCode>
      <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc004' element is invalid - The value 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' is invalid according to its datatype 'String' - The actual length is greater than the MaxLength value.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR02</FaultCode>
      <FaultMessage>The element 'Abc007' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'ABC009' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc009, Abc010' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR03</FaultCode>
      <FaultMessage>The 'http://www.somesite.com/somefolder/messages:Abc011' element is invalid - The value '1982-10-37' is invalid according to its datatype 'http://www.w3.org/2001/XMLSchema:date' - The string '1982-10-37' is not a valid Date value.</FaultMessage>
    </Fault>
    <Fault>
      <FaultCode>ERR04</FaultCode>
      <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc15' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
  </Faults>
</MessageResponse>

Here is the strange part. When I introduce one more error at the start of the "Abc001" element, while keeping all the other existing errors, the result is completely off. Here is the XML with the newly introduced error:

<?xml version="1.0" encoding="utf-8"?>
<Message xmlns="http://www.somesite.com/somefolder/messages">
  <Header>
    <MessageId>Lorem</MessageId>
    <MessageSource>Ipsum</MessageSource>
  </Header>
  <Body>
    <Abc001>
      <!--newly introduced error - removed the following element-->
      <!--<Abc002>dolor</Abc002>-->
      <Abc003>sit amet</Abc003>
      <!--The value for Abc004 is increased beyond the allowed 200 characters-->
      <Abc004>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</Abc004>
      <Abc005>
        <Abc006>1234</Abc006>
        <Abc007>
          <Abc008>Ut enim</Abc008>
          <ABC009>AD</ABC009>
          <!--<Abc010>minim</Abc010>-->
        </Abc007>
        <Abc011>1982-10-37</Abc011>
        <Abc012>
          <Abc013>veniam</Abc013>
          <Abc014>nostrud</Abc014>
        </Abc012>
      </Abc005>
      <!--the element below is not allowed-->
      <Abc15>Not allowed</Abc15>
    </Abc001>
  </Body>
</Message>

Finally, here is the validation result:

<MessageResponse xmlns="http://www.somesite.com/somefolder/messages">
  <Result>false</Result>
  <Status>Failed</Status>
  <FaultCount>1</FaultCount>
  <Faults>
    <Fault>
      <FaultCode>ERR01</FaultCode>
      <FaultMessage>The element 'Abc001' in namespace 'http://www.somesite.com/somefolder/messages' has invalid child element 'Abc003' in namespace 'http://www.somesite.com/somefolder/messages'. List of possible elements expected: 'Abc002' in namespace 'http://www.somesite.com/somefolder/messages'.</FaultMessage>
    </Fault>
  </Faults>
</MessageResponse>

Here is the C# code I am using to validate:

using System.Threading.Tasks;
using System.Web;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public async Task<IMIDPreValidationAckMessage> ValidateXmlMessage(XDocument doc)
{
    var result = new PreValidationAckMessage();
    result.Result = true;
    result.Status = "Succeeded";

    var xsd = HttpContext.Current.Server.MapPath("~/message01.xsd");

    try
    {
        var uri = new System.Uri(xsd);
        var localPath = uri.LocalPath;
        var docNameSpace = doc.Root.Name.Namespace.NamespaceName;

        // Load the schema under the namespace of the document's root element.
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(docNameSpace, localPath);

        XmlReaderSettings xrs = new XmlReaderSettings();
        xrs.ValidationType = ValidationType.Schema;
        xrs.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        xrs.Schemas = schemas;

        result.XSDNamespace = doc.Root.GetDefaultNamespace().NamespaceName;
        var errCode = 1;

        // Record every validation event as a numbered fault (ERR01, ERR02, ...).
        xrs.ValidationEventHandler += (s, e) =>
        {
            result.Result = false;
            result.Status = "Failed";
            result.FaultCount++;
            result.Faults.Add(new Fault
            {
                FaultCode = "ERR" + errCode++.ToString().PadLeft(2, '0'),
                FaultMessage = e.Message
            });
        };

        // Reading the document to the end is what drives the validation.
        using (XmlReader xr = XmlReader.Create(doc.CreateReader(), xrs))
        {
            while (xr.Read()) { }
        }
    }
    catch (System.Exception)
    {
        // Anything other than a validation event (e.g. the schema failing to load) ends up here.
        result.Result = false;
        result.Status = "Unknown Error";
    }
    return result;
}

Can anyone tell me what is wrong here?

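Include all the classes in the last snippet, so that it can be copy-pasted and run. – Evk

@Evk: Thanks for the quick reply. The code I did not post consists of nested classes, and they are really not relevant here. If you simply add the error messages to a list of strings inside the validation event handler, that should be enough for testing. My code only collects the error messages and builds another XML document from them. That is all. –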

Answer

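It seems that XmlReader stops validating an element's content at the first error it encounters in that element. Here is the description from the documentation of the old (obsolete) XmlValidatingReader.ValidationEventHandler: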

If an element reports a validation error, the rest of the content model for that element is not validated, however, its children are validated. The reader only reports the first error for a given element.

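It seems the regular XmlReader behaves the same way (although its documentation does not mention this explicitly).

In the first erroneous example, the errors are either in innermost elements (such as an invalid text value) or in the last child element, so all of them are reported and nothing is skipped. In the last example, however, you introduced an error at the very start of the Abc001 element, so the rest of Abc001's content was skipped, along with all the errors in it.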
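To experiment with this on something smaller, here is a minimal sketch. The schema, the element names R, A, B and C, and the ValidationDemo class are all made up for illustration and have nothing to do with your real message format. It uses XDocument.Validate and simply collects the messages in a list, as suggested in the comments:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class ValidationDemo
{
    // Hypothetical schema: R must contain A (string), B (int), C (date), in that order.
    private const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='R'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='A' type='xs:string' />
        <xs:element name='B' type='xs:int' />
        <xs:element name='C' type='xs:date' />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Validates the given XML text and returns every message the handler received.
    public static List<string> CollectErrors(string xml)
    {
        var schemas = new XmlSchemaSet();
        using (var reader = XmlReader.Create(new StringReader(Xsd)))
        {
            // null means: use the schema's own target namespace (none here).
            schemas.Add(null, reader);
        }

        var errors = new List<string>();
        XDocument.Parse(xml).Validate(schemas, (sender, e) => errors.Add(e.Message));
        return errors;
    }

    private static void Print(string xml)
    {
        var errors = CollectErrors(xml);
        Console.WriteLine(errors.Count + " error(s) for " + xml);
        foreach (var message in errors)
        {
            Console.WriteLine("  " + message);
        }
    }

    public static void Main()
    {
        // Invalid values in two separate leaf elements.
        Print("<R><A>a</A><B>x</B><C>bad</C></R>");

        // The same invalid values, but the required first child A is removed.
        Print("<R><B>x</B><C>bad</C></R>");
    }
}
```

Going by the behaviour described above, I would expect the first document to report both datatype errors (B and C), and the second to report only the content model error on R, because once R's content model fails the remaining children of R are no longer checked.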

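Thanks again for the quick reply. What you say makes sense, although I was under the impression that validation walks the entire element tree and reports all errors. I will wait a while to see whether any other feedback comes in. If I do not receive another explanation, I will mark yours as the accepted answer. Thank you. –

You can check this by introducing errors in different parts of the tree. In general, validation does not stop at the first error overall, only at the first error within a given subtree (element). In your last example, if you had multiple 'Abc001' elements, it would skip only the first one (because it has an error at the start) but would continue with the following ones. If you introduce the error after the '' element, it will analyze the contents of 'Abc005' up to that point. – Evk

I forgot to come back to this question and mark the answer. Sorry! Since no better feedback came in, I tested your suggestion and it looks like you are right. So I am marking this as the accepted answer. Again, sorry for the delay. –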