2011-01-08 30 views
3

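dateTime complains about whiteSpace in XSD validation (lxml)

I want to validate a document against an XSD, and lxml complains about whitespace in a dateTime value (even though it should collapse it). I'm not sure whether this is broken behavior or whether I've specified something wrong in the XSD. I've spent an hour trying to debug this, so I'm hoping someone else has run into similar behavior before.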

====================================================================== 
ERROR [0.076s]: test_exports (disqus.importer.tests.tests.SchemaValidation) 
---------------------------------------------------------------------- 
Traceback (most recent call last): 
    File "/Users/dcramer/Development/disqus/disqus/importer/tests/tests.py", line 1098, in test_exports 
    xsd.assertValid(export) 
    File "lxml.etree.pyx", line 2659, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:99498) 
DocumentInvalid: Element '{http://disqus.com}createdAt': ' 
     2008-06-10T01:32:08 
    ' is not a valid value of the atomic type 'xs:dateTime'., line 8 

Sample XML:

<?xml version="1.0" encoding="utf-8"?> 
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd"> 
    <post dsq:id="1"> 
    <id /> 
    <message> 
     <![CDATA["We want happy paintings. Happy paintings. If you want sad things, watch the news."]]> 
    </message> 
    <createdAt> 
     2008-06-10T01:32:08 
    </createdAt> 
    <author> 
     <email> 
     [email protected] 
     </email> 
     <name> 
     bobross 
     </name> 
     <isAnonymous> 
     true 
     </isAnonymous> 
     <username> 
     bobross 
     </username> 
    </author> 
    <ipAddress> 
     127.0.0.1 
    </ipAddress> 
    <thread dsq:id="1"/> 
    </post> 
</disqus> 

disqus.xsd:

<?xml version="1.0"?> 
<xs:schema targetNamespace="http://disqus.com" 
      xmlns:xs="http://www.w3.org/2001/XMLSchema" 
      xmlns:dsq="http://disqus.com/disqus-internals" 
      xmlns="http://disqus.com" 
      elementFormDefault="qualified" 
> 
    <!-- import the dsq namespace --> 
    <xs:import namespace="http://disqus.com/disqus-internals" 
      schemaLocation="internals.xsd"/> 

    <!-- misc types --> 
    <xs:simpleType name="identifier"> 
    <xs:restriction base="xs:string"> 
     <xs:maxLength value="200"/> 
    </xs:restriction> 
    </xs:simpleType> 

    <!-- root disqus element --> 
    <xs:element name="disqus"> 
    <xs:complexType> 
     <xs:sequence> 
     <xs:element name="category" type="category" minOccurs="0" maxOccurs="unbounded"/> 
     <xs:element name="thread" type="thread" minOccurs="0" maxOccurs="unbounded"/> 
     <xs:element name="post" type="post" minOccurs="0" maxOccurs="unbounded"/> 
     </xs:sequence> 
    </xs:complexType> 
    </xs:element> 

    <!-- category element --> 
    <xs:complexType name="category"> 
    <xs:all minOccurs="0"> 
     <xs:element name="forum" type="xs:string"> 
     <xs:unique name="categoryID"> 
      <xs:selector xpath="category"/> 
      <xs:field xpath="@title"/> 
     </xs:unique> 
     </xs:element> 
     <xs:element name="title" type="xs:string"/> 
    </xs:all> 
    <xs:attribute ref="dsq:id"/> 
    </xs:complexType> 

    <!-- thread element --> 
    <xs:complexType name="thread"> 
    <xs:all minOccurs="0"> 
     <xs:element name="id" type="identifier" minOccurs="0"> 
     <xs:unique name="threadID"> 
      <xs:selector xpath="thread"/> 
      <xs:field xpath="@id"/> 
     </xs:unique> 
     </xs:element> 
     <xs:element name="forum" type="xs:string"/> 
     <xs:element name="category"> 
     <xs:complexType> 
      <xs:simpleContent> 
      <xs:extension base="xs:string"> 
       <xs:attribute ref="dsq:id"/> 
      </xs:extension> 
      </xs:simpleContent> 
     </xs:complexType> 
     </xs:element> 
     <xs:element name="link" type="xs:anyURI"/> 
     <xs:element name="title" type="xs:string"/> 
     <xs:element name="message" type="xs:string" minOccurs="0"/> 
     <xs:element name="author" type="author" minOccurs="0"/> 
     <xs:element name="createdAt" type="xs:dateTime"/> 
     <xs:element name="isClosed" type="xs:boolean" default="false" minOccurs="0"/> 
     <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/> 
    </xs:all> 
    <xs:attribute ref="dsq:id"/> 
    </xs:complexType> 

    <!-- post element --> 
    <xs:complexType name="post"> 
    <xs:all minOccurs="0"> 
     <xs:element name="id" type="identifier" minOccurs="0"> 
     <xs:unique name="postID"> 
      <xs:selector xpath="post"/> 
      <xs:field xpath="@id"/> 
     </xs:unique> 
     </xs:element> 
     <xs:element name="parent" minOccurs="0"> 
     <xs:complexType> 
      <xs:simpleContent> 
      <xs:extension base="identifier"> 
       <xs:attribute ref="dsq:id"/> 
      </xs:extension> 
      </xs:simpleContent> 
     </xs:complexType> 
     </xs:element> 
     <xs:element name="thread"> 
     <xs:complexType> 
      <xs:simpleContent> 
      <xs:extension base="identifier"> 
       <xs:attribute ref="dsq:id"/> 
      </xs:extension> 
      </xs:simpleContent> 
     </xs:complexType> 
     </xs:element> 
     <xs:element name="author" type="author" minOccurs="0"/> 
     <xs:element name="message" type="xs:string"/> 
     <xs:element name="ipAddress" type="xs:string" minOccurs="0"/> 
     <xs:element name="createdAt" type="xs:dateTime"/> 

     <!-- post boolean states --> 
     <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/> 
     <xs:element name="isApproved" type="xs:boolean" default="true" minOccurs="0"/> 
     <xs:element name="isFlagged" type="xs:boolean" default="false" minOccurs="0"/> 
     <xs:element name="isSpam" type="xs:boolean" default="false" minOccurs="0"/> 
     <xs:element name="isHighlighted" type="xs:boolean" default="false" minOccurs="0"/> 
    </xs:all> 
    <xs:attribute ref="dsq:id"/> 
    </xs:complexType> 

    <!-- author element --> 
    <xs:complexType name="author"> 
    <xs:all minOccurs="0"> 
     <xs:element name="name" type="xs:string"/> 
     <xs:element name="email" type="xs:string"/> 
     <xs:element name="link" type="xs:anyURI" minOccurs="0"/> 
     <xs:element name="username" type="xs:string" minOccurs="0"/> 
     <xs:element name="isAnonymous" type="xs:boolean" default="true" minOccurs="0"/> 
    </xs:all> 
    <xs:attribute ref="dsq:id"/> 
    </xs:complexType> 
</xs:schema> 

Answer

1

It looks like the whitespace is causing the problem. Can you remove the leading and trailing whitespace in createdAt, so that it becomes

<createdAt>2008-06-10T01:32:08</createdAt> 

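and see what happens? If that fixes it and you are the one generating the XML, change the XML generation so that it emits no whitespace. Otherwise, if you control the schema, try setting the xs:whiteSpace facet to "collapse" and see whether that fixes it.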
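If you generate the XML yourself, one way to try this is to trim the text of leaf elements before validating. The following is only a minimal sketch using the standard library; the helper name `strip_leaf_text` and the shortened sample document are assumptions made for illustration, not part of the original code.

```python
import xml.etree.ElementTree as ET


def strip_leaf_text(root):
    """Trim leading/trailing whitespace from the text of elements without children.

    Hypothetical helper: elements that contain child elements are left alone,
    so only values such as <createdAt> are touched.
    """
    for elem in root.iter():
        if len(elem) == 0 and elem.text is not None:
            elem.text = elem.text.strip()
    return root


# Shortened version of the sample document from the question.
SAMPLE = """<post xmlns="http://disqus.com">
    <createdAt>
     2008-06-10T01:32:08
    </createdAt>
</post>"""

root = strip_leaf_text(ET.fromstring(SAMPLE))
created = root.find("{http://disqus.com}createdAt")
print(repr(created.text))
```

Note that this also trims xs:string values such as message, where surrounding whitespace is significant to the schema, so you may want to restrict it to the elements that actually fail.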

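Another possibility is that it needs a time zone. The value should match [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm], so the time zone is optional, but you could try adding a 'Z' to see whether that fixes things. That is what this post suggests.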
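To check a value against that lexical form outside the validator, a regular expression is enough for a rough test. This is a simplified sketch: the names `DATETIME_RE` and `looks_like_datetime` are made up for illustration, it also accepts the optional fractional seconds that xs:dateTime allows, and it only checks the shape of the string, not whether the month, day, or hour are in range.

```python
import re

# Rough shape of an xs:dateTime lexical value (no range checks).
DATETIME_RE = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}"         # [-]CCYY-MM-DD
    r"T\d{2}:\d{2}:\d{2}(\.\d+)?"   # Thh:mm:ss, optional fractional seconds
    r"(Z|[+-]\d{2}:\d{2})?"         # optional time zone
)


def looks_like_datetime(value):
    """Return True if the whole string has the dateTime lexical shape."""
    return DATETIME_RE.fullmatch(value) is not None


print(looks_like_datetime("2008-06-10T01:32:08"))        # no time zone
print(looks_like_datetime("2008-06-10T01:32:08Z"))       # UTC marker
print(looks_like_datetime("\n 2008-06-10T01:32:08\n"))   # padded, as in the question
```

Because `fullmatch` is applied to the raw string, the padded value is rejected while the same value with or without a 'Z' is accepted, which suggests the missing time zone is not the culprit here.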

+0

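The whitespace is exactly what is causing it (going by the description). By default, the dateTime type as defined by XSD has whiteSpace set to collapse (and that is a fixed value which cannot be changed). – 2011-01-08 20:05:15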
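For reference, the collapse rule mentioned in this comment can be written out in a few lines. This is a sketch of the normalization as XSD describes it (tabs, line feeds, and carriage returns become spaces, runs of spaces shrink to one, and leading and trailing spaces are dropped); the function name `collapse` is an assumption, and the code says nothing about what lxml does internally.

```python
import re


def collapse(value):
    """Apply XSD-style whiteSpace="collapse" normalization to a string."""
    # "replace" step: tab, line feed, and carriage return become a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # "collapse" step: squeeze runs of spaces and trim both ends.
    return re.sub(r" +", " ", replaced).strip(" ")


padded = "\n     2008-06-10T01:32:08 \n    "
print(repr(collapse(padded)))
```

After this normalization the padded value from the question is the plain timestamp, which is the form a conforming validator should be checking against xs:dateTime.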