dateTime complaining about whiteSpace in XSD validation (lxml)

Developer https://www.devze.com 2023-02-03 08:11 Source: Internet
I'm trying to validate a document against an XSD, and lxml is complaining about whitespace in a dateTime value (though it should be collapsing it). I'm not sure whether this is broken behavior or whether I'm specifying something wrong in the XSD. I've spent an hour trying to debug this, so I'm hoping someone else has seen similar behavior before.

======================================================================
ERROR [0.076s]: test_exports (disqus.importer.tests.tests.SchemaValidation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dcramer/Development/disqus/disqus/importer/tests/tests.py", line 1098, in test_exports
    xsd.assertValid(export)
  File "lxml.etree.pyx", line 2659, in lxml.etree._Validator.assertValid (src/lxml/lxml.etree.c:99498)
DocumentInvalid: Element '{http://disqus.com}createdAt': '
      2008-06-10T01:32:08
    ' is not a valid value of the atomic type 'xs:dateTime'., line 8

Sample XML:

<?xml version="1.0" encoding="utf-8"?>
<disqus xmlns="http://disqus.com" xmlns:dsq="http://disqus.com/disqus-internals" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://disqus.com/api/schemas/1.0/disqus.xsd http://disqus.com/api/schemas/1.0/disqus-internals.xsd">
  <post dsq:id="1">
    <id />
    <message>
      <![CDATA["We want happy paintings. Happy paintings. If you want sad things, watch the news."]]>
    </message>
    <createdAt>
      2008-06-10T01:32:08
    </createdAt>
    <author>
      <email>
        bob@ross.com
      </email>
      <name>
        bobross
      </name>
      <isAnonymous>
        true
      </isAnonymous>
      <username>
        bobross
      </username>
    </author>
    <ipAddress>
      127.0.0.1
    </ipAddress>
    <thread dsq:id="1"/>
  </post>
</disqus>

disqus.xsd:

<?xml version="1.0"?>
<xs:schema targetNamespace="http://disqus.com"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dsq="http://disqus.com/disqus-internals"
           xmlns="http://disqus.com"
           elementFormDefault="qualified"
>
  <!-- import the dsq namespace -->
  <xs:import namespace="http://disqus.com/disqus-internals"
             schemaLocation="internals.xsd"/>

  <!-- misc types -->
  <xs:simpleType name="identifier">
    <xs:restriction base="xs:string">
      <xs:maxLength value="200"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- root disqus element -->
  <xs:element name="disqus">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="category" type="category" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="thread" type="thread" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="post" type="post" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- category element -->
  <xs:complexType name="category">
    <xs:all minOccurs="0">
      <xs:element name="forum" type="xs:string">
        <xs:unique name="categoryID">
          <xs:selector xpath="category"/>
          <xs:field xpath="@title"/>
        </xs:unique>
      </xs:element>
      <xs:element name="title" type="xs:string"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- thread element -->
  <xs:complexType name="thread">
    <xs:all minOccurs="0">
      <xs:element name="id" type="identifier" minOccurs="0">
        <xs:unique name="threadID">
          <xs:selector xpath="thread"/>
          <xs:field xpath="@id"/>
        </xs:unique>
      </xs:element>
      <xs:element name="forum" type="xs:string"/>
      <xs:element name="category">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="link" type="xs:anyURI"/>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="message" type="xs:string" minOccurs="0"/>
      <xs:element name="author" type="author" minOccurs="0"/>
      <xs:element name="createdAt" type="xs:dateTime"/>
      <xs:element name="isClosed" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- post element -->
  <xs:complexType name="post">
    <xs:all minOccurs="0">
      <xs:element name="id" type="identifier" minOccurs="0">
        <xs:unique name="postID">
          <xs:selector xpath="post"/>
          <xs:field xpath="@id"/>
        </xs:unique>
      </xs:element>
      <xs:element name="parent" minOccurs="0">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="identifier">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="thread">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="identifier">
              <xs:attribute ref="dsq:id"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="author" type="author" minOccurs="0"/>
      <xs:element name="message" type="xs:string"/>
      <xs:element name="ipAddress" type="xs:string" minOccurs="0"/>
      <xs:element name="createdAt" type="xs:dateTime"/>

      <!-- post boolean states -->
      <xs:element name="isDeleted" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isApproved" type="xs:boolean" default="true" minOccurs="0"/>
      <xs:element name="isFlagged" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isSpam" type="xs:boolean" default="false" minOccurs="0"/>
      <xs:element name="isHighlighted" type="xs:boolean" default="false" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>

  <!-- author element -->
  <xs:complexType name="author">
    <xs:all minOccurs="0">
      <xs:element name="name" type="xs:string"/>
      <xs:element name="email" type="xs:string"/>
      <xs:element name="link" type="xs:anyURI" minOccurs="0"/>
      <xs:element name="username" type="xs:string" minOccurs="0"/>
      <xs:element name="isAnonymous" type="xs:boolean" default="true" minOccurs="0"/>
    </xs:all>
    <xs:attribute ref="dsq:id"/>
  </xs:complexType>
</xs:schema>


It looks like the whitespace is causing the problem. Can you remove the leading and trailing whitespace from createdAt so it becomes

<createdAt>2008-06-10T01:32:08</createdAt>

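and see what happens? If that solves it and you generate the XML yourself, change the generation so it doesn't emit the whitespace. Otherwise, if you are in charge of the schema, note that the whiteSpace facet of xs:dateTime is already fixed at "collapse" by the XSD specification, so there is nothing to set in the schema; a validator that rejects the padded value is not collapsing it the way the specification requires.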
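
If the XML comes from a generator you can't easily change, one workaround is to trim the text of leaf elements before handing the tree to the validator. This is only a minimal sketch: it uses the standard library's xml.etree.ElementTree (lxml.etree offers the same iter() and .text interface), strip_leaf_text is a hypothetical helper name, and it assumes no leaf element carries leading or trailing whitespace that matters.

```python
import xml.etree.ElementTree as ET


def strip_leaf_text(root):
    """Trim leading/trailing whitespace from the text of elements without children."""
    for elem in root.iter():
        # len(elem) is the number of child elements; only leaves are touched
        if len(elem) == 0 and elem.text is not None:
            elem.text = elem.text.strip()
    return root


# Cut-down version of the sample document (namespaces omitted for brevity)
sample = """<post>
  <createdAt>
    2008-06-10T01:32:08
  </createdAt>
</post>"""

root = strip_leaf_text(ET.fromstring(sample))
created_at = root.find("createdAt").text
print(repr(created_at))
```

Keep in mind that this also trims xs:string elements such as message, where whitespace is preserved by default, so it changes those values too.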

The other possibility is that it may need the timezone. The value is supposed to match [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm], so the timezone is optional, but try appending a 'Z' to see if that fixes things. That's what this post suggests.
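
To see which of the two explanations applies, you can check the value by hand. The following is a rough sketch, not what lxml does internally: collapse and looks_like_datetime are hypothetical helpers, and the regular expression only checks the shape of the value (it does not reject a month of 13, for example).

```python
import re

# Shape of an xs:dateTime value: optional sign, date, 'T', time,
# optional fractional seconds, optional timezone.
DATETIME_RE = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def collapse(value):
    """Mimic whiteSpace="collapse": tabs and line breaks become spaces,
    runs of spaces shrink to one, leading/trailing spaces are dropped."""
    replaced = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", replaced).strip(" ")


def looks_like_datetime(value):
    """Return True if the collapsed value has the dateTime shape."""
    return DATETIME_RE.fullmatch(collapse(value)) is not None


# The value exactly as it appears in the error message
raw = "\n      2008-06-10T01:32:08\n    "
print(looks_like_datetime(raw))
print(looks_like_datetime("2008-06-10T01:32:08Z"))
print(looks_like_datetime("2008-06-10 01:32:08"))
```

The padded value from the traceback passes once it is collapsed, with or without a 'Z', while a value with a space in place of the 'T' does not. That suggests the missing timezone is not the issue here.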
