
XSD: XML files pass validation, but so do XSLs and XSDs

Source: https://www.devze.com, 2022-12-16 22:47 (reposted from the web)

We have an XSD schema that validates a data file. It declares the root element of the document, followed by the complexTypes that describe its structure. The schema has no target namespace, and document nodes are not supposed to be qualified with a namespace.

Recently someone mistakenly sent an XSL template in place of an XML data file. The XSL passed validation without any problem and was therefore handed to the XSLT processor. The result was basically the free-form text found in the validated XSL.

We then sent all sorts of XML documents to the validator (various XSD schemas and XSL templates, for example), and they all passed validation.

We tried different ways of validating (XPathNavigator.CheckValidity on an XPathDocument, and XmlDocument.Validate), with no difference.

What is happening here? Is our validation schema happy to pass any document whose root node is qualified with a namespace different from the one the schema describes? How do we prevent that?

EDIT

Validation code (version 1):

Dim data As XPathDocument
....
If Not data.CreateNavigator.CheckValidity(ValidationSchemaSet, AddressOf vh.ValidationHandler) Then
    result = "Validation failed." & ControlChars.NewLine & String.Join(ControlChars.NewLine, vh.Messages.ToArray)
    Return False
End If

where vh is an instance of:

Private Class VHandler
    Public Messages As New List(Of String)

    Public Sub ValidationHandler(ByVal sender As Object, ByVal e As ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            Messages.Add(e.Message)
        End If
    End Sub
End Class

XSD schema:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:include schemaLocation="CarrierLabel_Type_1.xsd" />
  <xs:include schemaLocation="CarrierLabel_Type_2.xsd" />
  <xs:include schemaLocation="CarrierLabel_Type_3.xsd" />

  <!-- Schema definition -->
  <xs:element name="PrintJob" type="printJobType" />


  <!-- Types declaration -->
  <xs:simpleType name="nonEmptyString">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="printJobType">
    <xs:sequence minOccurs="1" maxOccurs="unbounded">
      <xs:choice>
        <xs:element name="CarrierLabel_type_1" type="CarrierLabel_type_1" />
        <xs:element name="CarrierLabel_type_2" type="CarrierLabel_type_2" />
        <xs:element name="CarrierLabel_type_3" type="CarrierLabel_type_3" />
      </xs:choice>
    </xs:sequence>

    <xs:attribute name="printer" type="nonEmptyString" use="required" />
    <xs:attribute name="res" type="xs:positiveInteger" use="required" />
  </xs:complexType>

</xs:schema>

Should (and will) pass:

<?xml version='1.0' encoding='utf-8'?>
<PrintJob printer="printer_1" res="200">
  <CarrierLabel_type_1>
    <print_job_id>123456</print_job_id>
    <notes></notes>
    <labels_count>1</labels_count>
    <cases_indicator>2xCASE</cases_indicator>
  </CarrierLabel_type_1>
  <CarrierLabel_type_2>
    <next_location>Go there now!</next_location>
  </CarrierLabel_type_2>
</PrintJob>

Should not pass, but WILL PASS AS VALID DATA:

<?xml version='1.0' encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="WrongLabel">
    <xsl:param name="context"/>
    <xsl:param name="res"/>
    WRONG LABEL
  </xsl:template>

</xsl:stylesheet>


XML schemas really validate elements within a namespace, not documents. There's no XML Schema rule that says the top-level element of the instance document must be within a specific namespace. This fits in with the general idea that a namespace is its own little world, and it prevents me from writing a schema in my namespace that will invalidate documents in yours. If an element's not in my namespace, it's none of my business.

This means that when validating instance documents, you have to check that the top-level element of the document you're validating is in a namespace that your application accepts. In your application that is simply no namespace at all, since the schema has no target namespace.
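
A minimal sketch of such a pre-check, assuming the document is loaded into an XmlDocument and the schemas live in an XmlSchemaSet. The module and function names (RootCheck, RootIsDeclared) are made up for illustration. It looks the root element's qualified name up among the schema set's global elements, so PrintJob in no namespace would be accepted while xsl:stylesheet would not:

```vbnet
Imports System.Xml
Imports System.Xml.Schema

Module RootCheck
    ' Hypothetical helper: returns True only if the document's root element
    ' is declared as a global element somewhere in the schema set.
    Public Function RootIsDeclared(ByVal doc As XmlDocument, ByVal schemas As XmlSchemaSet) As Boolean
        Dim root As XmlElement = doc.DocumentElement
        If root Is Nothing Then Return False

        ' GlobalElements is only populated once the set has been compiled.
        If Not schemas.IsCompiled Then schemas.Compile()

        ' For an unqualified root, NamespaceURI is the empty string, which
        ' matches a schema without a target namespace.
        Dim rootName As New XmlQualifiedName(root.LocalName, root.NamespaceURI)
        Return schemas.GlobalElements.Contains(rootName)
    End Function
End Module
```

Calling this before the regular validation step rejects foreign documents early; it does not replace validation itself.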


Without having seen any code, I'm going to take a stab and suggest that it just may be because your validation is setting the ValidationType on the XmlReaderSettings object, but you're either not wiring up the ValidationEventHandler to check for validation errors or simply not doing anything with these validation events.

Even with XmlDocument.Validate, you need to wire up this ValidationEventHandler.

See MSDN here.


My understanding is that XML Schema (XSD) does not give any way of requiring that the root node of a document is a certain element. The only way to do that is to restrict the elements defined at "global level" to just one element. Is it possible that your validation code is importing the schema for XSLT, so that an XSLT document validates because the XSLT elements have been defined at global level?


Right.

It turned out that validation has three possible results, not two: valid, invalid and unknown. So the Boolean return value of the CheckValidity function is somewhat surprising.

If the root node of the document is not described by the schema, the document passes validation without errors, and no validation events occur, but the root node receives the "unknown" status (XmlSchemaValidity.NotKnown). This, for our purpose, is a fail. So we also need to check the XmlNode.SchemaInfo.Validity member of the root node.
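
Roughly, the extra check could look like the sketch below. It assumes an XmlDocument validated with XmlDocument.Validate; the names StrictValidation, ErrorCollector and IsStrictlyValid are invented for this example, and the collector simply mirrors the VHandler class from the question:

```vbnet
Imports System.Collections.Generic
Imports System.Xml
Imports System.Xml.Schema

Module StrictValidation
    ' Hypothetical collector, same idea as VHandler in the question.
    Private Class ErrorCollector
        Public Messages As New List(Of String)

        Public Sub Handler(ByVal sender As Object, ByVal e As ValidationEventArgs)
            If e.Severity = XmlSeverityType.Error Then
                Messages.Add(e.Message)
            End If
        End Sub
    End Class

    ' Treats "unknown" the same as "invalid": the document passes only if
    ' no errors were reported AND the root element was actually validated.
    Public Function IsStrictlyValid(ByVal doc As XmlDocument, ByVal schemas As XmlSchemaSet) As Boolean
        Dim collector As New ErrorCollector()

        ' XmlDocument.Validate uses the schemas attached to the document.
        doc.Schemas = schemas
        doc.Validate(AddressOf collector.Handler)

        If collector.Messages.Count > 0 Then Return False

        Dim root As XmlElement = doc.DocumentElement
        If root Is Nothing Then Return False

        ' NotKnown means the schema had nothing to say about this element.
        Return root.SchemaInfo.Validity = XmlSchemaValidity.Valid
    End Function
End Module
```

With this, the stylesheet from the question should be rejected, because its root element is never assessed against a declaration and so does not end up as Valid.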

I wish the documentation for the Validate() method were a bit clearer on that.
