XML - SCHEMA

Intro/SGML

 

XML is a simplified subset of a much older meta-language called SGML, which was itself the successor of IBM's still older GML. SGML uses the idea of a Document Type Definition, or DTD, to describe a 'vocabulary'. XML inherited the DTD, but it was found to be somewhat lacking: a DTD is not itself written in XML, has almost no notion of data types, and knows nothing about namespaces. Several alternatives were proposed, Microsoft's among them, and the W3C (the standards body for the Web) eventually published XML Schema, which is more flexible in these respects, as a Recommendation in 2001.

 
Schema

 

An XML Schema is optional. But when well-formedness alone is not enough, and the data has to meet certain specifications, a schema is created as a separate document, itself written in XML (and so subject to all the XML rules). The schema describes what is expected and allowed in any XML document that uses it. It's validation: it says that the t's are crossed here, that the date is formatted like so there, that this element can be a child of that, and so on. There's a lot to XML Schema, but at bottom it is a validation document for any number of XML documents that are supposed to conform to it.
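As a rough sketch of the kind of rule meant here (the schema syntax itself is explained further down), the following hypothetical schema says that an event element must contain exactly one when element, and that when must hold a date written as YYYY-MM-DD. The element names event and when are invented for illustration and are not part of the other examples on this page.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: "event" and "when" are made-up names. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="event">
      <xs:complexType>
         <xs:sequence>
            <!-- xs:date accepts values such as 2004-06-30 -->
            <xs:element name="when" type="xs:date" />
         </xs:sequence>
      </xs:complexType>
   </xs:element>

</xs:schema>
```

Against this schema, <event><when>2004-06-30</when></event> should validate, while <event><when>30 June 2004</when></event> should not, because the text is not in the xs:date format.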

The schema file typically carries the .xsd suffix. It won't be recognized by any Microsoft msxml prior to 4.0. An XML document refers to it like this:

<?xml version="1.0"?>
 
<root xmlns="namespace1"
       xmlns:xs1="http://www.w3.org/2001/XMLSchema-instance"
       xs1:schemaLocation="namespace1 x.xsd">
   <!-- document content goes here -->
</root>

 

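A default namespace is declared (xmlns="namespace1") in the root element (which doesn't have to be called "root"). The schemaLocation attribute holds a whitespace-separated pair: first that same namespace name, then the location of the schema that describes it. schemaLocation itself belongs to the http://www.w3.org/2001/XMLSchema-instance namespace, which is why it carries the xs1 prefix, declared with the specific URI string shown (by convention this prefix is usually written xsi, but any prefix works). So if the separate schema document, the .xsd file, is called x.xsd, then a validating parser will expect any XML document referring to the schema in this way to be valid according to that schema. Strictly speaking, schemaLocation is only a hint; a parser may be told to use a different schema, or not to validate at all. If you don't want to use a namespace, as here, there's an alternative: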

If using noNamespaceSchemaLocation, you would drop the default namespace declaration and, in place of xs1:schemaLocation, simply put:
 
   xs1:noNamespaceSchemaLocation="x.xsd"
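Put together, a complete document using this alternative might look like the following sketch. It assumes the same file name, x.xsd, and the same xs1 prefix as above; the comment stands in for whatever content the schema describes.

```xml
<?xml version="1.0"?>
<!-- Sketch of a document with no namespace of its own. -->
<root xmlns:xs1="http://www.w3.org/2001/XMLSchema-instance"
      xs1:noNamespaceSchemaLocation="x.xsd">
   <!-- content described by x.xsd goes here -->
</root>
```

The xs1 prefix still has to be declared, since the attribute belongs to the XMLSchema-instance namespace. A schema used this way would itself have no targetNamespace attribute.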

 

So, what might the schema, the x.xsd file, look like? The bare bones of the schema might start with:

<?xml version="1.0"?>
 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns="nspace 1"
     targetNamespace="nspace 1"
     elementFormDefault="qualified">
 
</xs:schema>

 

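So the schema, being an XML document, starts out with the standard XML declaration. The root element is schema, from the http://www.w3.org/2001/XMLSchema namespace; by custom that namespace is bound to the prefix xs (or xsd), giving xs:schema. The targetNamespace attribute names the namespace that the schema describes, here called "nspace 1". The xmlns attribute declares the same name as the non-prefixed default namespace, so that the schema can refer to its own declarations without a prefix. Setting elementFormDefault to "qualified", which again is customary, means that elements declared locally in the schema (inside a complex type, for instance) must be namespace-qualified when they appear in an XML document, just as top-level elements always are. It does not require an explicit prefix; a default namespace declaration qualifies them just as well. Note that the targetNamespace has to be the same string as the namespace used in the XML document, so the document shown earlier, which used "namespace1", would need a schema whose targetNamespace is "namespace1". A name with a space in it, like "nspace 1", is also best avoided in practice, since schemaLocation is a whitespace-separated list.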

What about the schema itself? With a few declarations added, it might look like this:

<?xml version="1.0"?>
 
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns="nspace 1"
     targetNamespace="nspace 1"
     elementFormDefault="qualified">
 
<xs:element name="person" type="xs:string" />
<xs:element name="data" type="PersonData" />
 
<!-- showing complex type -->
<xs:complexType name="PersonData">
   <xs:sequence>
      <xs:element name="occupation" type="xs:token" maxOccurs="unbounded" />
   </xs:sequence>
</xs:complexType>
 
</xs:schema>

 

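Each element declaration carries a type attribute. In the first instance it is xs:string, a built-in simple type: person has to contain a string, and no child elements, in order to validate. A "complex type", PersonData, is set as the type for data. PersonData includes its own element, occupation, of the built-in type xs:token (a string with its whitespace normalized). It also shows something called sequence, which indicates that the child elements in the XML document must appear in the order in which they are listed in the schema. sequence is one of various indicators: sequence, choice and all, which are written as XML elements, are the 'compositors', while others are written as attributes. maxOccurs is one of the latter. Here maxOccurs="unbounded" lets occupation repeat any number of times, and since minOccurs defaults to 1, at least one occupation is required.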
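For illustration, here is a sketch of a document that should validate against the schema above. The occupation values are made up, and the schemaLocation hint is left out on the assumption that the schema is handed to the validator directly.

```xml
<?xml version="1.0"?>
<!-- Sketch of an instance document for the "data" element. -->
<!-- The default namespace must equal the schema's targetNamespace. -->
<data xmlns="nspace 1">
   <!-- at least one occupation is required; more are allowed -->
   <occupation>editor</occupation>
   <occupation>translator</occupation>
</data>
```

Because the schema sets elementFormDefault to "qualified", the occupation elements have to be in the "nspace 1" namespace, which they inherit here from the default namespace declared on data. An empty <data xmlns="nspace 1"/> should fail validation, since it lacks the required occupation. A document whose root is <person xmlns="nspace 1">Jane Doe</person> should validate as well, since person is also declared at the top level of the schema.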

There are many more aspects to .xsd schemas, and they can easily become overly complicated. One of the links below addresses this and possible remedies.


 
 More to read:

ZVON: Clickable index for XML Schema
O'Reilly XML.com: Intro to XML Schema
W3 Schools: W3schools article on XML Schema
TopXML: XSD Schema tricks and tips
XML for ASP.NET: Simple online .NET Schema validator
MSDN: MSDN online XML Schema reference
Microsoft Help: Microsoft's original compiled help file for msxml 4.0, with Schema reference (click: download the SDK)
Microsoft Article: MSDN article on how to avoid overly complex schema
(Robin) Cover Pages: Cover's XML Schema links
W3C XML Schema: WWW Consortium XML Schema Primer