XML is a simplified subset of a
much older meta-language called SGML, which itself
was the successor to the still older GML.
SGML uses a Document Type Definition, or DTD, to describe
a 'vocabulary'.
For XML, however, DTDs were found to be somewhat lacking: they are not written in XML syntax and offer very little data typing.
Microsoft, among others, proposed a schema language as an alternative, more flexible in some ways,
and the W3C (the international standards
body for the World Wide Web) went on to publish XML Schema as a Recommendation in 2001.
An XML Schema is not required.
But when the default behavior is insufficient, and
one needs to specify constraints on the data,
a schema is created as a separate document, itself written in XML (and so following all
XML rules).
The schema describes what is expected and allowed in any XML document that uses that
schema. It's validation: it says that the t's are crossed here, that the date
is formatted like so there, that this element can be a child of that one, and so on.
There's a lot to XML Schema,
but in essence it's a validation document for any number of XML documents that are supposed
to conform to it.
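As a rough illustration of the kinds of rules meant here, the sketch below constrains both the order of child elements and the format of a date. The element names event, title and when are hypothetical, chosen only for this example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical names: "event", "title" and "when" are for illustration only -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="event">
    <xs:complexType>
      <xs:sequence>
        <!-- "title" must come first and holds a string -->
        <xs:element name="title" type="xs:string" />
        <!-- "when" must follow and be a date in YYYY-MM-DD form, e.g. 2001-05-02 -->
        <xs:element name="when" type="xs:date" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A document whose when element held "May 2nd" or that listed when before title would fail validation against this sketch.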
The schema file typically has the .xsd suffix.
It won't be recognized by any version of Microsoft's MSXML prior to 4.0.
An XML document refers to it like this:
<?xml version="1.0"?>
<root xmlns="namespace1"
      xmlns:xs1="http://www.w3.org/2001/XMLSchema-instance"
      xs1:schemaLocation="namespace1 x.xsd">
  <!-- document content goes here -->
</root>
A default namespace is declared (xmlns="namespace1")
on the root element (which doesn't have to
be called "root"), and the same name forms the first half of
the schemaLocation value; the second half is the location of the schema file for that namespace.
The schemaLocation attribute itself belongs to the xs1 namespace, which is declared with the
specific URI string shown (the prefix is arbitrary; xsi is the conventional choice). So if
the separate schema document, the .xsd file, is called x.xsd,
then any XML document referring to the schema in this way is expected
to be valid according to that schema.
If you don't want to use a namespace, as is done here, there's
an alternative: the noNamespaceSchemaLocation attribute.
You would leave out the default namespace declaration and simply put: xs1:noNamespaceSchemaLocation="x.xsd"
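A complete document using this alternative might look like the sketch below. It assumes that x.xsd declares no targetNamespace, since the elements here belong to no namespace.

```xml
<?xml version="1.0"?>
<!-- No default namespace is declared, so root and its children are in no namespace -->
<root xmlns:xs1="http://www.w3.org/2001/XMLSchema-instance"
      xs1:noNamespaceSchemaLocation="x.xsd">
  <!-- content to be validated against x.xsd (assumed to have no targetNamespace) -->
</root>
```

Note that the attribute value is just the schema location, not a namespace and location pair.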
So, what might the schema, the x.xsd file, look like?
The bare bones of the schema might start with:
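<?xml version="1.0"?>
<!-- targetNamespace must equal the namespace the instance document uses ("namespace1" earlier);
     a name containing a space would also break the whitespace-separated schemaLocation pair -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="namespace1"
           targetNamespace="namespace1"
           elementFormDefault="qualified">
</xs:schema>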
So the schema, being an XML document, starts out with the standard XML declaration.
The root element is xs:schema. Its xmlns attributes
declare two namespaces: an unprefixed default, which is the namespace the schema defines
(and which is repeated in the targetNamespace attribute), and the http://www.w3.org/2001/XMLSchema
namespace, bound by custom to the prefix xs.
For the instance document shown earlier to validate, targetNamespace must be the same string as the namespace that document uses.
The elementFormDefault attribute, if set to "qualified", means that elements declared locally
in the schema (such as the children inside a complex type) must be namespace-qualified in the instance document,
whether through a prefix or a default namespace; this setting is customary.
What about the contents of the schema itself?
<?xml version="1.0"?>
<!-- the namespace name matches the one the instance document uses ("namespace1") -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="namespace1"
           targetNamespace="namespace1"
           elementFormDefault="qualified">
  <!-- simple element: its content is a string -->
  <xs:element name="person" type="xs:string" />
  <!-- element whose content is described by the complex type below -->
  <xs:element name="data" type="PersonData" />
  <!-- showing complex type -->
  <xs:complexType name="PersonData">
    <xs:sequence>
      <xs:element name="occupation" type="xs:token" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:schema>
A simple element declaration carries a type attribute.
In the first instance it is xs:string: the content of person has to be a string in order to validate.
A "complex type" is set as the type for data.
PersonData declares its own child element, occupation.
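A document that should validate against this schema might look like the following sketch. It assumes the schema is saved as x.xsd and that its targetNamespace is the same name, namespace1, that the document declares (a name without spaces, since the schemaLocation value is a whitespace-separated pair). The occupation values are invented for illustration.

```xml
<?xml version="1.0"?>
<!-- Assumes the schema's targetNamespace is "namespace1" and the file is x.xsd -->
<data xmlns="namespace1"
      xmlns:xs1="http://www.w3.org/2001/XMLSchema-instance"
      xs1:schemaLocation="namespace1 x.xsd">
  <!-- occupation may repeat because the schema sets maxOccurs="unbounded" -->
  <occupation>carpenter</occupation>
  <occupation>part-time musician</occupation>
</data>
```

Because person is also declared at the top level of the schema, a document whose root element is person (containing only text) would be accepted as well.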
It also shows something called sequence, which indicates that
the child elements in the XML document must appear in the order shown in the schema.
sequence is one of various indicators.
Some of them (the order indicators, or 'compositors': sequence, choice and all) are written as XML elements, and some
(the occurrence indicators) are written as attributes.
maxOccurs is one of the latter; here, "unbounded" lets occupation repeat any number of times.
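To show the indicators working together, here is a hypothetical variation on PersonData. The child elements name, email and phone are invented for this sketch, and the namespace name namespace1 is assumed to match the instance document.

```xml
<?xml version="1.0"?>
<!-- Hypothetical variation: "name", "email" and "phone" are not part of the original schema -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="namespace1"
           targetNamespace="namespace1"
           elementFormDefault="qualified">
  <xs:element name="data" type="PersonData" />
  <xs:complexType name="PersonData">
    <xs:sequence>
      <!-- exactly one name, and it must come first (minOccurs and maxOccurs both default to 1) -->
      <xs:element name="name" type="xs:string" />
      <!-- zero or more occupations, after name -->
      <xs:element name="occupation" type="xs:token" minOccurs="0" maxOccurs="unbounded" />
      <!-- choice: exactly one of email or phone, appearing last -->
      <xs:choice>
        <xs:element name="email" type="xs:string" />
        <xs:element name="phone" type="xs:string" />
      </xs:choice>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Here sequence and choice appear as elements, while minOccurs and maxOccurs appear as attributes on the declarations they govern.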
There are many more aspects to .xsd schemas.
They can easily become overly complicated.
One of the links below addresses this and possible remedies.