XML Schema Definition


Definition

Like a DTD, an XML Schema defines what a given set of XML documents should look like

  • Written in XML, which DTDs are not
  • Allows the user to define a 'scope' for elements: local or global
  • Allows the user to define datatypes for elements
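
To make the local/global distinction concrete, here is a minimal sketch. The element names (book, title, author) are hypothetical and chosen only for illustration; the schema syntax itself is covered in the sections that follow. A global element is declared as a direct child of xs:schema and can be reused with ref, while a local element is declared inside another element's type.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global: a direct child of xs:schema, reusable through ref -->
  <xs:element name="title" type="xs:string"/>

  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <!-- Reference to the global declaration above -->
        <xs:element ref="title"/>
        <!-- Local: declared inside book and only usable here -->
        <xs:element name="author" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```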

Types

There are 2 different element types in XML Schema - Simple and Complex type

Simple element types are elements that contain only text

Complex element types are elements that contain:

  • Child elements
  • Attributes
  • Child elements and attributes
  • Attributes and content
  • Child elements and content
  • Child elements, attributes, and content

Simple Types

Defining a simple type in an XSD:

  • <xs:element name="elementName" type="xs:string"/>

Occurrence Constructs

  • Rules that can be added to an element declaration to define the minimum and/or maximum number of times the element can occur in the XML file
  • ... minOccurs="1" maxOccurs="unbounded"
  • When no occurrence constructs are provided, an element defaults to '1 and only 1' in an XSD (minOccurs="1" and maxOccurs="1")
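
As a rough sketch of these rules in context, the fragment below assumes a hypothetical library element (not part of these notes) with three kinds of children. It is a fragment that would sit inside an xs:schema element.

```xml
<xs:element name="library">
  <xs:complexType>
    <xs:sequence>
      <!-- At least one book, with no upper limit -->
      <xs:element name="book" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
      <!-- Optional: may be absent or appear once -->
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <!-- No occurrence constructs: must appear exactly once -->
      <xs:element name="owner" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```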

Predefined DataTypes

The grand-daddy of all schemas (the rule set that all schemas inherit - http://www.w3.org/2001/XMLSchema) defines a number of ready-to-use datatypes for simple element and attribute declarations. Some of them are:

  • xs:string - alphanumeric characters, symbols and spaces
  • xs:decimal - number which can include decimals (xs:positiveInteger restricts to positive whole numbers)
  • xs:boolean - true or false
  • xs:date - contains a date (YYYY-MM-DD)
  • xs:time - contains a time (hh:mm:ss)
  • ...many others
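
A short sketch of how these datatypes pair with values in a document. The element names and values here are made up for illustration; only the xs: type names come from the XML Schema specification.

```xml
<!-- Schema fragment: each element uses a predefined datatype -->
<xs:element name="title" type="xs:string"/>
<xs:element name="price" type="xs:decimal"/>
<xs:element name="inStock" type="xs:boolean"/>
<xs:element name="published" type="xs:date"/>
<xs:element name="opensAt" type="xs:time"/>

<!-- Values in an XML document that satisfy those types -->
<title>An example title</title>
<price>12.99</price>
<inStock>true</inStock>
<published>2020-01-15</published>
<opensAt>09:30:00</opensAt>
```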

Create an XSD

  1. Start with the XML declaration
    • <?xml version="1.0"?>
  2. Declare the namespace for the grand-daddy schema from the W3C inside the xs:schema tag
    • <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  3. Associate the XSD with the XML document through attributes on the ROOT element of the XML doc (this is different from a DTD, which is referenced in a DOCTYPE declaration)
    • <rootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mySchema.xsd">
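
Put together, the three steps might look like the pair of files below. This is a minimal sketch: the file name mySchema.xsd and the rootElement name follow the steps above, and declaring rootElement as a plain xs:string is an assumption made to keep the example short. Note that the W3C namespace URIs begin with http:// and are compared as exact strings, so they must be typed character for character.

```xml
<?xml version="1.0"?>
<!-- mySchema.xsd -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="rootElement" type="xs:string"/>
</xs:schema>
```

```xml
<?xml version="1.0"?>
<!-- The XML document that points at mySchema.xsd -->
<rootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="mySchema.xsd">Hello</rootElement>
```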

Complex Types

An element that can contain child elements and/or attributes is considered to be a complex type

There are 4 different types of complex types:

  • Element only - an element that can contain child elements, but no attributes or text
  • Empty - an element that can contain attributes, but never child elements or text
  • Mixed Content - an element that can contain attributes, child elements, and text content
  • Text only - an element that contains only text, and possibly attributes

Element Only Complex Type

An element that can contain child elements, but no attributes or text

If your XML looks like this:


<book>
  <title>Siddhartha</title>
</book>

The XML Schema should look like this:


<xs:element name="book">
  <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>    
  </xs:complexType>    
</xs:element>

However, nested elements need to be accounted for:


If this is the XML:

<business>
  <name>Harvey's</name>
  <address>
    <street>1610 14th Ave NW</street>
    <postal_code>H4K5J3</postal_code>
  </address>
  <phone>(403)256-4567</phone>
</business>


The XML Schema should look like this:

<xs:element name="business">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="address">
        <!-- Notice how we drill down on the children before we move on -->
        <xs:complexType>
          <xs:sequence>
            <xs:element name="street" type="xs:string"/>
            <xs:element name="postal_code" type="xs:string"/>
          </xs:sequence>    
        </xs:complexType>    
      </xs:element>
      <xs:element name="phone" type="xs:string"/>
    </xs:sequence>    
  </xs:complexType>    
</xs:element>

The Sequence Tag

Determines the order in which child elements must appear in the XML

Defined by <xs:sequence>

A sequence tag can contain other sequence tags

It is legitimate for a sequence tag to contain only one element
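
To illustrate the ordering rule, here is a small sketch using a hypothetical contact element (the name and phone values are borrowed from the business example above). The schema fragment declares name before phone, so only documents that follow that order validate.

```xml
<!-- Schema fragment: name must come before phone -->
<xs:element name="contact">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="phone" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- Valid: children follow the declared order -->
<contact>
  <name>Harvey's</name>
  <phone>(403)256-4567</phone>
</contact>

<!-- Invalid: phone appears before name -->
<contact>
  <phone>(403)256-4567</phone>
  <name>Harvey's</name>
</contact>
```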


Empty Complex Type

An element that can contain attributes, but never child elements or text

For situations where your XML looks like this:


<book isbnNumer="123456789" coverType="softcover"/>

The XML Schema should look like this:


<xs:element name="book">
  <xs:complexType>
    <xs:attribute name="isbnNumer" type="xs:integer" use="required"/>
    <xs:attribute name="coverType" type="xs:string" use="required"/>
  </xs:complexType>    
</xs:element>

Mixed Content Complex Type

An element that can contain attributes, child elements, and text content

For situations where your XML looks like this:


<book isbnNumer="123456789">
  Text content here
  <title>Siddhartha</title>
</book>

The XML Schema should look like this:


<xs:element name="book">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="isbnNumer" type="xs:integer" use="required"/>
  </xs:complexType>
</xs:element>

Text Only Complex Type

An element that contains only text, and possibly attributes

For situations where your XML looks like this:


<book isbnNumer="123456789">
  The title of the book is "Chronicle of a Death Foretold"
</book>

The XML Schema should look like this:


<xs:element name="book">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="isbnNumer" type="xs:integer" use="required"/>    
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>    
</xs:element>