Generically, we can think of a schema as metadata,
or data about data. Some schema efforts are not concerned
only with defining a vocabulary; they go further and
attempt to describe the relationships between certain types
of data.
Schemas refine DTDs by permitting more precision
in expressing some concepts in the vocabulary. Schemas use
a wholly different syntax from DTDs. They permit us to borrow
vocabulary from other schemas, thereby solving a validation
problem that DTDs leave open. Overall, schemas are a better
answer to the problem of specifying vocabularies.
The concept of a schema has been present
for many years in both the database and the document world.
The formal role of the schema is to define
the set of all possible valid documents, that is, to define
the constraints documents must meet in order to be meaningful.
We need to be careful in using the word
validity with schemas. In the XML standard, being "valid"
means something quite specific. Informally, it means that
a document conforms to the rules in its DTD. More broadly,
a document is said to be valid if it satisfies all the
constraints defined by the information model.
As a constraint language, DTDs are very
limited. They provide some control over which elements can
be nested within each other, but say nothing about the text
contained within the elements. They offer slightly more control
over attributes, but even this is very limited. For example,
there is no way of saying that an attribute must be numeric.
In addition, it is the document itself that decides whether
it is going to reference a DTD or not, which DTD it is going
to reference, and whether it is going to override any of the
declarations in the DTD in its private internal subset.
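Both limitations can be seen in a small sketch. The element name item, the attribute quantity, and the file item.dtd are hypothetical and chosen only for illustration; item.dtd is assumed to declare the item element.

```xml
<?xml version="1.0"?>
<!-- Hypothetical example: item.dtd is assumed to declare the item element. -->
<!DOCTYPE item SYSTEM "item.dtd" [
  <!-- Internal subset: this declaration takes precedence over a declaration
       of the same attribute in item.dtd, because it is read first. -->
  <!ATTLIST item quantity CDATA "1">
]>
<!-- CDATA accepts any text, so a DTD cannot reject this non-numeric value. -->
<item quantity="twelve"/>
```

A validating parser would accept this document, even though the quantity is not a number and the document has quietly supplied its own attribute declaration.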
| Schema as a set of constraints |
One purpose of a schema is to define the
difference between a valid document and an invalid one. As
far as possible the rules should be expressed in such a way
that software can decide whether a document is valid or not.
For example, a scientific journal might have a rule that an
author's address should include the city and country only,
or that the abstract must be in French.
Constraints are needed for two kinds of
reasons: stylistic reasons and processing reasons. The processing
reasons reflect the information requirements of the next stage
in the process, that is, the stage that handles the document.
There is a great temptation to impose rules
thoughtlessly, which makes the system unnecessarily
rigid. Information systems have a bad reputation for inflexibility,
and the aim should be to use constraints sensibly, allowing
the humans in the process the maximum scope for using their
intelligence.
Another purpose of a schema is to explain
the interpretation and usage of the constructs that a document
may contain. This facilitates a common understanding of the message
between the sender and the recipient.
In both the document and database traditions,
this role of a schema has been treated as secondary, though
it is arguably the more important one.
A schema is often not properly understood by the
person who enters the data on the screen. As a result,
different users interpret the schema in different ways and attach
different meanings to the data fields, even though the structure
remains unchanged. Consequently, the system suffers from what
is called semantic drift.
An example shows how dramatically things
change between the current practice of DTDs and future practice
with schemas. Consider the following DTD for naming a person.
<!ELEMENT Name (Honorific?, First, MI?,
Last, Suffix?)>
<!ELEMENT Honorific (#PCDATA)>
<!ELEMENT First (#PCDATA)>
<!ELEMENT MI (#PCDATA)>
<!ELEMENT Last (#PCDATA)>
<!ELEMENT Suffix (#PCDATA)>
We must minimally have first and last names,
but we may optionally have a middle initial, an honorific (Mr.,
Ms., Dr., etc.) and a suffix (Jr., III, etc.). The ? occurrence
indicator marks these optional elements. With a DTD, however,
we are constrained by the fact that the DTD needs to be changed
each time we want to allow a new element, and we cannot say
anything about the text the elements contain.
To go further, we can describe the same vocabulary with
a schema instead of a DTD.
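As an illustration, a document that is valid against the Name DTD above might look like the following sketch. The declarations are repeated in an internal subset so that the example is self-contained; the name values are made up.

```xml
<?xml version="1.0"?>
<!DOCTYPE Name [
  <!ELEMENT Name (Honorific?, First, MI?, Last, Suffix?)>
  <!ELEMENT Honorific (#PCDATA)>
  <!ELEMENT First (#PCDATA)>
  <!ELEMENT MI (#PCDATA)>
  <!ELEMENT Last (#PCDATA)>
  <!ELEMENT Suffix (#PCDATA)>
]>
<Name>
  <Honorific>Dr.</Honorific>
  <First>John</First>
  <MI>Q</MI>
  <Last>Public</Last>
  <Suffix>Jr.</Suffix>
</Name>
```

Removing Honorific, MI, or Suffix leaves the document valid. Removing First or Last, or changing the order of the children, makes it invalid.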
To start with, we have a <Schema>
element as the root of the schema. Inside it we declare an element
called Name, the name of which is set in the name attribute
of the <element> tag.
<Schema>
  <element name="Name">
    <type>
      <element name="Honorific" type="string" minOccurs="0" maxOccurs="1"/>
      <element name="First" type="string"/>
      <element name="MI" type="string" minOccurs="0" maxOccurs="1"/>
      <element name="Last" type="string"/>
      <element name="Suffix" type="string" minOccurs="0" maxOccurs="1"/>
    </type>
  </element>
</Schema>
So <element name="Name">
declares a <Name> element. The <type> element
inside it specifies the content model of the <Name>
element. We have used it in its simplest form here, but
it can also be given a name and enclose element declarations;
in that form it is suitable for reuse elsewhere.
Note how the elements contained within <Name>
are declared. Since they are simple types, we can declare them
within the body of the <Name>
declaration without further elaboration.
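The listing above appears to follow an early draft of the schema syntax. As a point of comparison, and assuming the W3C XML Schema 1.0 Recommendation rather than the syntax used in this tutorial, the same content model could be sketched as follows. The xs prefix is a common convention, not a requirement.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Name">
    <xs:complexType>
      <!-- Children must appear in this order; maxOccurs defaults to 1. -->
      <xs:sequence>
        <xs:element name="Honorific" type="xs:string" minOccurs="0"/>
        <xs:element name="First" type="xs:string"/>
        <xs:element name="MI" type="xs:string" minOccurs="0"/>
        <xs:element name="Last" type="xs:string"/>
        <xs:element name="Suffix" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

In this form the role of the anonymous <type> element is played by xs:complexType, and the ordering that the DTD expresses with commas is made explicit by xs:sequence.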
Everything we can define with a DTD is accounted
for in the structures portion of XML Schemas. As XML Schemas
are written in XML syntax, structures refer to the XML constructs
that we can use to define our markup. This means that XML
Schemas are really just another application of XML.
The structures section of the XML Schemas specification
is the part where the elements and attributes for defining
schemas are set out. More importantly, the content model for
elements is described in this part. A content model
specifies the allowable internal structure of an element.
A schema consists of a preamble and zero
or more definitions and declarations.
The preamble is found within the root element,
Schema. It must include at least three pieces of information,
given as attributes. The following are some of the most
commonly used attributes.
| targetNS | contains the namespace URI of the schema. |
| version | specifies the version of the schema. |
| xmlns | provides the namespace for the XML Schemas specification. |
| finalDefault and exactDefault | provide defaults for two types of extension. |
<?xml version="1.0"?>
<schema targetNS="http://myserver/myschema.xsd"
version="1.0"
xmlns="http://www.w3.org/2003/XMLSchema">
</schema>
The snippet above shows a schema preamble
with a few attributes. Here, the schema
resides on myserver and is called myschema.xsd, .xsd being
the file extension for XML Schemas. The version attribute
gives the version of this schema, not the version of XML.
The default namespace declaration refers to the
XML Schemas specification. This is a closed model schema, which means
that all documents conforming to this schema are completely
defined by the schema and must not have any outside content.
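For readers working with current tools, the preamble might be sketched as follows, assuming the W3C XML Schema 1.0 Recommendation. In that version the target namespace attribute is spelled targetNamespace and the schema namespace is http://www.w3.org/2001/XMLSchema; the target URI is simply carried over from the snippet above.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://myserver/myschema.xsd"
           version="1.0">
  <!-- Definitions and declarations go here. -->
</xs:schema>
```

Binding the schema namespace to a prefix such as xs, instead of making it the default namespace, keeps unprefixed names free for the vocabulary being defined.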
| Attributes and Attribute Groups |
Attribute declarations consist of an <attribute>
element, which must minimally include a name attribute.
The <attribute> element also has optional
cardinality attributes, minOccurs and maxOccurs, which are
used to indicate whether the attribute must appear, and if
so, how often.
An attribute declaration may have default and
fixed attributes. These function much like default values and
the #FIXED keyword in DTDs. The value of the fixed attribute is
the value the attribute must always have. The value of the
default attribute is the value that is assumed if the attribute
does not explicitly appear in an element within an XML document.
Here are a couple of sample attribute declarations.
<attribute name="simpleAttr"/>
<attribute name="sequenceNo" type="integer" default="0"/>
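As a comparison, the same kinds of declarations might be sketched as follows under the W3C XML Schema 1.0 Recommendation, which is an assumption on our part and not the syntax used in this tutorial. The Order element and the currency and id attributes are hypothetical names added for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order">
    <xs:complexType>
      <!-- Optional by default; with no type given, any simple value is allowed. -->
      <xs:attribute name="simpleAttr"/>
      <!-- If the attribute is absent in the instance, the value 0 is assumed. -->
      <xs:attribute name="sequenceNo" type="xs:integer" default="0"/>
      <!-- If present, the value must be USD; if absent, USD is assumed. -->
      <xs:attribute name="currency" type="xs:string" fixed="USD"/>
      <!-- use="required" states that the attribute must appear. -->
      <xs:attribute name="id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

In the Recommendation, the use attribute takes over the job that the cardinality attributes perform in the declarations described above, since an attribute can appear at most once on an element.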
Copyrights : Layout Galaxy All Rights Reserved