Copyrights : Layout Galaxy All Rights Reserved
No part of this tutorial may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, electrostatic, magnetic tape, mechanical or otherwise, without prior permission in writing from Layout Galaxy.





  Basic Markup Declarations

The content of an XML document is defined in terms of four kinds of markup declarations used in the DTD.

DTD Construct   Meaning
ELEMENT         Declaration of an XML element type.
ATTLIST         Declaration of the attributes that may be assigned to a specific element type and the permissible values of those attributes.
ENTITY          Declaration of reusable content.
NOTATION        Format declaration for external content not meant to be parsed and the external application that handles the content.

The keywords associated with these declarations and their meanings are shown in the table. The first two declarations, ELEMENT and ATTLIST, describe the information found in an XML document: its elements and their attributes. The last two can be considered supporting players. Entities, in particular, are designed to make an XML vocabulary designer's life easier: they normally consist of content that recurs often enough in the DTD or document to warrant a declaration of its own. Notations deal with content other than XML. A notation declares a particular class of data and associates it with an external program, which becomes the handler for that class of data.
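As a rough illustration of the four declaration kinds, the following Python sketch parses a small document whose internal DTD subset contains one declaration of each kind and records which ones the parser reports. The sample document, its names (note, lang, greeting, gif) and the helper list_declarations are made up for this example; the sketch relies on the declaration handlers of the standard xml.parsers.expat module.

```python
import xml.parsers.expat

# Hypothetical sample: one declaration of each kind in the internal DTD subset.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note lang CDATA #IMPLIED>
  <!ENTITY greeting "Hello">
  <!NOTATION gif SYSTEM "image/gif">
]>
<note lang="en">&greeting;</note>"""


def list_declarations(document):
    """Return (kind, name) pairs for the markup declarations the parser reports."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # Each handler receives further arguments that this sketch ignores.
    parser.ElementDeclHandler = lambda name, model: seen.append(("ELEMENT", name))
    parser.AttlistDeclHandler = lambda elname, attname, *rest: seen.append(
        ("ATTLIST", elname + "/" + attname)
    )
    parser.EntityDeclHandler = lambda name, *rest: seen.append(("ENTITY", name))
    parser.NotationDeclHandler = lambda name, *rest: seen.append(("NOTATION", name))
    parser.Parse(document, True)
    return seen


for kind, name in list_declarations(DOC):
    print(kind, name)
```

The parser used here is non-validating: it reads the declarations but does not check the document against them.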

  Formal DTD Structure - Entities

XML provides a facility for declaring chunks of content and referencing them as many times as we like wherever they are needed, saving space and sparing document authors a lot of typing. By declaring an entity in the DTD, we define a name and the content it refers to. When the content is needed, we refer to it by name, using a particular syntax that marks the name as an entity reference.

An entity used within the content of a document is called a general entity.
A parsed entity consists of XML text; its value is known as the replacement text. In contrast, an unparsed entity need not even be text, and if it is text, it need not be XML. If the replacement content is not XML, there is no point in turning a parser on it. A parsed entity, on the other hand, is XML that is pasted into the document content, so it must be passed through the parser.

  Predefined Entities

XML reserves some characters such as the angle brackets for its own use.

In addition, some characters are unprintable. XML therefore provides some predefined entities so that authors can use these characters in their documents without conflict. In the text content of an element, for example, a reserved character can be referred to without typing it literally, so the document processor does not confuse it with markup at parse time.

Any character can be referred to by a numeric character reference. This is done by writing the characters &# followed immediately by the decimal numeric value of the character and a semicolon. For example, the greater-than symbol could be written as &#62;.
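A minimal sketch of numeric references in practice, using Python's standard xml.etree.ElementTree module. The sample element name is invented for the example. Besides the decimal form, XML also accepts a hexadecimal form written with &#x, which the sketch includes for comparison.

```python
import xml.etree.ElementTree as ET

# &#62; is the decimal reference and &#x3E; the hexadecimal reference
# for the same character, the greater-than sign.
element = ET.fromstring("<sample>5 &#62; 3 and 5 &#x3E; 3</sample>")
decoded = element.text

print(decoded)
print(ord(">"))  # the numeric value used in the decimal reference
```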

Some reserved characters are so common that XML provides predefined entities for them, as shown in the table.

Character               Entity Reference
<                       &lt;
>                       &gt;
&                       &amp;
' (apostrophe)          &apos;
" (quotation mark)      &quot;
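The table can be exercised with a short Python sketch. It escapes a string containing all five reserved characters and then lets a parser turn the references back into the original characters. The sample string and the element name title are invented for the example; escape comes from the standard xml.sax.saxutils module, which handles &, < and > by default and takes further replacements as a mapping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & Jerry's <\"best\"> episodes"

# The mapping adds the apostrophe and quotation mark to the default replacements.
escaped = escape(raw, {"'": "&apos;", '"': "&quot;"})

# Parsing the escaped text resolves the predefined entities again.
roundtrip = ET.fromstring("<title>" + escaped + "</title>").text

print(escaped)
print(roundtrip)
```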

  General Entities

A general entity allows us to associate a piece of parsed text with a name by which we then refer to the text. The entity is declared with the keyword ENTITY, a name and a replacement value. The following example shows a declaration and a reference to it.

<!ENTITY copyright "© Image Online, 2001-2003">
&copyright;

With this declaration in place, we can plug in the copyright text anywhere in a document's content simply by referring to the name "copyright". Of course, the parser needs to be told when we are making an entity reference so that it does not confuse the entity name with ordinary text. To signal this intent, we delimit the name with an ampersand in front and a semicolon following. There cannot be any whitespace between the name and its delimiters.

Note that the ampersand character is reserved for this role in XML. If we need an ampersand for anything else in a document, we must use the predefined entity &amp; instead.
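The behaviour described above can be tried out with a short Python sketch. It declares the copyright entity in an internal DTD subset and checks that the parser substitutes the replacement text for the reference. The surrounding page, footer and title elements are invented for the example; the parser is the standard xml.etree.ElementTree module, which expands entities declared in the internal subset.

```python
import xml.etree.ElementTree as ET

# The element names are hypothetical; the entity declaration follows the text.
DOCUMENT = """<!DOCTYPE page [
  <!ENTITY copyright "© Image Online, 2001-2003">
]>
<page>
  <footer>&copyright;</footer>
  <title>Terms &amp; Conditions</title>
</page>"""

page = ET.fromstring(DOCUMENT)
footer_text = page.find("footer").text  # replacement text of &copyright;
title_text = page.find("title").text    # &amp; yields a literal ampersand

print(footer_text)
print(title_text)
```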

<!ENTITY Entity1 SYSTEM "http://www.vvco.com/boilerplate/copyrighttext.txt">

General entities also have an external form, where the replacement text is given in an external file. The declaration takes the form shown above. The keyword SYSTEM indicates an external source and is followed by the quoted URL of the file.

Lastly, entities must not contain references to themselves, directly or indirectly.
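A small Python sketch of this rule, assuming the expat-based parser behind the standard xml.etree.ElementTree module: two entities that refer to each other are rejected as soon as one of them is referenced. The entity names first and second and the helper is_well_formed are hypothetical.

```python
import xml.etree.ElementTree as ET

# Indirect self-reference: first refers to second, which refers back to first.
RECURSIVE = """<!DOCTYPE doc [
  <!ENTITY first "see &second;">
  <!ENTITY second "see &first;">
]>
<doc>&first;</doc>"""

# The same structure without the cycle.
ACYCLIC = """<!DOCTYPE doc [
  <!ENTITY first "see &second;">
  <!ENTITY second "the end">
]>
<doc>&first;</doc>"""


def is_well_formed(document):
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(RECURSIVE))
print(is_well_formed(ACYCLIC))
```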

  Parameter Entities

Parsed entities that are used solely within the DTD are called parameter entities.

Parameter entities allow the user to easily reference or change commonly used constructs in the DTD by keeping them in one place.

This is easier than changing a construct everywhere it appears in the DTD, although the entity itself must still be edited whenever the construct is extended.

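A parameter entity is declared with a percent sign between the keyword ENTITY and the name. A declaration consistent with the attributes used later in this section would be:

<!ENTITY % peopleParameters "age CDATA #IMPLIED weight CDATA #IMPLIED height CDATA #REQUIRED">

The keyword CDATA refers to character data. The replacement text is part of an attribute list declaration containing three common attributes. It is processed as if it had been written directly into the DTD. Whenever this set of attributes turns up in the DTD, we can simply refer to the entity peopleParameters.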

All parameter entities must be declared before they are referred to in the DTD.

This means that a parameter entity declared in the external subset of the DTD cannot be referred to in the internal subset: the parser reads the internal subset first, so it would see the reference before the declaration.

A parameter entity reference consists of the name delimited by a percent sign in front of the name and a semicolon following. There cannot be any whitespace between the delimiters and the name.

The reference for the peopleParameters example is therefore written as follows:

<!ATTLIST InsuredPerson
%peopleParameters;
carrier CDATA #REQUIRED>

The InsuredPerson element is thus declared to have four attributes: carrier, which is declared explicitly, and age, weight and height, which come from the parameter entity and are declared when the parser substitutes the replacement text for the entity reference.

The declaration above is equivalent to the following:

<!ATTLIST InsuredPerson
age CDATA #IMPLIED
weight CDATA #IMPLIED
height CDATA #REQUIRED
carrier CDATA #REQUIRED>

All the rules for well-formed documents apply to parameter entities. The document must be well-formed after the replacement text has been substituted for the entity reference.

Just as with general entities, parameter entities can also have replacement text that resides in an external file.
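The substitution step for parameter entities can be pictured with a toy Python sketch. It is not a DTD parser: it only replaces each %name; reference with the corresponding replacement text, which is enough to see why the short and the expanded forms of the InsuredPerson declaration are equivalent. The table PARAMETER_ENTITIES, the helper expand_parameter_entities and the exact replacement text are assumptions made for this example. A real parser does more, for instance it accepts references inside a declaration only in the external subset.

```python
import re

# Hypothetical declaration table: parameter entity name -> replacement text.
PARAMETER_ENTITIES = {
    "peopleParameters": (
        "age CDATA #IMPLIED\n"
        "weight CDATA #IMPLIED\n"
        "height CDATA #REQUIRED"
    ),
}

# A percent sign, a name and a semicolon, with no whitespace in between.
# The name pattern is limited to ASCII for simplicity.
REFERENCE = re.compile(r"%([A-Za-z_:][A-Za-z0-9_:.\-]*);")


def expand_parameter_entities(dtd_text, entities):
    """Replace each %name; reference with its replacement text."""
    def substitute(match):
        name = match.group(1)
        if name not in entities:
            raise KeyError("parameter entity used before declaration: " + name)
        return entities[name]
    return REFERENCE.sub(substitute, dtd_text)


declaration = "<!ATTLIST InsuredPerson\n%peopleParameters;\ncarrier CDATA #REQUIRED>"
expanded = expand_parameter_entities(declaration, PARAMETER_ENTITIES)
print(expanded)
```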

  Formal DTD Structure - Elements

Elements are the heart and soul of XML.

Element types are declared in DTDs using the ELEMENT declaration. In addition to the keyword, the declaration provides a name for the declared type and a content specification.

Element type names are subject to the restrictions that apply to names throughout XML. Names may use letters, digits and the punctuation marks colon, underscore, hyphen and period. However, a name may begin only with a letter, an underscore or a colon, never with a digit, hyphen or period.
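The naming rule can be approximated with a regular expression. The Python sketch below is a simplification that covers ASCII characters only, whereas XML also allows many non-ASCII letters in names; the helper is_valid_name is hypothetical.

```python
import re

# First character: letter, underscore or colon.
# Remaining characters: letters, digits, colon, underscore, hyphen, period.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_name(name):
    """Check an ASCII-only approximation of the XML naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["InsuredPerson", "_private", "first-name.v2", "1stPerson", "-dash"]:
    print(candidate, is_valid_name(candidate))
```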

Element content can be classified into four categories: empty, element, mixed and any.

An empty element contains neither text nor child elements. It may, however, have attributes. Empty content is denoted by the keyword EMPTY.

Element content is the condition where the element contains child elements but no text.

Mixed content, as the name implies, is a mix of child elements and parsed character data (#PCDATA).

Element and mixed are the two types where we can use structure to express meaning. Mixed and element content is indicated with a content model.

If we wish to leave the content of an element wide open to any content that does not violate XML well-formed syntax, we declare it using the keyword ANY.
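The four categories can be observed with a short Python sketch that reads a handful of ELEMENT declarations and labels each one. The sample DTD, its element names and the helper classify_elements are invented for the example. The sketch uses the content model constants of the standard xml.parsers.expat module and treats every model that is not empty, any or mixed as element content. Note that the parser reports a declaration of (#PCDATA) alone as mixed content.

```python
import xml.parsers.expat

# Hypothetical DTD with one declaration per content category.
DOCUMENT = """<!DOCTYPE doc [
  <!ELEMENT doc (para, br, note)>
  <!ELEMENT para (#PCDATA | em)*>
  <!ELEMENT em (#PCDATA)>
  <!ELEMENT br EMPTY>
  <!ELEMENT note ANY>
]>
<doc><para>Some <em>mixed</em> text</para><br/><note/></doc>"""


def classify_elements(document):
    """Map each declared element type name to its content category."""
    ctype = xml.parsers.expat.model
    labels = {
        ctype.XML_CTYPE_EMPTY: "empty",
        ctype.XML_CTYPE_ANY: "any",
        ctype.XML_CTYPE_MIXED: "mixed",
    }
    categories = {}

    def on_element_decl(name, model):
        # model is a nested tuple; its first item is the content type.
        categories[name] = labels.get(model[0], "element")

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(document, True)
    return categories


for name, category in classify_elements(DOCUMENT).items():
    print(name, category)
```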





17, Vadsarvala Nivas, 65-A, J. Nehru Road, Mulund (W), Mumbai - 400 080 INDIA
Tel : 91-22-25795588, 91-22-25780444 Fax : 91-22-25793397
Email : ionline@vsnl.com
© Image Online 2001-2003