
Copyrights : Layout Galaxy All Rights Reserved
No part of this tutorial may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, electrostatic, magnetic tape, mechanical or otherwise, without prior permission in writing from Layout Galaxy.





  Designing XML Documents

While discussing data flow models, we saw that there are two kinds of data in the system: data stores and message flows. XML is useful for both kinds of data, but the design considerations are rather different, so XML for messages and XML for persistent data are treated separately.


  XML for Messages

Using XML for messages poses fewer design problems than using it for persistent data.

This is mainly because each message is fairly self-contained, and the question of what to include in a message usually falls out naturally from the process model. The term message is used here in a very general sense; it might be, for example, an EDI-style message sent between organizations to represent a transaction.

Some general rules apply to all XML messages, whatever their precise role might be.

The design must reflect the information and not its intended use. The use of the information may change over time, whereas the information content is more likely to remain stable. This applies particularly to presentation details.

The design must foresee change. The design of XML itself helps here by avoiding traditional drawbacks such as fixed-size fields and fixed column ordering. But the document designer also has the responsibility to structure information in a way that foresees change.

Instead of inventing a message, it is better to use a standard message if one exists. An increasing range of standardized messages is available, for example, from the BizTalk initiative http://www.biztalk.org.

The data encoding must be as close to the natural encoding of the data as can be achieved within performance constraints.
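As a rough illustration of the first two rules, the sketch below reads a small order message by element name rather than by position, so that a later version with extra elements can still be processed. The element names (order, customer, item, giftNote) and the read_order function are hypothetical, invented only for this example, and Python's standard xml.etree.ElementTree module is assumed as the parser.

```python
import xml.etree.ElementTree as ET

# A hypothetical order message: it records what was ordered,
# not how the order should be displayed.
MESSAGE_V1 = """
<order number="1001">
  <customer>Acme Ltd</customer>
  <item code="X12" quantity="3"/>
</order>
"""

# A later version adds an element the original reader never knew about.
MESSAGE_V2 = """
<order number="1002">
  <customer>Acme Ltd</customer>
  <giftNote>Happy birthday</giftNote>
  <item code="X12" quantity="3"/>
  <item code="Y7" quantity="1"/>
</order>
"""

def read_order(text):
    """Read an order by element name, ignoring anything unrecognized."""
    root = ET.fromstring(text)
    return {
        "number": root.get("number"),
        "customer": root.findtext("customer"),
        "items": [(i.get("code"), int(i.get("quantity")))
                  for i in root.findall("item")],
    }

print(read_order(MESSAGE_V1))
print(read_order(MESSAGE_V2))
```

Because the reader asks for elements by name, the added giftNote element and the second item do not break it; a reader that depended on fixed positions would need to change along with the message.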

  XML for Persistent Data

The dynamic information model determines the design of messages. By contrast, for persistent data, it is the static model that is important.

The first thing to decide is the size of the document. The most difficult part of the design is deciding what the granularity of the data should be and what needs to go into each document. There are some applications where it makes sense to have a single XML document running into gigabytes of data. In such a case, though, the whole document has to be parsed, which might take hours. At the opposite extreme, having a very large number of small documents is usually not ideal either.

When XML documents are used for persistent data, finding information is always a two-part operation: first find the right document, then find the facts of interest within it.

To locate the right document there are four options.

First, use the directory structure in the operating system to locate the documents.

Second, link the documents to each other, as in a traditional web site where documents are always found by following links, but typically in a more structured manner.

Third, index the documents from a relational database. In this case, we have the choice of holding the XML documents in files referenced from the database, or holding them in the database itself.

Fourth, index the documents using a free-text search engine. A large number of search engines offer native support for XML.

Another option would be to use a so-called XML server. An XML server holds the XML data not in raw unparsed form but in the form of a persistent DOM; that is, it stores the nodes of the Document Object Model as objects in an object database.
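The two-part lookup described above can be sketched using the first option, the directory structure. This is only a minimal illustration: the customers folder, the one-file-per-customer naming scheme, the element names and the two helper functions are all assumptions made for the example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def store_customer(root_dir, customer_id, xml_text):
    """Save one document per customer under <root_dir>/customers/."""
    folder = Path(root_dir) / "customers"
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{customer_id}.xml"
    path.write_text(xml_text, encoding="utf-8")
    return path

def find_city(root_dir, customer_id):
    """Step 1: locate the document. Step 2: find the fact inside it."""
    path = Path(root_dir) / "customers" / f"{customer_id}.xml"
    tree = ET.parse(path)
    return tree.getroot().findtext("address/city")

# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    store_customer(tmp, "c42",
                   "<customer><name>Acme Ltd</name>"
                   "<address><city>Mumbai</city></address></customer>")
    city = find_city(tmp, "c42")

print(city)
```

Here the file name acts as the index; with a relational database or a search engine, only the first step of find_city would change.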

  Mapping the Information Model to XML

This section deals with how to map the different parts of the information model to an XML document structure. The starting point is the representation of object types. Generally, an object type in the information model translates into an element type in the XML structure. We can use the name of the object type as the element name, or abbreviate it.

Most people use short names for their elements, not to save space but because the XML seems more readable that way, perhaps because the tags then distract less from the content. The advantage of using a specific element type for each object type is that the DTD can define precisely which attributes and child elements are associated with that element.

Nested elements in the XML document structure can be used to represent some of the relationships in the model. The obvious ones to represent this way are the "contains" relationships.
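A minimal sketch of this mapping follows, assuming a hypothetical information model with two object types, Order and OrderLine, where an order contains its lines. Each object type becomes an element type of the same name, and the "contains" relationship becomes nesting. The class and element names are illustrative only.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

@dataclass
class OrderLine:
    product: str
    quantity: int

@dataclass
class Order:
    number: str
    lines: List[OrderLine] = field(default_factory=list)

def order_to_xml(order):
    """Object type -> element type; 'contains' -> nested child elements."""
    root = ET.Element("Order", number=order.number)
    for line in order.lines:
        child = ET.SubElement(root, "OrderLine")
        ET.SubElement(child, "product").text = line.product
        ET.SubElement(child, "quantity").text = str(line.quantity)
    return root

order = Order("1001", [OrderLine("widget", 3), OrderLine("gadget", 1)])
xml_text = ET.tostring(order_to_xml(order), encoding="unicode")
print(xml_text)
```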

There are several ways to represent a link from one element to another in XML. We can use ID and IDREF attributes; XPointer references, which play a role similar to the HREF attribute in HTML; or application-defined primary keys and foreign keys within the XML documents.

All three approaches have their own merits. The main advantage of using ID and IDREF is that the validation is done by the XML parser.

XPointer references are much more flexible than ID and IDREF, but they are not yet fully standardized.

The option of handling relationships through primary and foreign key is a perfectly viable approach, but the XML parser does not give any help in this matter.
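To show what "no help from the parser" means in practice, the sketch below uses application-defined keys and checks them in application code. The document, the attribute names id and dept, and both functions are hypothetical. Python's xml.etree.ElementTree is a non-validating parser, so it would not check ID and IDREF declarations either; the assumption here is that the application must do the work itself.

```python
import xml.etree.ElementTree as ET

DOC = """
<company>
  <department id="d1"><name>Sales</name></department>
  <department id="d2"><name>Support</name></department>
  <employee name="Asha" dept="d1"/>
  <employee name="Ravi" dept="d9"/>
</company>
"""

def check_references(root):
    """Return employee names whose dept key matches no department id."""
    known = {d.get("id") for d in root.iter("department")}
    return [e.get("name") for e in root.iter("employee")
            if e.get("dept") not in known]

def department_of(root, employee_name):
    """Follow the foreign key from an employee to its department name."""
    by_id = {d.get("id"): d for d in root.iter("department")}
    for e in root.iter("employee"):
        if e.get("name") == employee_name:
            dept = by_id.get(e.get("dept"))
            return dept.findtext("name") if dept is not None else None
    return None

root = ET.fromstring(DOC)
print(check_references(root))       # employees with dangling references
print(department_of(root, "Asha"))  # department reached through the key
```

If id were declared as an ID attribute and dept as an IDREF attribute in a DTD, a validating parser would report the dangling reference without any of this code.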

Once we have identified a property in the information model, we face the choice of representing it in the XML document as an XML attribute or as a nested element. There are no fixed rules here, and we are free to choose either. The table below gives the pros and cons of each approach.

 
Approach | Advantages | Disadvantages
XML attributes | DTD can constrain the values; useful when there is a small set of allowed values, such as "yes" or "no". | Simple string values only.
 | DTD can define a default value. | No support for metadata (or attributes of attributes).
 | ID and IDREF validation. | Unordered.
 | Lower space overhead (makes a difference when sending gigabytes of data over the network). |
 | Whitespace normalization is available for certain data types, which saves the application some parsing effort. |
 | Easier to process using the DOM and SAX interfaces. |
Child elements | Support arbitrarily complex values and repeating values. | Slightly higher space usage.
 | Ordered. | More complex programming.
 | Support "attributes of attributes". |
 | Extensible when the data model changes. |
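
A few of the trade-offs listed above can be seen in a short sketch that represents the same property both ways. The book, isbn, inStock and author names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = '<book isbn="0-123-45678-9" inStock="yes"/>'

AS_ELEMENTS = """
<book>
  <isbn>0-123-45678-9</isbn>
  <inStock>yes</inStock>
  <author>First Author</author>
  <author>Second Author</author>
</book>
"""

a = ET.fromstring(AS_ATTRIBUTE)
e = ET.fromstring(AS_ELEMENTS)

# Attributes: one simple string per name, read with a single call.
isbn_from_attribute = a.get("isbn")

# Child elements: slightly more code, but values can repeat,
# keep their order, and carry structure of their own.
isbn_from_element = e.findtext("isbn")
authors = [x.text for x in e.findall("author")]

print(isbn_from_attribute, isbn_from_element, authors)
```

The repeating author property could not be written as two author attributes on one element, since an attribute name may appear only once per element.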

Having decided whether to represent the properties of an object using elements or attributes, we still have to decide how to encode their values.

Some of the common situations encountered are quantities such as height, width and weight; Yes/No values; dates and times; property names; and binary data.
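
One possible set of encoding choices for these situations is sketched below. The choices themselves (a unit attribute for quantities, "yes"/"no" strings, ISO 8601 dates, base64 for binary data) are assumptions made for the example rather than rules stated in this tutorial, and the product element and its children are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET
from datetime import date

def encode_product(name, weight_kg, fragile, released, thumbnail):
    """Encode common property kinds in a hypothetical <product> element."""
    p = ET.Element("product", fragile="yes" if fragile else "no")
    ET.SubElement(p, "name").text = name
    # Quantity: keep the unit next to the number.
    ET.SubElement(p, "weight", unit="kg").text = str(weight_kg)
    # Date: ISO 8601 (YYYY-MM-DD) is unambiguous and sorts as text.
    ET.SubElement(p, "released").text = released.isoformat()
    # Binary data: base64 turns bytes into XML-safe characters.
    ET.SubElement(p, "thumbnail", encoding="base64").text = (
        base64.b64encode(thumbnail).decode("ascii"))
    return p

def decode_product(p):
    """Reverse the encoding above, returning plain Python values."""
    return {
        "name": p.findtext("name"),
        "weight_kg": float(p.findtext("weight")),
        "fragile": p.get("fragile") == "yes",
        "released": date.fromisoformat(p.findtext("released")),
        "thumbnail": base64.b64decode(p.findtext("thumbnail")),
    }

element = encode_product("lamp", 2.5, True, date(2003, 1, 15), b"\x00\x01\xff")
round_trip = decode_product(ET.fromstring(ET.tostring(element)))
print(round_trip)
```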

  Schema Languages and Notations

The concept of a schema has been present in both the database and the document worlds for a long time. The formal role of a schema is to define the set of all possible valid documents or, in other words, to define what constraints, beyond those of XML itself, the documents must meet in order to be meaningful.

One purpose of a schema is to define the difference between a valid document and an invalid one.
The second purpose of a schema is to document the interpretation and usage of the constructs provided, so that the sender and the recipient share a common understanding of the meaning of the message.

As a constraint language, DTDs are very limited. They provide some control over which of the elements can be nested within each other but say nothing about the text contained within elements.
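
The gap can be illustrated with a short sketch. The DTD fragment in the comment and the order document are hypothetical; the point is that a rule such as "quantity must be a positive whole number" has to be checked by application code, since the DTD cannot express it.

```python
import re
import xml.etree.ElementTree as ET

# A DTD can state the nesting, for example:
#   <!ELEMENT order (item+)>
#   <!ELEMENT item (quantity)>
#   <!ELEMENT quantity (#PCDATA)>
# but #PCDATA accepts any text, so "three" is as valid as "3".
DOC = """
<order>
  <item><quantity>3</quantity></item>
  <item><quantity>three</quantity></item>
  <item><quantity>-2</quantity></item>
</order>
"""

def bad_quantities(root):
    """Return quantity texts that are not positive whole numbers."""
    bad = []
    for q in root.iter("quantity"):
        text = (q.text or "").strip()
        if not (re.fullmatch(r"[0-9]+", text) and int(text) > 0):
            bad.append(text)
    return bad

print(bad_quantities(ET.fromstring(DOC)))
```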







17, Vadsarvala Nivas, 65-A, J. Nehru Road, Mulund (W), Mumbai - 400 080 INDIA
Tel : 91-22-21645588, 91-22-21640585 Fax : 91-22-21641545
Email : ionline@vsnl.com
© Image Online 2001-2003