We have learnt how to structure an XML document and how to describe
hierarchical information. We shall now discuss how programs can
access an XML document. One way is through the Document Object Model (DOM).
The World Wide Web Consortium (W3C) specifies the DOM as a
definition that is neutral with respect to language and platform:
interfaces are defined for the different objects that make up
the DOM, but no specifics of implementation are provided,
and it can be implemented in any programming language.
The DOM lays out standard functionality
for navigating documents and for manipulating the content and
structure of HTML and XML documents.
<INVOICE>
  <CUSTOMER>Sam</CUSTOMER>
  <ADDRESS>57, M.G. Road</ADDRESS>
  <CITY>Bangalore</CITY>
  <STATE>Karnataka</STATE>
  <PRODUCT1>Cheese</PRODUCT1>
  <UNITS1>2</UNITS1>
  <PRODUCT2>Champagne</PRODUCT2>
  <UNITS2>3</UNITS2>
  <PRODUCT3>Gel</PRODUCT3>
  <UNITS3>5</UNITS3>
  <PRODUCT4>Bread</PRODUCT4>
  <UNITS4>4</UNITS4>
</INVOICE>
Developers new to XML often assume that its main
purpose is to name pieces of information in a file
so that others may easily understand them. As
a result, documents prepared by beginners often resemble
"tag soup" - an unordered list of data elements
with meaningful tag names, but carrying no more
information than a flat file. What many developers
overlook is the ability of XML to show relationships between
elements, specifically a parent-child relationship between two
elements. The INVOICE document above is an example; it could
be better expressed in XML as follows.
<INVOICE>
  <CUSTOMER NAME = "Sam"
            ADDRESS = "57, M.G. Road"
            CITY = "Bangalore"
            STATE = "Karnataka"/>
  <LINEITEM PRODUCT = "Cheese"
            UNITS = "2"/>
  <LINEITEM PRODUCT = "Champagne"
            UNITS = "3"/>
  <LINEITEM PRODUCT = "Gel"
            UNITS = "5"/>
  <LINEITEM PRODUCT = "Bread"
            UNITS = "4"/>
</INVOICE>
In this document, it is immediately apparent
that the INVOICE element has four LINEITEM children alongside
the CUSTOMER element. The structure also makes the document
easier to search. If we are looking for orders for Cheese,
we can look for LINEITEM elements with a PRODUCT
attribute value of Cheese, instead of checking the PRODUCT1
element, the PRODUCT2 element and so on.
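The search described above can be sketched in Python with the standard library module xml.dom.minidom, which is one DOM implementation among many. The helper name units_ordered and the shortened invoice are assumptions made for this illustration only.

```python
from xml.dom.minidom import parseString

# A shortened copy of the invoice, used only for this illustration.
INVOICE_XML = """<INVOICE>
  <CUSTOMER NAME="Sam" CITY="Bangalore"/>
  <LINEITEM PRODUCT="Cheese" UNITS="2"/>
  <LINEITEM PRODUCT="Gel" UNITS="5"/>
  <LINEITEM PRODUCT="Bread" UNITS="4"/>
</INVOICE>"""


def units_ordered(xml_text, product):
    """Return the total UNITS over all LINEITEM elements for one product."""
    doc = parseString(xml_text)
    total = 0
    # One query finds every line item, however many there are.
    for item in doc.getElementsByTagName("LINEITEM"):
        if item.getAttribute("PRODUCT") == product:
            total += int(item.getAttribute("UNITS"))
    return total


print(units_ordered(INVOICE_XML, "Cheese"))  # prints 2
```

With the flat layout, the same search would have to try PRODUCT1, PRODUCT2 and so on without knowing in advance how many such elements exist.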
This document structure can be represented
as a node tree that shows all the elements and their relationships to one another.
With the DOM, we operate on the document as a tree of nodes,
so new information can simply be attached as a child
of the appropriate node. The alternative, reading through the
whole text to find the position after the last item and
inserting the new information there, quickly becomes tricky.
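As a rough sketch of attaching new information as a child node, the following Python fragment uses xml.dom.minidom. The helper add_line_item and the product "Butter" are made up for this example and do not come from the invoice above.

```python
from xml.dom.minidom import parseString


def add_line_item(doc, product, units):
    """Attach a new LINEITEM element as the last child of the root element."""
    item = doc.createElement("LINEITEM")
    item.setAttribute("PRODUCT", product)
    item.setAttribute("UNITS", str(units))
    doc.documentElement.appendChild(item)
    return item


doc = parseString('<INVOICE><LINEITEM PRODUCT="Cheese" UNITS="2"/></INVOICE>')
add_line_item(doc, "Butter", 1)  # "Butter" is a hypothetical product
print(doc.documentElement.toxml())
```

No text offsets are involved: the new element is placed by naming its parent, and the serializer works out where the markup goes.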
When the DOM is used to manipulate an XML
text file, the first thing it does is parse the file, breaking
it into individual elements, attributes, comments and so on.
Next, it creates in memory
a representation of the XML file as a node
tree. The developer may then access the contents of the document
through the node tree and make any necessary modifications.
The DOM goes a step further and treats every
item as a node: elements, attributes, comments, processing
instructions and even the text that makes up an attribute value. The DOM
provides a robust set of interfaces to facilitate the manipulation of the node tree.
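To make the idea that everything is a node more concrete, here is a minimal Python sketch that parses a small document and lists the kind of each node it finds. The sample document, the KINDS table and the describe helper are assumptions for illustration; the node type constants come from Python's xml.dom module.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Readable labels for the node types this sketch cares about.
KINDS = {
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}


def describe(node, found=None):
    """Collect (kind, nodeName) pairs for a node and everything below it."""
    if found is None:
        found = []
    found.append((KINDS.get(node.nodeType, "other"), node.nodeName))
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are nodes too, but they are reached through the
        # element's attribute map rather than through childNodes.
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            found.append((KINDS[attr.nodeType], attr.nodeName))
    for child in node.childNodes:
        describe(child, found)
    return found


SAMPLE = ('<INVOICE><!-- rush order --><?print copies="2"?>'
          '<CUSTOMER NAME="Sam">Preferred</CUSTOMER></INVOICE>')

for kind, name in describe(parseString(SAMPLE).documentElement):
    print(kind, name)
```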
When accessing XML files, the DOM should
always be the access method of choice. Using the DOM has several
advantages over other available mechanisms for the generation
of XML documents such as writing directly to a stream.
Since the DOM transforms the text into an
abstract representation of a node tree, problems like unclosed
tags and improperly nested tags can be completely avoided.
When manipulating an XML document with a DOM, the developer
need not worry about the text expression of the document,
but only about parent-child relationships and associated information.
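The point about avoiding malformed markup can be illustrated with a short Python sketch. It builds a document purely through DOM calls and lets the serializer produce the text, so closing tags and character escaping are never written by hand. The function build_invoice and the sample values are assumptions for this example.

```python
from xml.dom.minidom import getDOMImplementation, parseString


def build_invoice(customer, address):
    """Build an INVOICE with one CUSTOMER child and return it as XML text."""
    doc = getDOMImplementation().createDocument(None, "INVOICE", None)
    node = doc.createElement("CUSTOMER")
    # Values are stored as plain strings; the serializer escapes characters
    # such as & and " when it writes the attribute values.
    node.setAttribute("NAME", customer)
    node.setAttribute("ADDRESS", address)
    doc.documentElement.appendChild(node)
    return doc.documentElement.toxml()


xml_text = build_invoice("Sam & Co.", '57, "M.G." Road')
print(xml_text)

# Parsing the generated text again succeeds and recovers the original values.
round_trip = parseString(xml_text).getElementsByTagName("CUSTOMER")[0]
print(round_trip.getAttribute("NAME"))
```

Writing the same values directly to a stream would require the program to escape the ampersand and the quotation marks itself.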
The node tree created by the DOM is a logical
representation of the content found in the XML file. It shows
what information is present and how it is related, without
necessarily being bound to the XML grammar.
A developer using the DOM to change the
structure of an XML file will have a much simpler task than
one who is attempting to do so using traditional file manipulation mechanisms.
The way in which the DOM represents the
relationship between data elements is very similar to the
way that this information is represented in modern hierarchical
and relational databases. This makes it very easy to move
information between a database and an XML file using DOM.
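As one possible illustration of moving data from a database into an XML document, the sketch below reads rows from an in-memory SQLite table and turns each row into a LINEITEM node. The table name lineitem, its columns and the function invoice_from_rows are hypothetical and chosen only for this example.

```python
import sqlite3
from xml.dom.minidom import getDOMImplementation


def invoice_from_rows(conn):
    """Turn each row of the (hypothetical) lineitem table into a LINEITEM node."""
    doc = getDOMImplementation().createDocument(None, "INVOICE", None)
    query = "SELECT product, units FROM lineitem ORDER BY rowid"
    for product, units in conn.execute(query):
        item = doc.createElement("LINEITEM")
        item.setAttribute("PRODUCT", product)
        item.setAttribute("UNITS", str(units))
        doc.documentElement.appendChild(item)
    return doc


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lineitem (product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?)",
    [("Cheese", 2), ("Gel", 5), ("Bread", 4)],
)

doc = invoice_from_rows(conn)
print(doc.documentElement.toprettyxml(indent="  "))
```

Each row maps to one child node and each column to one attribute, which is why the transfer needs so little code.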
As with other Web standards, the
DOM specification is maintained by the W3C. At present, the
W3C has prepared two documents - the Level 1 and Level 2 documents.
The W3C document for DOM Level 1 has a status
of Recommendation. This document contains two main sections.
The Document Object Model (Core) Level 1 contains the specification
for interfaces that can access any structured document, with
some specific extensions that allow access to XML documents.
The second section explains the HTML-specific
extensions to DOM.
The DOM specification explains how strings
are to be manipulated by the DOM by defining the data type DOMString.
The DOMString data type is defined as a sequence of 16-bit
units, encoded using the UTF-16 encoding scheme.
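A consequence of the UTF-16 definition is that the length of a DOMString is counted in 16-bit units, not in characters. Python strings count characters (code points) instead, so the small helper below, whose name utf16_units is an assumption for this example, shows how the two counts can differ.

```python
def utf16_units(text):
    """Count the 16-bit units needed to hold text in UTF-16."""
    # Each UTF-16 unit is two bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


print(utf16_units("Sam"))         # prints 3
print(utf16_units("\U0001D11E"))  # prints 2: one character, a surrogate pair
```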
We will now take a look at the objects,
methods and properties that make up the DOM Level 1 specification.
The behavior that is specified applies only to XML documents;
the DOM may behave differently when used to access HTML documents.
As of press time, the W3C DOM level 2 specification
has the status of Candidate Recommendation. In addition to
the objects we just discussed, the DOM 2 specification includes
support for namespaces, style sheets, filtering, event model and Ranges.
Namespaces are used to distinguish discrete
data elements with the same name in XML. The DOM Level 2 provides
mechanisms for interrogating and modifying the namespace for a document.
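The following Python sketch shows one way namespace-aware lookup can work in practice, using the getElementsByTagNameNS method that xml.dom.minidom provides. The namespace URIs, the prefixes bill and ship, and the helper addresses_in are invented for this example.

```python
from xml.dom.minidom import parseString

# Hypothetical namespace URIs, used only to tell the two ADDRESS elements apart.
BILLING = "http://example.com/billing"
SHIPPING = "http://example.com/shipping"

SAMPLE = f"""<INVOICE xmlns:bill="{BILLING}" xmlns:ship="{SHIPPING}">
  <bill:ADDRESS>57, M.G. Road</bill:ADDRESS>
  <ship:ADDRESS>12, Brigade Road</ship:ADDRESS>
</INVOICE>"""


def addresses_in(xml_text, namespace_uri):
    """Return the text of every ADDRESS element in the given namespace."""
    doc = parseString(xml_text)
    matches = doc.getElementsByTagNameNS(namespace_uri, "ADDRESS")
    return [element.firstChild.data for element in matches]


print(addresses_in(SAMPLE, SHIPPING))  # prints ['12, Brigade Road']
```

Both elements share the local name ADDRESS; only the namespace distinguishes them.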
The DOM Level 2 specification includes an
object model for style sheets, as well as methods to query
and manipulate the style sheet for a particular object.
The DOM Level 2 specification adds methods
for filtering the content in an XML document.
An event model is in the planning stages
as far as the DOM Level 2 specification is concerned.
Range support includes functions for manipulating
large blocks of text, which will be useful to those working
with traditional documents in XML.
We have discussed the structure of the DOM,
which takes XML documents and transforms them into node trees
that may be accessed programmatically. We have also seen
that the specification provided by the W3C is
only a description of access mechanisms, not a
particular implementation. How do we take this information
and put it to use? This is done through a DOM API.
When writing a piece of software that accesses
XML files using the DOM, a particular implementation of the
DOM must be used. This implementation, the DOM API, is a library
of some kind, designed to run on a particular hardware
and software platform and to access a particular data store.
API is the acronym for Application Programming
Interface. It is a set of libraries that one component
uses to instruct another component to carry out
lower-level services. As such, an API must be an implementation
of an interface, with the appropriate code to connect to other
components and instruct them to carry out their functions.
As already seen, the W3C DOM specification
only provides the interface definition for the DOM libraries,
not the specifics of their implementation. It therefore falls
into the hands of third parties to provide implementations
of the DOM that may be used by programmers.
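Python's standard library is one example of such a third-party implementation. The sketch below asks the xml.dom package for any registered DOM implementation that reports support for the Core feature at level 1.0, then uses it to create a document. Which implementation is returned depends on the Python installation; with only the standard library present, it is normally minidom.

```python
from xml.dom import getDOMImplementation

# Ask for any available implementation that claims DOM Core 1.0 support.
impl = getDOMImplementation(features=[("core", "1.0")])

# The calling code depends only on the DOM interfaces, not on the
# particular library that happens to sit behind them.
doc = impl.createDocument(None, "INVOICE", None)
print(impl.hasFeature("core", "1.0"))
print(doc.documentElement.tagName)
```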
Client Side and Server Side DOM
There are many applications for the DOM
and XML. They can loosely be classified into two types: those
deployed on the server and those deployed on the client.
Because Internet developers exercise a great deal of
control over the software deployed on their servers, the first
applications of the DOM have typically been on the server side.
The DOM can greatly simplify data interchange
between disparate business systems, and it provides an
ideal mechanism for archiving and retrieving data.
The first application of XML is to facilitate
inter-process or inter-business communications. This is mainly
because of XML's advantages: it is platform independent,
self-describing and hierarchical.
XML is an ideal storage medium for archived
information, especially if it comes from an object-oriented
or hierarchical database.
The client-side scenario is yet to develop because,
as of now, only Internet Explorer 5.0 comes with DOM functionality
built in. Netscape and other browser developers are in the
process of adding DOM support to their browsers. Once these
are in use, Internet developers will be able to take advantage
of the DOM on the client to improve the way information is
rendered and to decrease round trips to the server.
Copyrights : Layout Galaxy All Rights Reserved
No part of this tutorial may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, electrostatic, magnetic tape, mechanical or otherwise, without prior permission in writing from Layout Galaxy.