Document Object Model (XML)

What does DOM do?

At the cost of relatively high demands on memory and processing, the DOM makes it much easier for programmers to go directly to parts of a complex XML data structure, without having to traverse by hand the tree that makes up the XML document.

The DOM aims to make it easy for programmers to access components and to delete, add, or edit their content, attributes and style. In essence, the DOM makes it possible for programmers to write applications that work properly on all browsers and servers and on all platforms. While programmers may need to use different programming languages, they do not need to change their programming model.^[2]

Once at the desired node, a program can add or delete child nodes, modify the remaining nodes, and so on, without the programmer having to keep track of every detail of state.
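As an illustration of this kind of direct access and editing, here is a minimal sketch using Python's standard xml.dom.minidom module, which is only one of many DOM bindings. The sample document, the element names and the edit_library function are hypothetical, chosen purely for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
XML = (
    "<library>"
    "<book id='b1'><title>Old Title</title></book>"
    "<book id='b2'><title>Second</title></book>"
    "</library>"
)

def edit_library(xml_text):
    doc = parseString(xml_text)
    root = doc.documentElement

    # Go directly to the elements of interest instead of walking the tree by hand.
    books = doc.getElementsByTagName("book")

    # Edit an attribute and the text content of the first book.
    first = books[0]
    first.setAttribute("id", "b1-revised")
    title_text = first.getElementsByTagName("title")[0].firstChild
    title_text.data = "New Title"

    # Delete the second book, then add a new, empty child node.
    root.removeChild(books[1])
    new_book = doc.createElement("book")
    new_book.setAttribute("id", "b3")
    root.appendChild(new_book)

    return root.toxml()

result = edit_library(XML)
print(result)
```

The program never records where it is in the document; the DOM tree itself holds that state.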

When a document is in DOM structure, it becomes a hierarchy of nodes:

parent nodes with child nodes
leaf nodes that must be terminal points in the hierarchy.
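The hierarchy can be made visible with a short recursive walk. The sketch below again assumes Python's xml.dom.minidom. The sample document and the describe helper are hypothetical. Note that in this binding the text inside an element is itself a leaf node.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
XML = "<order><item>pen</item><item>ink</item><note/></order>"

def describe(node, depth=0, lines=None):
    """Collect one line per node, indented by depth, marking parents and leaves."""
    if lines is None:
        lines = []
    kind = "parent" if node.hasChildNodes() else "leaf"
    if node.nodeType == node.ELEMENT_NODE:
        label = node.nodeName
    else:
        label = repr(node.nodeValue)  # text nodes carry their content in nodeValue
    lines.append("  " * depth + f"{label} ({kind})")
    for child in node.childNodes:
        describe(child, depth + 1, lines)
    return lines

tree_lines = describe(parseString(XML).documentElement)
print("\n".join(tree_lines))
```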

If, during processing, a parent node gains or loses child nodes, the memory-resident DOM automatically changes the underlying document structure. In a database, of course, these changes would, at some point, need to be committed.
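A small sketch of this behavior, assuming Python's xml.dom.minidom and a hypothetical queue document: the parent's child list changes as soon as nodes are removed or appended, but nothing outside memory changes until the tree is serialized, which plays the role of the commit step here.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
doc = parseString("<queue><job>a</job><job>b</job></queue>")
queue = doc.documentElement

before = len(queue.childNodes)

# The parent loses one child...
queue.removeChild(queue.firstChild)

# ...and gains two new ones.
job = doc.createElement("job")
job.appendChild(doc.createTextNode("c"))
queue.appendChild(job)
queue.appendChild(doc.createElement("job"))

after = len(queue.childNodes)

# Only the in-memory tree has changed so far. Serializing it (and, in a
# real application, writing the string to a file or database) commits it.
committed = doc.toxml()
print(before, after)
print(committed)
```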

Alternatives

DOM's model is to load the entire XML document into memory, and then traverse the tree as needed. DOM is best when there will be multiple operations against the document, since the resources to load the document amortize over the number of manipulations required. It is far more efficient than SAX for incorporating results of processing back into the document.

The Simple API for XML (SAX) may outperform DOM when the requirement is to make a single pass through an XML document, looking for specific content and, perhaps, updating from an ordered source. Its low-level behavior makes it economical with processor and memory resources, but it places considerable responsibility on the programmer to maintain state.

SAX does not load the entire document. It reads the document sequentially as a stream of events and takes one pass through it, which means it cannot backtrack as DOM can.
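The burden of maintaining state can be seen in a minimal SAX sketch, here using Python's standard xml.sax module. The sample document and the TitleCollector class are hypothetical. The handler has to remember for itself whether the parser is currently inside a title element, something a DOM tree would record implicitly.

```python
import xml.sax

# Hypothetical sample document, used only for illustration.
XML = (
    b"<library>"
    b"<book><title>First</title></book>"
    b"<book><title>Second</title></book>"
    b"</library>"
)

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element in one forward pass."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False  # state the programmer must track by hand
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleCollector()
xml.sax.parseString(XML, handler)
print(handler.titles)
```

Once an event has been delivered it is gone. To revisit an earlier element, the program would have to have stored it or parse the document again.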

Application programming interfaces

To maintain the goal of DOM API compatibility across end-user scripting languages and professional programming languages (of various vintages), the APIs need to work with a wide range of memory management systems, from those with automatic garbage collection (Java)^[3] to those with explicit memory management (C, C++).

DOM APIs are more likely to be true interfaces, in a procedural programming sense, than object classes. APIs can hide the details of memory organization in either fully object-oriented applications with their own class structure, or legacy applications that are not OO.

To be sure that the high-level APIs will be available, the language bindings developed by the DOM Working Group (ECMAScript/JavaScript and Java) do not deal with memory management, but groups developing other language bindings will need to find their own solutions.

C and C++ have a memory model too different from that of DOM for there to be direct bindings. One approach is taken by the Apache Xerces-C++ libraries, which support the DOM approach to XML parsing.^[4]

References

[1] Document Object Model (DOM), W3C

[2] Document Object Model Activity Statement, W3C

[3] Package org.w3c.dom: Provides the interfaces for the Document Object Model (DOM), which is a component API of the Java API for XML Processing. Sun

[4] Parsing XML with Xerces-C C++ API
