DOM Core

This part of the reference documents DOM 1 Core and DOM 2 Core. We describe and give examples for every interface, property and method defined in those specifications, and document their support in modern desktop browsers.

For a basic introduction to the DOM please see What is the DOM? (coming soon!)

Generalization, specialization, and inheritance of DOM Interfaces

The following diagram shows how DOM interfaces inherit from one another:

As you can see, all interfaces which represent actual content in HTML or XML inherit from Node. The Node interface includes methods which apply to all or most types of node, such as appendChild and cloneNode, and properties which are equally generalized, such as firstChild and nodeName.

The DOM is designed with this principle of generalization and specialization in mind. Generalized properties and methods are made available by the uppermost appropriate interface, but more specialized interfaces may provide their own properties and methods that do a similar or identical job. For example, the nodeValue property can be used to read or set the value of an Attr node, but the Attr interface itself also provides a value property that does the same thing. In general it doesn't matter which you use; however, there are specific instances where browser bugs may affect that choice (for example, in some versions of Safari the value property of an attribute is read-only, whereas nodeValue is read/write). Exceptions such as this are documented where applicable.
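
As a rough sketch of how that choice might be handled in a script, the helper below writes through the specialized value property first and falls back to nodeValue if the write has no effect. The name setAttrValue is hypothetical and is not part of the DOM, and the fallback logic is an assumption about how a read-only value property would behave rather than a tested workaround for any particular browser.

```javascript
// Hypothetical helper (not part of the DOM): sets the value of an Attr node.
// It tries the specialized "value" property first, then falls back to the
// generalized "nodeValue" property if the value did not change.
function setAttrValue(attr, newValue)
{
  var text = String(newValue);
  try
  {
    attr.value = text;
  }
  catch(err)
  {
    // assumption: a read-only property may throw instead of failing silently
  }
  if(attr.value !== text)
  {
    attr.nodeValue = text;
  }
  return attr.nodeValue;
}
```

In a browser this could be called as setAttrValue(link.getAttributeNode('href'), '/home'), where link is assumed to be a reference to an element that already has an href attribute.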

Beneath Node, all interfaces which represent textual content inherit from CharacterData. As with Node, this interface contains properties and methods, such as length and substringData, that are inherited by more specialized interfaces.

Note: In many cases JavaScript has more powerful native features

In real-world usage, some aspects of the DOM may be effectively redundant because the same functionality is available through native language features. For example, the functionality provided by substringData can also be achieved using JavaScript's substr or substring methods. In most cases it doesn't really matter which you use, but the native language features are generally a lot more powerful and flexible.
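
To illustrate, the sketch below reads parts of a string using only native methods. The helper name substringOfData is hypothetical. It is assumed to return the same characters as substringData(offset, count) when the offset lies within the data, whereas the DOM method is specified to raise an exception for an out-of-range offset.

```javascript
// Hypothetical helper: mirrors text.substringData(offset, count) for
// in-range arguments, using only the native substring method.
function substringOfData(data, offset, count)
{
  return data.substring(offset, offset + count);
}

// In a browser, "data" would typically come from a Text node's data property.
var data = 'Shopping list';

var firstWord = substringOfData(data, 0, 8);
var lastWord = data.substring(9);

// Native methods also cover jobs that substringData has no direct
// equivalent for, such as searching within the text.
var position = data.indexOf('list');
```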

Data types in the DOM

The fundamental data type in the DOM is Node, and all interfaces which represent actual data in an HTML or XML document inherit from Node, such as Element, Attr (attribute) and Text.

The DOM also refers to some additional data types that are used for meta-data, or for the content of some types of Node. These data types are:

DOMString
A sequence of characters encoded in UTF-16. In JavaScript this is bound to the string type.
DOMTimeStamp
A number of milliseconds. In JavaScript this is represented by the Date object.
unsigned short
An unsigned 16-bit integer. In JavaScript this is bound to the number type.
unsigned long
An unsigned 32-bit integer. In JavaScript this is bound to the number type.
boolean
A value that is true or false. In JavaScript this is bound to the boolean type.

The DOM does not define a size limit for the DOMString type; however, some implementations do:

Firefox
4K (4096 bytes)
Opera 9.0
32K (32768 bytes)
Safari 3
64K (65536 bytes)

This has a crucial effect on the ability to process and manipulate text, because browsers can only work with data up to their limit. The data and nodeValue of a Text node will only include that much data, and methods such as cloneNode will only be able to clone up to that limit.

Furthermore, there is no way to retrieve the additional data in browsers that have a limit. The DOM specification suggests using the substringData method to do this, implying that the data is there but unreadable. However, this is not the case: as far as the browser is concerned, there is no more data.

Whitespace in the DOM

Some implementations count whitespace in the DOM as Text nodes, while others do not. If we take a simple example:

<h2>Shopping list</h2>
<ul>
  <li>Beer</li>
  <li>More beer</li>
</ul>

In Firefox, that code would have the following node structure:

  • H2
    • #text (Shopping list)
  • #text (line-break)
  • UL
    • #text (line-break and tab)
    • LI
      • #text (Beer)
    • #text (line-break and tab)
    • LI
      • #text (More beer)
    • #text (line-break)

Whereas in Internet Explorer it would simply be:

  • H2
    • #text (Shopping list)
  • UL
    • LI
      • #text (Beer)
    • LI
      • #text (More beer)

This can raise complications when using properties that describe direct node relationships, such as firstChild or nextSibling. It may appear as though the first <li> is the firstChild of the <ul>, but as far as Firefox is concerned, the firstChild of the <ul> is a Text node containing whitespace.

There are several ways to deal with this:

Using collection-based references for which whitespace doesn't matter

We can get a reference to that list item using getElementsByTagName('li').item(0) rather than firstChild; then it won't matter what else is in between.
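
A minimal sketch of that approach follows. The function name firstListItem is hypothetical, and list is assumed to be a reference to the <ul> element. Note that getElementsByTagName collects matching descendants at any depth, so with nested lists the later items in the collection may not be direct children of the list.

```javascript
// Hypothetical helper: returns the first <li> inside the given list element,
// or null if there is none, regardless of any whitespace Text nodes.
function firstListItem(list)
{
  var items = list.getElementsByTagName('li');
  return items.length > 0 ? items.item(0) : null;
}
```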

Iterating past whitespace nodes

If we know what kind of node we're looking for (or not looking for), we can loop past the nodes we don't want:

var item = list.firstChild;
// skip Text nodes, stopping if we run out of siblings
while(item && item.nodeName == '#text')
{
  item = item.nextSibling;
}
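
The same loop can be generalized so that it skips anything that isn't an element, such as comments as well as Text nodes. The following is a sketch only: the name firstElement is hypothetical, and it relies on nodeType being 1 for Element nodes.

```javascript
// Hypothetical helper: returns the first child of "parent" that is an
// Element node (nodeType 1), or null if there is no such child.
function firstElement(parent)
{
  var node = parent.firstChild;
  while(node && node.nodeType != 1)
  {
    node = node.nextSibling;
  }
  return node || null;
}
```

It could then be used as var item = firstElement(list); in place of the loop above.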

Removing the whitespace programmatically

We can use a function to remove all whitespace-only Text nodes from the DOM before working with it. Here's an example of such a function, written by Alex Vincent:

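function cleanWhitespace(node)
{
  for (var i=0; i<node.childNodes.length; i++)
  {
    var child = node.childNodes[i];
    // remove Text nodes (nodeType 3) that contain only whitespace
    if(child.nodeType == 3 && !/\S/.test(child.nodeValue))
    {
      node.removeChild(child);
      // step back so the node that moved into this position isn't skipped
      i--;
    }
    // recurse into Element nodes (nodeType 1)
    if(child.nodeType == 1)
    {
      cleanWhitespace(child);
    }
  }
  return node;
}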
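
Removing nodes changes the document itself, which may not always be wanted. As an alternative sketch, the helper below leaves the DOM untouched and simply collects the element children of a node into an array. The name elementChildren is hypothetical and is not part of Alex Vincent's function.

```javascript
// Hypothetical helper: returns an array of the child nodes of "node" that
// are Element nodes (nodeType 1), without modifying the document.
function elementChildren(node)
{
  var elements = [];
  for(var i = 0; i < node.childNodes.length; i++)
  {
    var child = node.childNodes[i];
    if(child.nodeType == 1)
    {
      elements.push(child);
    }
  }
  return elements;
}
```

With the shopping list above, elementChildren(list)[0] would refer to the first <li> whether or not the browser creates whitespace Text nodes.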

Document type differences

Everything in the DOM Core reference is tested in three different document types:

HTML
HTML or XHTML served as text/html.
XHTML
XHTML served as application/xhtml+xml or equivalent.
XML
Any other flavour of XML, served as text/xml. Various test documents were used, including RSS, SVG and XUL.

However, not all browsers support all types. Specifically:

  • Internet Explorer does not support XHTML (when served correctly as application/xhtml+xml or equivalent). When documents are served in this way, IE prompts for download rather than displaying the page, so it's not possible to conduct tests on this kind of document in this browser.
  • Safari 1.3 and 2 don't properly support XHTML, and documents served this way are interpreted as text/html. Nevertheless Safari does behave differently on such documents, compared with documents that are served as text/html directly. Therefore the reference documents differences as though Safari supports XHTML fully.
