DOM Core
This part of the reference documents DOM 1 Core and DOM 2 Core. We describe and give examples for every interface, property and method defined in those specifications, and document their support in modern desktop browsers.
For a basic introduction to the DOM please see What is the DOM? (coming soon!)
Generalization, specialization, and inheritance of DOM Interfaces
The following diagram shows how DOM interfaces inherit from one another:
As you can see, all interfaces which represent actual content in HTML
or XML inherit from Node. The Node
interface includes methods which apply to all or most types of node, such
as appendChild and cloneNode, and properties which are equally
generalized, such as firstChild and nodeName.
The DOM is designed with this principle of generalization and specialization in mind. Generalized properties and methods are made available by the uppermost appropriate interface, while more specialized interfaces may provide their own properties and methods that do a similar or identical job. For example, the nodeValue property can be used to read or set the value of an Attr node, but the Attr interface itself also provides a value property that does the same thing. In general it doesn't matter which you use, but there are specific instances where browser bugs may affect that choice (for example, in some versions of Safari the value property of an attribute is read-only, whereas nodeValue is read/write).
Exceptions such as this are documented where applicable.
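As a minimal sketch of how a script might cope with a quirk like the Safari one, the helper below tries the specialized value property first and falls back to the generalized nodeValue property if the write did not take effect. The name setAttrValue is our own and is not part of the DOM. The sketch assumes that a read-only value property either ignores the assignment or throws; a real browser may behave differently.

```javascript
// Hypothetical helper, not part of the DOM: writes an attribute's value
// through Attr.value, falling back to Node.nodeValue if that write fails.
function setAttrValue(attr, text) {
  try {
    attr.value = text;
  } catch (e) {
    // Assumption: a read-only property may throw on assignment.
  }
  if (attr.value !== text) {
    // The specialized property did not accept the write, so use the
    // generalized property inherited from Node instead.
    attr.nodeValue = text;
  }
  // Report whether either property now holds the requested text.
  return attr.value === text || attr.nodeValue === text;
}
```

In a browser this might be called as setAttrValue(link.getAttributeNode('href'), '/home'), where link is an assumed reference to an element that has an href attribute.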
Within the
Node interface, all interfaces which represent textual
content inherit from CharacterData. As with Node, this interface contains properties and methods that
are inherited by more specialized interfaces, such as length and substringData.
In real-world usage, some aspects of the DOM may be effectively
redundant because the same functionality is available through native
language features. For example, the functionality provided by substringData can also be achieved using
JavaScript's substr or
substring methods. In most cases it doesn't
really matter which you use, but the native language features are
generally a lot more powerful and flexible.
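To illustrate the overlap, the sketch below mimics substringData on a plain string so that it can be run without a document. The function name substringDataOf is hypothetical; the error conditions follow the DOM 2 Core description of substringData, but the error type thrown here is only a stand-in for a DOMException.

```javascript
// Stand-in for CharacterData.substringData(offset, count), written against a
// plain string so that it runs without a document. The name is hypothetical.
function substringDataOf(data, offset, count) {
  if (offset < 0 || offset > data.length || count < 0) {
    // The DOM method raises an INDEX_SIZE_ERR exception in these cases;
    // a RangeError is used here as a simple substitute.
    throw new RangeError('INDEX_SIZE_ERR');
  }
  // substring takes a start and an end index, so the end is offset + count;
  // an end beyond the string's length is clamped to the length.
  return data.substring(offset, offset + count);
}

var data = 'Shopping list';
var viaSketch = substringDataOf(data, 0, 8); // 'Shopping'
var viaSubstring = data.substring(0, 8);     // 'Shopping'
var viaSlice = data.slice(-4);               // 'list'
```

The last line shows the kind of flexibility the native methods add: slice accepts a negative index that counts from the end of the string, which substringData does not allow.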
Data types in the DOM
The fundamental data type in the DOM is Node. All interfaces which represent actual data in an HTML or XML document, such as Element, Attr (attribute) and Text, inherit from Node.
The
DOM also refers to some additional data types that are used for meta-data,
or for the content of some types of Node. These data
types are:
- DOMString
- A sequence of characters encoded in UTF-16. In JavaScript this is bound to the string type.
- DOMTimeStamp
- A number of milliseconds. In JavaScript this is represented by the Date object.
- unsigned short
- A short integer. In JavaScript this is bound to the number type.
- unsigned long
- A long integer. In JavaScript this is bound to the number type.
- boolean
- A value that is true or false. In JavaScript this is bound to the boolean type.
The DOM does not define a size limit for the DOMString type, but some implementations do:
- Firefox
- 4K (4096 bytes)
- Opera 9.0
- 32K (32768 bytes)
- Safari 3
- 64K (65536 bytes)
This has a crucial effect on the ability to process and
manipulate text, because browsers can only work with data up to their
limit. The data and nodeValue of a Text node will only include that much data, and methods such
as cloneNode will only be able to clone up to
that limit.
Furthermore, there is no way to retrieve the additional data in browsers that have a limit. The DOM specification suggests using the substringData method to do this, implying that the data is there but unreadable; however, this is not the case: as far as the browser is concerned, there is no more data.
Whitespace in the DOM
Some implementations
count whitespace in the DOM as Text nodes, while
others do not. If we take a simple example:
    <h2>Shopping list</h2>
    <ul>
        <li>Beer</li>
        <li>More beer</li>
    </ul>
In Firefox, that code would have the following node structure:
- H2
    - #text (Shopping list)
- #text (line-break)
- UL
    - #text (line-break and tab)
    - LI
        - #text (Beer)
    - #text (line-break and tab)
    - LI
        - #text (More beer)
    - #text (line-break)
Whereas in Internet Explorer it would simply be:
- H2
    - #text (Shopping list)
- UL
    - LI
        - #text (Beer)
    - LI
        - #text (More beer)
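To see which structure a given browser builds, a short recursive walk over childNodes is enough. The sketch below is one possible way to do it; describeTree is a hypothetical helper name, and it reads only nodeName, nodeType, nodeValue and childNodes, so it works on any node-like object.

```javascript
// Hypothetical helper: returns one line per node, indented by depth.
// It only reads nodeName, nodeType, nodeValue and childNodes.
function describeTree(node, depth) {
  depth = depth || 0;
  var indent = new Array(depth + 1).join('  ');
  var label = node.nodeName;
  if (node.nodeType == 3) {
    // JSON.stringify makes line-breaks and tabs visible as \n and \t
    label += ' (' + JSON.stringify(node.nodeValue) + ')';
  }
  var lines = [indent + '- ' + label];
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    // Each child contributes its own lines, one level deeper
    lines = lines.concat(describeTree(children[i], depth + 1));
  }
  return lines;
}
```

In a browser, describeTree(document.body).join('\n') would give a listing of the whole page body, with whitespace-only Text nodes showing up wherever that browser creates them.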
This difference in whitespace handling can cause complications when using properties that describe direct node relationships, such as firstChild or nextSibling. It may appear as though the first <li> is the firstChild of the <ul>, but as far as Firefox is concerned, the firstChild of the <ul> is a Text node containing whitespace.
There are several ways to deal with this:
- Using collection-based references for which whitespace doesn't matter
- We can get a reference to that list item using getElementsByTagName('li').item(0) rather than firstChild; then it won't matter what else is in between.
- Iterating past whitespace nodes
- If we know what kind of node we're looking for (or not looking for), we can iterate past the nodes we don't want:

    var item = list.firstChild;
    // stop at the first non-text node, or when there are no more siblings
    while (item && item.nodeName == '#text') {
        item = item.nextSibling;
    }

- Removing the whitespace programmatically
- We can use a function to remove all whitespace-only Text nodes from the DOM before using it. Here's an example of such a function, written by Alex Vincent:

    function cleanWhitespace(node) {
        for (var i = 0; i < node.childNodes.length; i++) {
            var child = node.childNodes[i];
            if (child.nodeType == 3 && !/\S/.test(child.nodeValue)) {
                // whitespace-only Text node: remove it and re-check this index
                node.removeChild(child);
                i--;
            } else if (child.nodeType == 1) {
                // element: clean its children recursively
                cleanWhitespace(child);
            }
        }
        return node;
    }
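The second approach above can be generalized into a small utility that collects only the element children of a node, leaving the document itself untouched. This is a sketch rather than part of the reference; childElements is a hypothetical name, and the function relies only on firstChild, nextSibling and nodeType.

```javascript
// Hypothetical helper: returns an array of the child nodes that are elements
// (nodeType 1), skipping Text, Comment and other node types.
function childElements(node) {
  var result = [];
  for (var child = node.firstChild; child; child = child.nextSibling) {
    if (child.nodeType == 1) {
      result.push(child);
    }
  }
  return result;
}
```

With the shopping list example, childElements(list)[0] would refer to the first <li> whether or not the browser has created whitespace Text nodes around it, assuming list is a reference to the <ul>.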
Document type differences
Everything in the DOM Core reference is tested in three different document types:
- HTML
- HTML or XHTML served as text/html.
- XHTML
- XHTML served as application/xhtml+xml or equivalent.
- XML
- Any other flavour of XML, served as text/xml. Various test documents were used, including RSS, SVG and XUL.
However, not all browsers support all types. Specifically:
- Internet Explorer does not support XHTML (when served correctly as application/xhtml+xml or equivalent). When documents are served in this way, IE prompts for download rather than displaying the page, so it's not possible to conduct tests on this kind of document in this browser.
- Safari 1.3 and 2 don't properly support XHTML, and documents served this way are interpreted as text/html. Nevertheless, Safari does behave differently on such documents compared with documents that are served as text/html directly. The reference therefore documents differences as though Safari supports XHTML fully.
In this Section
Attr
The Attr interface represents a single attribute of an Element node.
CDATASection
The CDATASection interface represents a CDATA section in XML.
CharacterData
The CharacterData interface provides a set of properties and methods for working with text content in the DOM.
Comment
The Comment interface represents the contents of an XML or HTML comment.
Document
The Document interface represents the whole document, such as an HTML page.
DocumentFragment
The DocumentFragment interface is a lightweight version of the Document interface, intended for temporary storage of DOM structures.
DocumentType
The DocumentType interface provides access to the attributes of a Document Type Declaration (DTD).
DOMException
The DOMException interface represents a DOM processing error.
DOMImplementation
The DOMImplementation interface provides methods for operations that are independent of any specific document or instance of the DOM.
Element
The Element interface represents a single element in an HTML or XML document.
Entity
The Entity interface represents an expanded entity, i.e. the entity itself (&), not the entity reference (&amp;).
EntityReference
The EntityReference interface represents an unexpanded entity reference, e.g. &amp;
NamedNodeMap
The NamedNodeMap represents an unordered collection of items, such as nodes or string values, indexed by name.
Node
The Node interface is the primary data type for the DOM, and represents any single item in the tree, such as an Element or Text node.
NodeList
The NodeList interface represents an ordered collection of nodes, indexed by number.
Notation
The Notation interface represents a notation declared in the DTD.
ProcessingInstruction
The ProcessingInstruction interface represents an XML processing instruction.
Text
The Text interface represents the text content of an Element or Attr node.