Doctypes
The doctype declaration, which should be the first item to appear in the source markup of any web page, is an instruction to the web browser (or other user agent) that identifies the version of the markup language in which the page is written. It refers to a known Document Type Definition, or DTD for short. The DTD sets out the rules and grammar for that flavor of markup, enabling the browser to render the content accordingly.
The doctype contains a lot of information, none of which you will be likely to find yourself being tested on in a job interview, so don’t worry if it all seems too difficult to remember. Besides, most web authoring packages will insert a syntactically correct doctype for you anyway, so there’s little chance of you getting it wrong.
The doctype begins with the string <!DOCTYPE, which should be written in uppercase:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
The next part, which reads html (for XHTML) or
HTML, refers to the name of the root element for the
document. This information is included for validation purposes, since the
DTD itself doesn’t say which element is the root element in the document
tree:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
The PUBLIC statement informs the browser that
the DTD is a publicly available resource. If you had decided that the
various flavors of HTML or XHTML were lacking in some way, and you wanted
to extend the language beyond the defined specifications, you could go to
the effort of creating a custom DTD. This would allow you to define custom
elements, and would enable your documents to validate according to that
DTD; in this case, you’d change the PUBLIC value to
SYSTEM. That said, I’ve never actually seen an author do
this—most people live within the limitations of the defined HTML/XHTML
specifications (or plug the gaps using Microformats). Here’s the PUBLIC
statement:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
The next section is known as the Public Identifier, and
provides information about the owner or guardian of the DTD (in this case,
the W3C). The Public Identifier, which is shown here, is not case
sensitive. It also includes the level of the language that the DTD refers
to (XHTML 1.0), and identifies the language of the DTD itself. Note that
this is the language of the DTD, not of the content of the web page, and it
is defined as English, or EN for short. Authors should not change
this EN reference, regardless of the language contained
in the web page.
Note that if the doctype contains the keyword
SYSTEM, the Public Identifier section is omitted.
All of this information is highlighted in the short fragment below:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
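As a rough illustration of how the Public Identifier breaks down, the following Python sketch splits it on its // separators. The function name split_public_identifier and the dictionary keys are made up for this example, and the sketch assumes the four-field shape used by the doctypes on this page.

```python
def split_public_identifier(public_id):
    """Split a Public Identifier into its '//'-separated fields.

    Hypothetical helper for illustration; it only handles identifiers
    that have exactly four fields, like the ones shown on this page.
    """
    parts = public_id.split("//")
    if len(parts) != 4:
        raise ValueError(f"unexpected public identifier: {public_id!r}")
    registration, owner, description, language = parts
    # The description starts with a class keyword (here "DTD"), followed
    # by the name and level of the markup language.
    text_class, _, name = description.partition(" ")
    return {
        "registration": registration,  # "-" marks an unregistered owner
        "owner": owner,                # the owner or guardian of the DTD
        "class": text_class,
        "name": name,
        "language": language,          # language of the DTD, not the page
    }


fields = split_public_identifier("-//W3C//DTD XHTML 1.0 Transitional//EN")
print(fields["owner"], "|", fields["name"], "|", fields["language"])
```

The language field stays EN for every doctype listed on this page, whatever language the page content is written in.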
Finally, the doctype includes a URL, known as the System Identifier, which refers to the location of the DTD. If you want to really geek out, you can copy and paste the address into your web browser’s location bar and download a copy of the DTD, but be warned that it doesn’t make for light reading! Here’s the System Identifier:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
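To tie the pieces together, here is a minimal Python sketch that pulls the root element, the PUBLIC or SYSTEM keyword, the Public Identifier, and the System Identifier out of a doctype string. The regular expression and the parse_doctype name are assumptions made for this example, and the custom.dtd doctype in the usage lines is hypothetical. It is not how browsers or validators actually parse a doctype, and it only accepts double-quoted identifiers.

```python
import re

# Simplified pattern: root name, optional keyword, one or two quoted strings.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>]+)'      # name of the root element
    r'(?:\s+(?P<keyword>PUBLIC|SYSTEM)'   # optional PUBLIC or SYSTEM keyword
    r'\s+"(?P<first>[^"]*)"'              # first quoted string
    r'(?:\s+"(?P<second>[^"]*)")?)?'      # optional second quoted string
    r'\s*>'
)


def parse_doctype(doctype):
    """Return the parts of a doctype declaration as a dictionary."""
    match = DOCTYPE_RE.fullmatch(doctype.strip())
    if match is None:
        raise ValueError("not a doctype this sketch understands")
    keyword = match["keyword"]
    if keyword == "PUBLIC":
        # PUBLIC: Public Identifier first, then an optional System Identifier.
        public_id, system_id = match["first"], match["second"]
    elif keyword == "SYSTEM":
        # SYSTEM: the Public Identifier is omitted entirely.
        public_id, system_id = None, match["first"]
    else:
        public_id = system_id = None
    return {
        "root": match["root"],
        "keyword": keyword,
        "public_id": public_id,
        "system_id": system_id,
    }


xhtml = parse_doctype(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
custom = parse_doctype('<!DOCTYPE html SYSTEM "custom.dtd">')  # hypothetical DTD
print(xhtml["system_id"])
print(custom["public_id"], custom["system_id"])
```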
The table below, Available Doctypes, shows the doctypes available in the W3C recommendations.
| Doctype | Description |
|---|---|
| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> | HTML 4.01 Strict allows the inclusion of structural and semantic markup, but not presentational or deprecated elements such as font. Framesets are not allowed. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> | HTML 4.01 Transitional allows the use of structural and semantic markup as well as presentational elements (such as font), which are deprecated in Strict. Framesets are not allowed. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd"> | HTML 4.01 Frameset applies the same rules as HTML 4.01 Transitional, but allows the use of frameset content. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | XHTML 1.0 Strict, like HTML 4.01 Strict, allows the use of structural and semantic markup, but not presentational or deprecated elements (such as font); framesets are not allowed. Unlike HTML 4.01, the markup must be written as well-formed XML. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> | XHTML 1.0 Transitional, like HTML 4.01 Transitional, allows the use of structural and semantic markup as well as presentational elements (such as font), which are deprecated in Strict; framesets are not allowed. Unlike HTML 4.01, the markup must be written as well-formed XML. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> | XHTML 1.0 Frameset applies the same rules as XHTML 1.0 Transitional, but also allows the use of frameset content. |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> | XHTML 1.1 is a reformulation of XHTML 1.0 Strict, thus most of the same rules apply. However, 1.1 allows for modularization, which means that you can add modules (for example, to provide Ruby support for Chinese, Japanese, and Korean characters). |
| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> | HTML 3.2 is an archaic doctype that’s no longer recommended for use (it’s included here for information only). |
| <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.0//EN"> | HTML 3.0 is an archaic doctype that’s no longer recommended for use (it’s included here for information only). |
| <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Level 2//EN"> | HTML 2.0 is an archaic doctype that’s no longer recommended for use (it’s included here for information only). Note that there are actually 12 variants of this old doctype, all of which can be found in RFC1866 (refer to section 9.6). |
Doctype Switching or Sniffing
Quirks Mode
In this mode, browsers violate normal web formatting specifications as a way to avoid the poor rendering (or “breaking,” to use the vernacular) of pages that have been written using practices that were commonplace in the late 1990s. The quirks differ from browser to browser. In Internet Explorer 6 and 7, Quirks Mode displays the document as if it were being viewed in IE version 5.5. In other browsers, Quirks Mode contains a selection of deviations that are taken from Almost Standards Mode (explained below).
Standards Mode
In this mode, browsers attempt to give conforming documents an exact treatment according to the specification (but this is still dependent on the extent to which the standards are implemented in a given browser).
Almost Standards Mode
Firefox, Safari, and Opera (version 7.5 and above) add a third mode, which is known as Almost Standards Mode. This mode implements the vertical sizing of table cells in a traditional fashion, rather than rigorously as defined in the CSS2 specification. (Internet Explorer versions 6 and 7 don’t need an Almost Standards Mode, because their respective Standards Modes don’t implement the vertical sizing of table cells rigorously according to the CSS2 specification.)
Depending on the doctype that’s defined, and the level of detail contained inside the doctype (for example, whether it does or doesn’t include a Public Identifier), different browsers trigger different modes from the list above. Doctype switching or sniffing refers to the task of swapping one doctype for another, or changing the level of detail in the doctype, in order to coax a browser to render in one of Quirks, Standards, or Almost Standards Modes.
Doctype sniffing was most frequently used to address rendering differences between Internet Explorer 6 and earlier versions of the browser, which calculated content widths differently when widths, padding, borders, and margins were applied in CSS. (This topic is not something we’ll cover in this HTML reference, but you can find out more in The Ultimate CSS Reference.) In Internet Explorer 6, depending on the doctype defined, one of two rendering modes, “the correct way” or “the old IE way,” would be used to calculate these widths.
As an example, imagine that you specify the doctype as HTML 4.01 Strict, like so:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
In IE6, the doctype above will cause the browser to render in Standards Mode, which includes using the W3C method for box model calculations. However, you see an entirely different result if you use the following doctype:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
In this scenario, IE6 will use its old, incorrect, non-W3C method for making box model calculations.
Note that this is not the only difference between Quirks and Standards Mode. It’s just one example of the differences between the two modes, but one that caused a great many problems because of the disastrous effect it could have on page layout.
For a complete reference of how different browsers behave when different doctypes are provided, refer to the chart at the foot of Henri Sivonen’s article “Activating Browser Modes with Doctype.”
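The switching behavior described above can be sketched as a small lookup on the two identifiers. The Python below is a heavily simplified assumption, loosely modeled on browsers that have all three modes: the sniff_mode name, its return strings, and the prefix lists are invented for this example and cover only the doctypes listed on this page. Internet Explorer 6 and 7 have no Almost Standards Mode, so where this sketch returns "almost standards" they would use their Standards Mode, and real browsers differ in the details.

```python
# Prefixes are compared in lowercase; the trailing language field is ignored.
LOOSE_HTML401 = (
    "-//w3c//dtd html 4.01 transitional//",
    "-//w3c//dtd html 4.01 frameset//",
)
LOOSE_XHTML10 = (
    "-//w3c//dtd xhtml 1.0 transitional//",
    "-//w3c//dtd xhtml 1.0 frameset//",
)
STRICT = (
    "-//w3c//dtd html 4.01//",
    "-//w3c//dtd xhtml 1.0 strict//",
    "-//w3c//dtd xhtml 1.1//",
)
ARCHAIC = (
    "-//w3c//dtd html 3.2 final//",
    "-//ietf//dtd html 3.0//",
    "-//ietf//dtd html 2.0 level 2//",
)


def sniff_mode(public_id, system_id=None):
    """Guess a rendering mode from a doctype's identifiers (simplified).

    Returns None for any Public Identifier this sketch does not cover.
    """
    pid = public_id.lower()
    if pid.startswith(ARCHAIC):
        return "quirks"
    if pid.startswith(LOOSE_HTML401):
        # Leaving out the System Identifier (the URL) tips the page into Quirks.
        return "quirks" if system_id is None else "almost standards"
    if pid.startswith(LOOSE_XHTML10):
        return "almost standards"
    if pid.startswith(STRICT):
        return "standards"
    return None


print(sniff_mode("-//W3C//DTD HTML 4.01//EN",
                 "http://www.w3.org/TR/html4/strict.dtd"))
print(sniff_mode("-//W3C//DTD HTML 4.01 Transitional//EN"))
```

The two calls mirror the Strict and Transitional examples earlier in this section. For anything beyond them, the chart in Henri Sivonen’s article is the source to trust.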
User-contributed notes
- ID: #11
- Date: Wed, 26 Mar 2008 12:41:48 GMT
'The PUBLIC statement informs the browser that the DTD is a publicly available resource.
...
The next section is known as the Public Identifier and provides information about who is the owner/guardian of the DTD, in this case the W3C.'
This isn't accurate. The 'PUBLIC' keyword and the string that follows constitute the Formal Public Identifier (FPI) for the DTD.
The last string, which is the URL for the DTD, is called a Formal System Identifier (FSI).
The FPI and the FSI are just two different ways of saying the same thing. Using both is redundant from an SGML point of view, but of course there's another aspect to it these days due to doctype sniffing.
- ID: #8
- Date: Thu, 20 Mar 2008 13:08:43 GMT
One sentence to the meaning of "transitional" should be added, please.
"Authors should use the Strict DTD when possible, but may use the Transitional DTD when support for presentation attribute and elements is required."
quoted from http://www.w3.org/TR/html401/sgml/loosedtd.html
When should transition end if not now?
