Differences Between HTML and XHTML

Even though this is a CSS reference, we should spend some time talking about HTML and XHTML, because your choice of markup language will affect how CSS is applied in some instances. Moreover, in order to understand the variations in the way CSS is applied to HTML and XHTML, you need to grasp the fundamental differences between the two markup languages.

The most important difference between the two markup languages is that HyperText Markup Language, or HTML, is an application of SGML (Standard Generalized Markup Language) [1], and allows an author to omit certain tags and use attribute minimization [2]. The Extensible HyperText Markup Language, or XHTML, is an application of XML (Extensible Markup Language) [3]. It doesn’t permit the omission of any tags or the use of attribute minimization. However, it provides a shorthand notation for empty elements that HTML does not: for example, we could use <br/> instead of <br></br>. A conforming XML document must be well formed, which, among other things, means that there must be an end tag for every start tag, and that nested elements must be closed in the right order [4]. When an XML parser encounters an error relating to the document’s well-formedness, it must abort, whereas an HTML parser is expected to attempt to recover and continue.
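The contrast between an aborting XML parser and a recovering HTML parser can be illustrated with a minimal sketch using Python's standard library. The sample strings `SLOPPY` and `STRICT` and the helper names are hypothetical, chosen only for this illustration. Note also that `html.parser` is a lenient tokenizer rather than a full browser-style HTML parser, so it only approximates the recovery behavior described above.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: unclosed <p>, a minimized attribute, mis-nested <b>/<i>.
SLOPPY = '<div><p>First<input readonly><b><i>text</b></i></div>'
# The same content written as well-formed XML.
STRICT = '<div><p>First</p><input readonly="readonly"/><b><i>text</i></b></div>'


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False if it aborts."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Feed markup to the HTML tokenizer and return the start tags it saw."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


def same_serialization(a: str, b: str) -> bool:
    """Compare how two XML fragments serialize after parsing."""
    return ET.tostring(ET.fromstring(a)) == ET.tostring(ET.fromstring(b))
```

With these helpers, `is_well_formed(SLOPPY)` reports a failure while `html_start_tags(SLOPPY)` still returns every tag in the string. `same_serialization('<br/>', '<br></br>')` shows that an XML parser treats the two empty-element notations as equivalent.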

There are three areas in which the differences between HTML and XHTML affect our use of CSS:

Note, though, that these differences apply only when an XHTML document is served as an application of XML; that is, with a MIME type of application/xhtml+xml, application/xml, or text/xml. An XHTML document served with a MIME type of text/html must be parsed and interpreted as HTML, so the HTML rules apply in this case. A style sheet written for an XHTML document being served with a MIME type of text/html may not work as intended if the document is then served with a MIME type of application/xhtml+xml. For more information about MIME types, make sure to read MIME Types.

This distinction is especially important when you’re serving XHTML markup as text/html. Unless you’re aware of the differences, you may create style sheets that won’t work as intended if the document is later served as real XHTML, that is, with an XML MIME type.

Where the terms “XHTML” and “XHTML document” appear in the remainder of this section, they refer to XHTML markup served with an XML MIME type. XHTML markup served as text/html is an HTML document as far as browsers are concerned.
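The MIME-type rule described above can be summarized in a short sketch. The function name `parsing_mode` and the constant `XML_MIME_TYPES` are hypothetical, and the sketch covers only the MIME types named in this section. Real browsers apply further rules that are not modeled here.

```python
# MIME types that this section says cause XHTML to be treated as XML.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_mode(content_type: str) -> str:
    """Map a Content-Type header value to 'xml', 'html', or 'unknown'."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_MIME_TYPES:
        return "xml"    # XHTML rules apply
    if mime == "text/html":
        return "html"   # HTML rules apply, even to XHTML markup
    return "unknown"    # outside the scope of this sketch
```

For example, `parsing_mode("application/xhtml+xml; charset=utf-8")` yields `"xml"`, while the same markup delivered as `text/html` yields `"html"`, so the HTML rules for CSS apply to it.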

Footnotes

1 More accurately, HTML has been an application of SGML since version 2.0.

2 Attribute minimization is an SGML feature that allows us to omit the attribute name and use only the value; for instance, we could use <input readonly> instead of <input readonly="readonly">.

3 XML is a subset of SGML.

4 An XML document can be well-formed without being valid. Only well-formedness is a formal requirement of XML. (Browsers use non-validating XML parsers, anyway.)

User-contributed notes

ID: #7, contributed by AutisticCuckoo, Tue, 22 Apr 2008 13:15:58 GMT

@DavidHammond: Yes, this is an SGML feature used in HTML and, in a limited form, by XML.

HTML uses one NET (null end tag) separator, the '/' character, that both terminates the start tag and replaces the end tag.

XML uses an extension of SGML and defines two separate separators. The NESTC (NET-enabling start tag close) is '/' and the NET (null end tag) is '>'. The XML specification says that the use of the NET syntax is restricted to empty elements, so '<div/This is an example>' is not allowed, but '<div/>' is.

ID: #6, contributed by DavidHammond, Mon, 04 Feb 2008 16:48:07 GMT

HTML technically *does* allow elements to be expressed in a shorthand form similar to XML. It is valid to express `<div></div>` as `<div//` in HTML, or to express `<br>` as `<br/`. This is called Null End Tag syntax (NET), and is the same type of SGML construct used for XML self-closing tags. However, most web browsers today don't support HTML's NET syntax (if they did, it would actually create problems in all of the XHTML content that's served as text/html). Nevertheless, it is a part of the current HTML standard and the HTML Validator accepts it as valid.

The HTML version of NET syntax is designed for the purpose of shortening `<div>This is an example.</div>` to `<div/This is an example./`. The XML version, however, has a different ending delimiter (`>`) and the "IMMEDNET" constraint which means the ending delimiter must immediately follow the tag closing delimiter, so it always looks like `<div/>`.
