Differences Between HTML and XHTML
Even though this is a CSS reference, we should spend some time talking about HTML and XHTML, because your choice of markup language will affect how CSS is applied in some instances. Moreover, in order to understand the variations in the way CSS is applied to HTML and XHTML, you need to grasp the fundamental differences between the two markup languages.
The most important difference between the two markup languages is that HyperText Markup Language, or HTML, is an application of SGML (Standard Generalized Markup Language),[1] and allows an author to omit certain tags and use attribute minimization.[2] The Extensible HyperText Markup Language, or XHTML, is an application of XML (Extensible Markup Language).[3] It doesn’t permit the omission of any tags or the use of attribute minimization. However, unlike HTML, it provides a shorthand notation for empty elements: for example, we could use <br/> instead of <br></br>. A conforming XML document must be well formed, which, among other things, means that there must be an end tag for every start tag, and that nested tags must be closed in the right order.[4] When an XML parser encounters an error relating to the document’s well-formedness, it must abort, whereas an HTML parser is expected to attempt to recover and continue.
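To make the contrast in error handling concrete, here is a minimal sketch that uses Python's standard library parsers as stand-ins for a browser's XML and HTML parsers. This is an assumption for illustration only: `html.parser` is lenient but is not the parser a browser uses, and the helper names `is_well_formed` and `html_start_tags` are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the start tags seen by Python's lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, attrs))


def is_well_formed(markup):
    """Return True if the XML parser accepts the markup, False if it aborts."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_start_tags(markup):
    """Return the start tags the HTML parser reports; it does not abort."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


omitted_end_tag = "<p>One<br>Two</p>"       # <br> has no end tag
minimized_attr = "<p><input readonly></p>"  # attribute minimization
xml_shorthand = "<p>One<br/>Two</p>"        # empty-element shorthand

print(is_well_formed(omitted_end_tag))  # False: the XML parser aborts
print(is_well_formed(minimized_attr))   # False: the XML parser aborts
print(is_well_formed(xml_shorthand))    # True
print(html_start_tags(omitted_end_tag))
print(html_start_tags(minimized_attr))
```

The XML parser rejects the first two fragments outright, while the HTML parser carries on and reports the tags it found, with the minimized `readonly` attribute appearing with no value.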
There are three areas in which the differences between HTML and XHTML affect our use of CSS:
Note, though, that these differences apply only when an XHTML document
is served as an application of XML; that is, with a MIME type of
application/xhtml+xml, application/xml,
or text/xml. An XHTML document served with a MIME type of
text/html must be parsed and interpreted as HTML, so the
HTML rules apply in this case. A style sheet written for an XHTML document
being served with a MIME type of text/html may not work
as intended if the document is then served with a MIME type of
application/xhtml+xml. For more information about MIME
types, make sure to read MIME Types.
Knowing these differences matters most when you’re serving XHTML documents as text/html. Unless you’re aware of them, you may create style sheets that won’t work as intended if the document is later served as real XHTML.
Where the terms “XHTML” and “XHTML document” appear in the remainder of
this section, they refer to XHTML markup served with an XML MIME type.
XHTML markup served as text/html is an HTML document as
far as browsers are concerned.
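The rule that the MIME type, not the markup, decides which set of rules applies can be sketched as a small lookup. This is a simplified illustration that covers only the MIME types named above; `parsing_rules` is a hypothetical helper, and real browsers do considerably more work when handling a response.

```python
# MIME types that cause a document to be treated as an application of XML.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_rules(content_type):
    """Return "xml" or "html" for a Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    if mime_type in XML_MIME_TYPES:
        return "xml"
    if mime_type == "text/html":
        return "html"
    raise ValueError(f"not an HTML or XHTML MIME type: {mime_type!r}")


print(parsing_rules("application/xhtml+xml"))     # xml
print(parsing_rules("text/html; charset=utf-8"))  # html
```

The same XHTML markup therefore falls under the HTML rules when it arrives as text/html, and under the XML rules only when it arrives with one of the XML MIME types.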
Footnotes
[1] More accurately, HTML has been an application of SGML since version 2.0.
[2] Attribute minimization is an SGML feature that allows us to omit the attribute name and use only the value; for instance, we could use <input readonly> instead of <input readonly="readonly">.
[3] XML is a subset of SGML.
[4] An XML document can be well-formed without being valid. Only well-formedness is a formal requirement of XML. (Browsers use non-validating XML parsers, anyway.)
User-contributed notes
- ID: #7
- Date: Tue, 22 Apr 2008 13:15:58 GMT
- Status: This note has not yet been confirmed for accuracy and relevance.
@DavidHammond: Yes, this is an SGML feature used in HTML and, in a limited form, by XML.
HTML uses one NET (null end tag) separator, the '/' character, that both terminates the start tag and replaces the end tag.
XML uses an extension of SGML and defines two separate separators. The NESTC (NET-enabling start tag close) is '/' and the NET (null end tag) is '>'. The XML specification says that the use of the NET syntax is restricted to empty elements, so '<div/This is an example>' is not allowed, but '<div/>' is.
- ID: #6
- Date: Mon, 04 Feb 2008 16:48:07 GMT
- Status: This note has not yet been confirmed for accuracy and relevance.
HTML technically *does* allow elements to be expressed in a shorthand form similar to XML. It is valid to express `<div></div>` as `<div//` in HTML, or to express `<br>` as `<br/`. This is called Null End Tag syntax (NET), and is the same type of SGML construct used for XML self-closing tags. However, most web browsers today don't support HTML's NET syntax (if they did, it would actually create problems in all of the XHTML content that's served as text/html). Nevertheless, it is a part of the current HTML standard and the HTML Validator accepts it as valid.
The HTML version of NET syntax is designed for the purpose of shortening `<div>This is an example.</div>` to `<div/This is an example./`. The XML version, however, has a different ending delimiter (`>`) and the "IMMEDNET" constraint which means the ending delimiter must immediately follow the tag closing delimiter, so it always looks like `<div/>`.
- ID: #5
- Date: Mon, 04 Feb 2008 16:39:49 GMT
- Status: This note has not yet been confirmed for accuracy and relevance.
hannson: That is not accurate. Chris Wilson has shared his intention of supporting XHTML in some future version of MSIE.
From his post: "I made the decision to not try to support the MIME type in IE7 simply because I personally want XHTML to be successful in the long run. [...] I would much rather take the time to implement XHTML properly after IE 7, and have it be truly interoperable" - http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx
- ID: #4
- Date: Wed, 19 Dec 2007 15:51:12 GMT
- Status: This note has not yet been confirmed for accuracy and relevance.
I don't know if it's relevant, but I believe MSIE has no intention of supporting XHTML (as XML) in future versions (IE8). Can anyone confirm or deny it?
