If you’ve never looked at a Document Type Definition (DTD), you’ve missed one of a web designer’s most interesting experiences. I’m only kidding a little bit. You can download several flavors of DTD from the W3C and read them for your edification.
You see a lot of abbreviations and not much explanation of what it all means. I’ll explain a few of the abbreviations for you. Take a look at the information in the XHTML1-transitional.dtd for the html element:
<!ELEMENT html (head, body)>
<!ATTLIST html
%i18n;
id ID #IMPLIED
xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'
>
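The two items in parentheses are child elements that must be included, and the comma between them means they must appear in that order. If you see a question mark after an element listed in parentheses, it means that element is optional. A plus sign means at least one of that element must be included, and an asterisk means any number of them, including none.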
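To see the occurrence markers together, here is a minimal sketch of a document with its own tiny DTD. The recipe, title, note, step, and tip elements are made up for illustration and do not come from any W3C DTD.

```xml
<!DOCTYPE recipe [
  <!-- title: exactly one; note?: zero or one; step+: one or more; tip*: zero or more -->
  <!ELEMENT recipe (title, note?, step+, tip*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT note  (#PCDATA)>
  <!ELEMENT step  (#PCDATA)>
  <!ELEMENT tip   (#PCDATA)>
]>
<recipe>
  <title>Toast</title>
  <!-- note is optional, so it is left out here -->
  <step>Slice the bread.</step>
  <step>Toast it.</step>
  <!-- tip may appear any number of times, including not at all -->
</recipe>
```

A validating parser should accept this document. Removing both step elements, or moving title after them, should make it invalid.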
ATTLIST stands for attribute list. What follows is a list of attributes that this particular element can have. %i18n; is not a single attribute but a shorthand, declared elsewhere in the DTD, for the internationalization attributes lang, xml:lang, and dir. The first attribute listed by name is id, which is defined as type ID and #IMPLIED. #IMPLIED means the attribute is legal to include but not required. If it were required, it would say #REQUIRED.
An example of a #REQUIRED attribute would be the src attribute for the element img.
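As a rough illustration of the difference, the following sketch declares a cut-down img element inside a hypothetical figure wrapper. The declarations are simplified for this example and are not copied from the XHTML DTD, which gives img many more attributes.

```xml
<!DOCTYPE figure [
  <!ELEMENT figure (img)>
  <!ELEMENT img EMPTY>
  <!ATTLIST img
    id  ID    #IMPLIED
    src CDATA #REQUIRED
    alt CDATA #REQUIRED
  >
]>
<figure>
  <!-- id is #IMPLIED, so leaving it out is fine -->
  <!-- src and alt are #REQUIRED, so a validator should reject an img without them -->
  <img src="logo.png" alt="Company logo"/>
</figure>
```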
The next attribute you see is xmlns (XML namespace), whose type is given as %URI;. Anything preceded by a percent sign and followed by a semicolon is a parameter entity reference, which the parser replaces with text declared elsewhere in the DTD. Here, %URI; is replaced by CDATA. The #FIXED keyword that follows means the attribute can only ever have the one value given, 'http://www.w3.org/1999/xhtml'. In most other situations, a URI would not be fixed.
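The replacement can be sketched with a short fragment of an external DTD file. The entity declaration mirrors the one in the XHTML DTD, trimmed to the parts discussed here (the %i18n; line is left out to keep the fragment self-contained).

```xml
<!-- Declare the parameter entity: wherever %URI; appears, read CDATA -->
<!ENTITY % URI "CDATA">

<!ATTLIST html
  id    ID    #IMPLIED
  xmlns %URI; #FIXED 'http://www.w3.org/1999/xhtml'
>

<!-- After replacement the parser sees:
     xmlns CDATA #FIXED 'http://www.w3.org/1999/xhtml' -->
```

Note that a reference like %URI; inside a declaration is only allowed in an external DTD file, not in a DOCTYPE's internal subset.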
Two other abbreviations you may see are CDATA and PCDATA. The first, CDATA, means character data. In English, that means whatever string of characters you put there. For example, class CDATA #IMPLIED tells you that the class attribute can have character data as a value. On the other hand, PCDATA stands for parsed character data. This means not merely a string of characters, but text that the parser (in practice, the browser) scans for markup, such as entity references like &amp; that have to be interpreted. So you see things in a DTD like this: <!ELEMENT script (#PCDATA)>.
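A small sketch may help show where each keyword turns up: CDATA as the type of an attribute, #PCDATA as the content of an element. The single-element DTD below is a simplification written for this example.

```xml
<!DOCTYPE p [
  <!-- Element content: text only, scanned by the parser for entity references -->
  <!ELEMENT p (#PCDATA)>
  <!-- Attribute type: any string of characters -->
  <!ATTLIST p class CDATA #IMPLIED>
]>
<!-- The parser turns &amp; and &lt; in the text into & and < -->
<p class="intro note">Fish &amp; chips for &lt; 10 pounds.</p>
```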
Finally, you may see hyphens and the capital letter O (easy to mistake for a zero), mostly in the older SGML-based HTML 4 DTDs. For example, <!ELEMENT UL - - (LI)+>. The markers travel in pairs: the first describes the starting tag and the second the ending tag. A hyphen means the tag is required, and an O means it may be omitted. So - - means both a starting and ending tag are required, while - O means a starting tag is required, but the ending tag may be left out. So in the example, <!ELEMENT UL - - (LI)+>, a ul requires a starting and ending tag. But the br element, <!ELEMENT BR - O EMPTY>, takes no ending tag, because it is declared EMPTY and has no content to close.
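These tag-omission markers belong to SGML. XML does not allow tags to be omitted, so the XHTML DTDs drop the markers. For comparison, here is a sketch of how the same two elements read in XML form, with the HTML 4 versions in comments. The div wrapper and the li declaration are simplified stand-ins added so the example is a complete document.

```xml
<!DOCTYPE div [
  <!-- HTML 4 (SGML) form: <!ELEMENT UL - - (LI)+> -->
  <!ELEMENT ul (li)+>
  <!-- HTML 4 (SGML) form: <!ELEMENT BR - O EMPTY> -->
  <!ELEMENT br EMPTY>
  <!ELEMENT li (#PCDATA)>
  <!ELEMENT div (ul, br)>
]>
<div>
  <!-- Every element is closed explicitly -->
  <ul><li>first item</li></ul>
  <!-- An EMPTY element uses the self-closing form -->
  <br/>
</div>
```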