Apr 26, 2008

DTD DOCTYPE and ENTITY Keywords

DOCTYPE

This keyword specifies the DTD to be used by the document. It normally appears on the first line of an HTML or other markup language document. It also assigns a name to the set of element declarations that make up the DTD. An example from a typical HTML page is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

The second word, "HTML", gives the name of the root element of the document. The HTML element must therefore be the root element declared in the DTD at "http://www.w3.org/TR/html4/loose.dtd".
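
The same rule can be seen in a small XML document that carries its DTD inline. This is only an illustrative sketch: the element names note, to and body are made up for the example and do not come from any real DTD.

```xml
<?xml version="1.0"?>
<!-- The name after DOCTYPE ("note") must match the root element below -->
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Reader</to>
  <body>The root element name matches the DOCTYPE name.</body>
</note>
```

If the root element were renamed without also changing the DOCTYPE line, a validating parser would be expected to report the document as invalid.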

ENTITY

An entity is basically a name definition to be used in the DTD. The ENTITY keyword may either refer to an external file or assign a string value to a name; wherever that name is referenced, the string value is substituted for it. The word ENTITY is used by itself here, as a declaration keyword, rather than as an attribute type in an attribute list with the ATTLIST keyword, as outlined in the ATTLIST section of this document. An entity is much like a named string constant in most programming languages. There are two main types of entities:

  • General - After being declared, they are referenced using the & sign and a closing semicolon, such as "&entityname;".
  • Parameter - After being declared, they are referenced using the % sign and a closing semicolon, such as "%entityname;".

General Internal Parsed Entity

Format:

<!ENTITY Name "value">
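
As a minimal sketch of the format above, the following document declares a general internal entity and then references it in the content. The names memo and company, and the string value, are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!-- Declare a general internal parsed entity named "company" -->
  <!ENTITY company "Example Widgets, Inc.">
]>
<!-- A parser replaces &company; with the declared string -->
<memo>Welcome to &company;.</memo>
```

After parsing, the text of the memo element reads "Welcome to Example Widgets, Inc.".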

External Parsed Entity

The format of a general external parsed entity is:

<!ENTITY Name SYSTEM "Location">

The word SYSTEM introduces the URI (Uniform Resource Identifier) of the file associated with the name. The location is written in quotes and may be a local path or a complete URL (Uniform Resource Locator).

Example:

<!ENTITY extquote SYSTEM "Benquote.xml">
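
Once declared, the external entity is referenced like any other general entity. The sketch below assumes that Benquote.xml sits next to the document and contains well-formed content; the article element is a hypothetical name, and no element declarations are given, so the example is well-formed rather than valid.

```xml
<?xml version="1.0"?>
<!DOCTYPE article [
  <!-- The entity's replacement text is the content of Benquote.xml -->
  <!ENTITY extquote SYSTEM "Benquote.xml">
]>
<article>
  <!-- The file content is included here when the parser loads it -->
  &extquote;
</article>
```

Some parsers do not load external entities unless configured to do so, so the reference may be left unexpanded.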

External Unparsed Entity

The format of an unparsed entity uses the additional word, NDATA, to indicate that the referenced information is not parsed. The NotationName either describes the data format of the referenced file or references a program to be used to process the file. The notation name itself is defined by a NOTATION statement elsewhere in the DTD.

<!ENTITY Name SYSTEM "Location" NDATA NotationName>

For example:

<!ENTITY MyScript SYSTEM "Script1.pl" NDATA pl>

The line above references the notation name "pl" which may be defined elsewhere in the DTD using a notation statement:

<!NOTATION pl SYSTEM "/usr/bin/perl" >
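
An unparsed entity is not referenced with the & sign in content. Instead, its name is given as the value of an attribute declared with the type ENTITY. The sketch below shows one way this might look; the element names page and run and the attribute name script are hypothetical. The notation name after NDATA is written without quotes.

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
  <!NOTATION pl SYSTEM "/usr/bin/perl">
  <!ENTITY MyScript SYSTEM "Script1.pl" NDATA pl>
  <!ELEMENT page (run)>
  <!ELEMENT run EMPTY>
  <!-- An attribute of type ENTITY takes the name of an unparsed entity -->
  <!ATTLIST run script ENTITY #REQUIRED>
]>
<page>
  <!-- The value is the bare entity name, with no & or ; around it -->
  <run script="MyScript"/>
</page>
```

The parser does not read Script1.pl itself. It only passes the entity's location and notation to the application, which decides what to do with the file.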

Internal Parameter Entity

The internal parameter entity is the most common entity in the HTML DTD. Its format is:

<!ENTITY % Name "value">

Example:

<!ENTITY % treeparts "roots | trunk | branches | leaves">

Anywhere "%treeparts;" is referenced in the DTD after this entity declaration, the string "roots | trunk | branches | leaves" will be substituted. Other entity names may be used in the entity declaration.

A more complicated example from the HTML 4 transitional DTD is:

<!ENTITY % coreattrs 
"id     ID      #IMPLIED -- document-wide unique id --
 class  CDATA   #IMPLIED -- space-separated list of classes --
 style  %StyleSheet;   #IMPLIED -- associated style info --
 title  %Text;  #IMPLIED -- advisory title --"
 >

It is used later in the DTD:

<!ATTLIST BR
 %coreattrs; -- id, class, style, title --
 clear (left|all|right|none) none -- control of text flow -- 
 >

Note that in this example some other entities are also referenced. This means that the %coreattrs entity also contains the contents of the specified entities (variables) %StyleSheet and %Text along with ID and CDATA.
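
To make the substitution concrete, here is roughly what the BR attribute list looks like once every parameter entity has been expanded and the comments are dropped. This sketch assumes that %StyleSheet; and %Text; are both declared as CDATA, as I believe they are in the HTML 4.01 DTD; check the DTD itself before relying on this.

```xml
<!-- BR attribute list after expanding %coreattrs;, %StyleSheet; and %Text; -->
<!ATTLIST BR
 id     ID      #IMPLIED
 class  CDATA   #IMPLIED
 style  CDATA   #IMPLIED
 title  CDATA   #IMPLIED
 clear (left|all|right|none) none
 >
```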

If the entity is used to reference an external file, the contents of the file are substituted for the "%Name;" reference. Such an entity is declared in the same way as the external parsed entity above (with SYSTEM and a location, but without NDATA), except that a % sign goes between the word ENTITY and the entity name.
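
The following sketch puts the internal and external forms side by side in a hypothetical external DTD file. The file names tree.dtd and shared.ent and the entity name shared are assumptions made for the example. In XML, a parameter entity reference inside another declaration, such as the content model below, is allowed only in an external DTD file and not in a DTD written inline in the document.

```xml
<!-- tree.dtd: a hypothetical external DTD file -->

<!-- Internal parameter entity: the value is given inline -->
<!ENTITY % treeparts "roots | trunk | branches | leaves">

<!-- %treeparts; expands to the declared string inside the content model -->
<!ELEMENT tree (%treeparts;)*>
<!ELEMENT roots    (#PCDATA)>
<!ELEMENT trunk    (#PCDATA)>
<!ELEMENT branches (#PCDATA)>
<!ELEMENT leaves   (#PCDATA)>

<!-- External parameter entity: the value comes from another file -->
<!ENTITY % shared SYSTEM "shared.ent">

<!-- Referencing it pulls the declarations in shared.ent into this DTD -->
%shared;
```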
