DTD -- Document Type Definition. Used to define the structure of an XML document (i.e., its tags and the relationships between them).
An XML document that carries its own DTD normally begins with the XML declaration:
<?xml version="1.0"?>
This is followed by the document type:
<!DOCTYPE doc_type [
The doc_type is the name of the root element, i.e., the tag that opens and closes the document. Two examples are:
1. a doctype HTML begins with an <HTML> tag and ends with an </HTML> tag
2. a doctype XML begins with an <XML> tag and ends with an </XML> tag
A document is defined by one or more ELEMENTS:
<!ELEMENT element_name ( list-of-child-elements or data definition) >
The list-of-child-elements names all the child elements of element_name (i.e., it specifies their tag names) and states how often and in what order they may occur, using the following symbols:
, (a comma means strict order)
? (element is optional)
+ (one or more elements)
* (zero or more elements)
| (select one of the elements)
( ) (groups elements together)
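As an illustration of these symbols, here is a minimal sketch of a small document with its DTD. The memo structure and all of its element names (memo, to, cc, subject, body, para, item) are hypothetical and chosen only for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
<!-- commas: the children must appear in exactly this order -->
<!-- to+ is one or more, cc* is zero or more, subject? is optional -->
<!ELEMENT memo (to+, cc*, subject?, body) >
<!-- ( | )+ : one or more children, each either a para or an item -->
<!ELEMENT body (para | item)+ >
<!ELEMENT to (#PCDATA) >
<!ELEMENT cc (#PCDATA) >
<!ELEMENT subject (#PCDATA) >
<!ELEMENT para (#PCDATA) >
<!ELEMENT item (#PCDATA) >
]>
<memo>
<to>Sales</to>
<to>Shipping</to>
<body>
<para>Orders received this week:</para>
<item>Wool Sweater</item>
<item>Gloves</item>
</body>
</memo>
```

The memo above is valid even though it has no cc or subject, because * and ? both allow zero occurrences. Leaving out every to element, or placing body before to, would violate the declaration.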
Elements may have either:
more tags
OR
data
Data is normally:
#PCDATA
This means that the content of the ELEMENT (i.e., the value between the tag pairs) is parsed character data or PCDATA. The characters "<" and "&" cannot appear literally in PCDATA, and ">" is normally escaped as well. To include these characters as data use "&lt;" for <, "&gt;" for >, and "&amp;" for &.
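You can also mark data as CDATA, which is unparsed character data where the characters "<", ">", and "&" are allowed. Inside a document this is written as a CDATA section, <![CDATA[ ... ]]>. In the DTD itself, CDATA appears as an attribute type rather than as the content of an ELEMENT.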
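The two ways of writing the special characters can be compared side by side. The following is a minimal sketch; the element names notes, escaped and raw are hypothetical. Both child elements end up holding the same text.

```xml
<?xml version="1.0"?>
<!DOCTYPE notes [
<!ELEMENT notes (escaped, raw) >
<!ELEMENT escaped (#PCDATA) >
<!ELEMENT raw (#PCDATA) >
]>
<notes>
<!-- parsed character data: special characters written as entities -->
<escaped>quantity &lt; 10 &amp; price &gt; 5</escaped>
<!-- CDATA section: the same characters written literally -->
<raw><![CDATA[quantity < 10 & price > 5]]></raw>
</notes>
```

Note that raw is still declared as (#PCDATA) in the DTD; the CDATA section only changes how the text is written in the document.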
An element may have attributes:
<!ATTLIST name-of-element name-of-attribute attribute-type default >
where attribute-type is either:
CDATA
or
(list-of-attribute-values separated by |'s)
and default is one of:
#REQUIRED
#IMPLIED
#FIXED "value"
"default value"
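For illustration, a minimal sketch of an attribute list follows. The attribute names (sku, gift, note, currency) and their values are hypothetical and are not part of the purchase-order example.

```xml
<?xml version="1.0"?>
<!DOCTYPE order-line [
<!ELEMENT order-line (#PCDATA) >
<!ATTLIST order-line
  sku      CDATA      #REQUIRED
  gift     (yes | no) "no"
  note     CDATA      #IMPLIED
  currency CDATA      #FIXED "USD" >
]>
<!-- sku must be given; gift defaults to "no" when omitted;
     note may be left out; currency, if given, must be "USD" -->
<order-line sku="WS-100" gift="yes">Wool Sweater</order-line>
```

Several attributes of the same element can share one ATTLIST declaration, as here, or each can be given its own.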
A simple XML document with its DTD:
<?xml version="1.0"?>
<!DOCTYPE purchase-order [
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
<!ELEMENT buyer-name (#PCDATA) >
<!ELEMENT address (#PCDATA) >
<!ELEMENT city (#PCDATA) >
<!ELEMENT state (#PCDATA) >
<!ELEMENT zip (#PCDATA) >
<!ELEMENT order-line ( product, quantity, price) >
<!ELEMENT product (#PCDATA) >
<!ELEMENT quantity (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>
<purchase-order>
<buyer-name>Michael S. Parks</buyer-name>
<address>4099 Bayview Street</address>
<address>Apartment 5</address>
<city>Houston.</city>
<state>TX</state>
<zip>77001</zip>
<order-line>
<product>Wool Sweater</product>
<quantity>2</quantity>
<price>49.95</price>
</order-line>
<order-line>
<product>Gloves</product>
<quantity>1</quantity>
<price>19.95</price>
</order-line>
</purchase-order>
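In general, each "+" (or "*") in the DTD marks an element that can repeat, so the number of these markers along a branch indicates the number of 'for loops' that will be required to process that branch.
In the example above a "purchase-order" may contain one or more "address" elements and one or more "order-line" elements, so processing it takes one loop over the addresses and another over the order lines.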