Thunderstone Software Document Search, Retrieval, and Management
Vortex Manual
 

DTDs, Entities, and Entity References

The xmlTree API has limited support for reading DTDs, geared towards using entities defined in an XML document. All DTD objects are xmlNodes with various type values, such as XML_DTD_NODE and XML_ENTITY_DECL.

It's possible for an XML document to have two different DTDs, an internal DTD and an external DTD. The XML_DTD_NODE object for each can be fetched with the xmlTreeGetInternalSubset() and xmlTreeGetExternalSubset() functions, respectively.

Entities are declared in the DTD, and when they are used in a document, XML_ENTITY_REF nodes are used to refer back to the entities. Consider the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE top [
<!ENTITY ts "Thunderstone Software, LLC.">
<!ELEMENT top (#PCDATA)>
]>
<top>ts is &ts;!</top>

This defines a single entity, ts, which is referenced in the element top. You don't have to deal with entities directly if you don't want to:

  • You can permanently substitute them when parsing with the XML_PARSE_NOENT option. This substitutes the entity's value in the tree, so in the previous example <top> will have only one text child, ts is Thunderstone Software, LLC.!, rather than an entity reference. See xmlTreeNewDocFromFile() and xmlTreeNewDocFromString() for more information.

  • Calling xmlTreeGetContent() on <top> performs the entity substitution for you and returns ts is Thunderstone Software, LLC.!. It can also be called with NO_INLINE to leave entity references in place. See xmlTreeGetContent() for more information.
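
The effect of substituting entities at parse time can be sketched outside Vortex. The following is a minimal illustration using Python's standard xml.dom.minidom module as a stand-in; it is not the xmlTree API, and the variable names are chosen only for this example. minidom expands internal entities while parsing, which is comparable in spirit to parsing with XML_PARSE_NOENT.

```python
from xml.dom import minidom

# The example document from this section.
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE top [
<!ENTITY ts "Thunderstone Software, LLC.">
<!ELEMENT top (#PCDATA)>
]>
<top>ts is &ts;!</top>
"""

doc = minidom.parseString(XML_DOC)

# The entity declaration is recorded on the document type node.
entity = doc.doctype.entities.getNamedItem("ts")
entity_value = entity.childNodes[0].data

# The reference in <top> has already been replaced by the entity's value,
# so only text remains under the element.
top = doc.documentElement
top_text = "".join(child.data for child in top.childNodes)

print(entity_value)
print(top_text)
```

In this sketch the entity reference is gone from the parsed tree, much like the first option above; the declaration itself is still available from the DTD.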

The example document would have the following hierarchical structure in the xmlTree API:

XML_DOC_NODE
  |  |
  |  |
  | XML_DTD_NODE
  |  |
  |  +-XML_ENTITY_DECL <------\
  |  |                        |
  |  \-XML_ELEMENT_DECL       |
  |                           |
  \-XML_ELEMENT_NODE <top>    |
    |                         |
    +-XML_TEXT_NODE "ts is "  |
    |                         |
    +-XML_ENTITY_REF          |
    |   |                     |
    |   \---------------------/
    |
    \-XML_TEXT_NODE "!"

The element <top> actually has three children: the entity reference and the two text nodes around it. The entity reference appears to have a child, which simply refers back to the entity declared in the DTD. Calling xmlTreeGetAllContent() on the XML_ENTITY_REF node will properly return the entity's contents.
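
The three-child layout can also be reproduced with a generic parser that leaves entity references unexpanded. The sketch below uses Python's standard xml.parsers.expat module rather than the xmlTree API; the function name children_with_entity_refs and the tuple labels are hypothetical, chosen for this illustration. It relies on expat's documented behavior that setting DefaultHandler turns off expansion of internal entities and passes the raw reference to that handler instead.

```python
import xml.parsers.expat

# The example document from this section.
XML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE top [
<!ENTITY ts "Thunderstone Software, LLC.">
<!ELEMENT top (#PCDATA)>
]>
<top>ts is &ts;!</top>
"""


def children_with_entity_refs(xml_bytes):
    """List the root element's children, keeping entity references unexpanded."""
    children = []
    depth = 0

    def start(name, attrs):
        nonlocal depth
        depth += 1

    def end(name):
        nonlocal depth
        depth -= 1

    def text(data):
        # Character data directly inside the root element.
        if depth == 1:
            children.append(("text", data))

    def default(data):
        # With DefaultHandler set, an unexpanded reference arrives as "&name;".
        if depth == 1 and data.startswith("&") and data.endswith(";"):
            children.append(("entity_ref", data[1:-1]))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each text run as a single chunk
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = text
    parser.DefaultHandler = default
    parser.Parse(xml_bytes, True)
    return children


print(children_with_entity_refs(XML_DOC))
```

The resulting list should mirror the diagram: a text item, an entity reference named ts, and a trailing text item. Resolving the reference back to its declared value is left out here; in xmlTree that lookup is what the XML_ENTITY_REF node's link to the XML_ENTITY_DECL provides.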

See the sample xmlTree09-DTD for an example of working with an XML document and DTD like this.


Copyright © Thunderstone Software     Last updated: Mon Feb 18 10:28:15 EST 2013
 