SWI-Prolog SGML/XML parser

7 External entities

An SGML document may refer to external data while it is being processed. This occurs in three places: external parameter entities, normal external entities and the DOCTYPE declaration. The current version of this tool deals rather primitively with external data. External entities can only be loaded from a file, and the mapping between entity names and files is defined by a catalog file in a format compatible with that used by James Clark's SP Parser, based on the SGML Open (now OASIS) specification.

Catalog files can be specified using two primitives: the predicate sgml_register_catalog_file/2 or the environment variable SGML_CATALOG_FILES (compatible with the SP package).

sgml_register_catalog_file(+File, +Location)
Register the indicated File as a catalog file. Location is either start or end and defines whether the catalog is considered first or last. This predicate has no effect if File is already part of the catalog.

If no files are registered using this predicate, the first query on the catalog examines SGML_CATALOG_FILES and fills the catalog with all files in this path.
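As a minimal sketch of the registration described above: the catalog path and the helper names my_catalog/1 and register_my_catalog/1 are hypothetical and only serve as illustration; sgml_register_catalog_file/2 is the predicate documented in this section.

```prolog
:- use_module(library(sgml)).

% Hypothetical catalog location (an assumption); adjust to your setup.
my_catalog('/usr/local/share/sgml/catalog').

% register_my_catalog(+Location)
% Location is `start` (consult this catalog first) or `end` (consult it last).
register_my_catalog(Location) :-
    my_catalog(File),
    sgml_register_catalog_file(File, Location).
```

Calling register_my_catalog(start) before parsing should make this catalog take precedence over catalogs registered earlier. Note that once a file is registered this way, SGML_CATALOG_FILES is no longer consulted on the first catalog query.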

This package uses two types of catalog file lines:

DOCTYPE doctype file
PUBLIC "Id" file

The specified file path is taken relative to the location of the catalog file. For the DOCTYPE declaration, library(sgml) first attempts to resolve the SYSTEM or PUBLIC identifier. If this fails, it tries to resolve the doctype using the provided catalog files.
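For illustration, a small catalog file using both line types might look as follows. The DTD file name is hypothetical; it would be looked up relative to the directory holding the catalog file.

```
PUBLIC "-//W3C//DTD HTML 4.01//EN" "html4/strict.dtd"
DOCTYPE html "html4/strict.dtd"
```

With such a catalog registered, a document that declares the public identifier above, or one whose doctype is html and whose identifiers cannot be resolved directly, would be mapped to the same DTD file.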

Strictly speaking, library(sgml) breaks the rules for XML, where system identifiers must be Uniform Resource Identifiers (URIs), not local file names. Simple uses of relative URIs will work correctly under UNIX and Windows.
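A sketch of the simple relative case, assuming a document that starts with `<!DOCTYPE note SYSTEM "note.dtd">` and a DTD file stored next to it. The names load_note/2 and note.dtd are hypothetical; load_structure/3 and the dialect(xml) option are part of library(sgml).

```prolog
:- use_module(library(sgml)).

% load_note(+XmlFile, -DOM)
% Parse XmlFile as XML. The system identifier "note.dtd" in its DOCTYPE
% is treated as a local file name; we assume it is found relative to
% XmlFile, in line with the "simple relative URI" case described above.
load_note(XmlFile, DOM) :-
    load_structure(XmlFile, DOM, [dialect(xml)]).
```

A system identifier that is a full URL, such as one starting with http://, is not covered by this sketch, since external entities can only be loaded from a file.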

In the future we will design a call-back mechanism for locating and processing external entities, so Prolog-based file-location and Prolog resources can be used to store external entities.