SWI-Prolog SGML/XML parser

4 Stream encoding issues

The parser can handle ISO Latin-1 and UTF-8 encoded files. It selects the decoder from the encoding option passed to set_sgml_parser/2 or, for XML, from the encoding attribute of the XML header. The parser reads from SWI-Prolog streams, which perform encoding handling of their own, so there are two modes for parsing. If the SWI-Prolog stream has encoding octet (the default for binary streams), the decoder of the SGML parser is used and the positions reported by the parser are octet offsets in the stream. Otherwise, the Prolog stream decoder is used and offsets are counted in character codes.
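
The two modes can be selected by how the input stream is opened. The following is a minimal sketch, not a definitive recipe: the predicate names load_xml_octet/2 and load_xml_text/2 are hypothetical helpers introduced for illustration, and the choice of UTF-8 for the text stream is an assumption about the input file.

```prolog
:- use_module(library(sgml)).

% Hypothetical helper: open File as a binary stream. Binary streams
% have encoding octet by default, so the SGML parser does the decoding
% and reported positions are octet offsets.
load_xml_octet(File, DOM) :-
    setup_call_cleanup(
        open(File, read, In, [type(binary)]),
        load_structure(stream(In), DOM, [dialect(xml)]),
        close(In)).

% Hypothetical helper: open File as a text stream with an explicit
% encoding (UTF-8 is assumed here). The Prolog stream decoder is used
% and reported positions are character code counts.
load_xml_text(File, DOM) :-
    setup_call_cleanup(
        open(File, read, In, [encoding(utf8)]),
        load_structure(stream(In), DOM, [dialect(xml)]),
        close(In)).
```

For input that contains only ASCII characters the two kinds of offset coincide. They differ as soon as the input contains a character that is encoded in more than one octet.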