3.6 Parsing Primitives
Availability: :- use_module(library(sgml)). (can be autoloaded)
set_sgml_parser(+Parser, +Option)
Sets attributes of the parser. Currently defined attributes:
file(File)
Sets the file for reporting errors and warnings. Sets the line to 1.
line(Line)
Sets the current line. This is useful for generating proper line numbers if the stream is not at the start of the (file) object.
linepos(LinePos)
Sets the notion of the current column in the source line.
charpos(Offset)
Sets the current character location. See also the file(File) option.
position(Position)
Set source location from a stream position term as obtained using stream_property(Stream, position(Position)).
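A minimal sketch of how the location options might be used when the input is an excerpt of a larger file. The predicate name located_parser/3, the file name notes.xml and the line number 42 are illustrative assumptions; get_sgml_parser/2 is used only to read the values back.

```prolog
:- use_module(library(sgml)).

% located_parser(+File, +Line, -Seen)
% Create a parser, tell it where its input originates and read the
% location back. File and Line are illustrative values.
located_parser(File, Line, seen(File1, Line1)) :-
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        (   set_sgml_parser(Parser, file(File)),  % also resets the line to 1
            set_sgml_parser(Parser, line(Line)),  % so the line is set afterwards
            get_sgml_parser(Parser, file(File1)),
            get_sgml_parser(Parser, line(Line1))
        ),
        free_sgml_parser(Parser)).
```

Because file(File) resets the line to 1, the sketch sets the line after the file. A call such as located_parser('notes.xml', 42, Seen) is expected to report the same file and line back.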
dialect(Dialect)
Set the markup dialect. Known dialects:
sgml
The default dialect is to process as SGML. This implies markup is case-insensitive and standard SGML abbreviations are allowed (abbreviated attributes and omitted tags).
html
html4
This is the same as sgml, but implies shorttag(false) and accepts XML empty element declarations (e.g., <img src="..."/>).
html5
In addition to html, accept attributes whose names start with data- without warning. This value initialises the charset to UTF-8.
xhtml
xhtml5
These document types are processed as xml. Dialect xhtml5 accepts attributes whose names start with data- without warning.
xml
This dialect is selected automatically if the processing instruction <?xml ...> is encountered. See section 3.3 for details.
xmlns
Process file as XML file with namespace support. See section 3.3.1 for details. See also the qualify_attributes option below.
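The effect of the dialect on case handling can be sketched as follows. The helper parse_text/3 and the predicate top_element_name/2 are hypothetical names introduced for this illustration; the sketch assumes open_string/2 is available to turn text into an input stream.

```prolog
:- use_module(library(sgml)).

% parse_text(+Text, +ParserOptions, -Content)
% Hypothetical helper: apply each option with set_sgml_parser/2 and
% parse Text into a list of content terms.
parse_text(Text, ParserOptions, Content) :-
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        setup_call_cleanup(
            open_string(Text, In),
            (   maplist(set_sgml_parser(Parser), ParserOptions),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)).

% top_element_name(+Dialect, -Name)
% Name of the first element found when parsing the same text under Dialect.
top_element_name(Dialect, Name) :-
    parse_text("<Note>hi</Note>", [dialect(Dialect)],
               [element(Name, _, _)|_]).
```

With dialect xml the name should come back case-preserved ('Note'), while the case-insensitive sgml dialect is expected to return the lowercase atom note. The SGML run may print warnings because no DTD is loaded.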
xmlns(+URI)
Set the default namespace of the outer environment. This option is provided to process partial XML content with proper namespace resolution.
xmlns(+NS, +URI)
Specify a namespace for the outer environment. This option is provided to process partial XML content with proper namespace resolution.
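A sketch of parsing partial XML content whose namespace declarations live in an enclosing document that is not part of the input. The URI http://example.org/ns, the prefix ex and the predicate names are assumptions made for this example, and parse_text/3 is a hypothetical helper.

```prolog
:- use_module(library(sgml)).

% parse_text(+Text, +ParserOptions, -Content)
% Hypothetical helper: apply each option with set_sgml_parser/2 and
% parse Text into a list of content terms.
parse_text(Text, ParserOptions, Content) :-
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        setup_call_cleanup(
            open_string(Text, In),
            (   maplist(set_sgml_parser(Parser), ParserOptions),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)).

% The fragment has no xmlns attribute; the default namespace is
% supplied from the outside.
fragment_in_default_ns(Content) :-
    parse_text("<b/>",
               [ dialect(xmlns),
                 xmlns('http://example.org/ns')
               ],
               Content).

% The fragment uses the prefix ex, which is declared from the outside.
fragment_with_prefix(Content) :-
    parse_text("<ex:b/>",
               [ dialect(xmlns),
                 xmlns(ex, 'http://example.org/ns')
               ],
               Content).
```

In both cases the element name should be resolved to a term of the form 'http://example.org/ns':b. The sketch sets dialect(xmlns) first so that namespace processing is active before the outer namespaces are declared.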
qualify_attributes(Boolean)
How to handle unqualified attributes (i.e., attributes without an explicit namespace) in XML namespace (xmlns) mode. The default, which is standard compliant, is not to qualify such attributes. If true, such attributes are qualified with the namespace of the element they appear in. This option is for backward compatibility, as this is the behaviour of older versions. In addition, the namespace document suggests unqualified attributes are often interpreted in the namespace of their element.
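To compare both settings, the following sketch parses the same element twice and returns the two attribute lists. The predicate name attribute_lists/2 and the URI are illustrative assumptions; no particular output is claimed here, the attribute lists are meant to be inspected at the toplevel.

```prolog
:- use_module(library(sgml)).

% attrs_with(+Boolean, -Attrs)
% Parse one namespaced element with qualify_attributes(Boolean) and
% return its attribute list.
attrs_with(Boolean, Attrs) :-
    Text = "<a xmlns='http://example.org/ns' id='1'/>",
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        setup_call_cleanup(
            open_string(Text, In),
            (   set_sgml_parser(Parser, dialect(xmlns)),
                set_sgml_parser(Parser, qualify_attributes(Boolean)),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)),
    Content = [element(_, Attrs, _)].

% attribute_lists(-Unqualified, -Qualified)
attribute_lists(Unqualified, Qualified) :-
    attrs_with(false, Unqualified),
    attrs_with(true, Qualified).
```

Running attribute_lists(U, Q) and comparing how the id attribute appears in U and Q shows the effect of the option.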
space(SpaceMode)
Define the initial handling of white-space in PCDATA. This attribute is described in section 3.2.
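As an illustration, the sketch below uses the remove mode, which is one of the modes described in section 3.2 and is not defined in this section. The predicate name compact_content/1 is hypothetical.

```prolog
:- use_module(library(sgml)).

% compact_content(-Content)
% Parse a small XML text with space(remove), so that text consisting
% only of white space between the elements is dropped.
compact_content(Content) :-
    Text = "<a> <b>x</b> </a>",
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        setup_call_cleanup(
            open_string(Text, In),
            (   set_sgml_parser(Parser, dialect(xml)),
                set_sgml_parser(Parser, space(remove)),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)).
```

With this mode the blanks around the inner element should not show up as text nodes, leaving an a element whose only child is the b element.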
number(NumberMode)
If token (default), attributes of type number are passed as a Prolog atom. If integer, such attributes are translated into Prolog integers. If the conversion fails (e.g. due to overflow) a warning is issued and the value is passed as an atom.
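The difference between the two modes can be sketched with a small SGML document that declares an attribute of type NUMBER in its internal DTD subset. The document text and the predicate name number_attribute/2 are assumptions made for this example.

```prolog
:- use_module(library(sgml)).

% number_attribute(+NumberMode, -Value)
% Parse a document whose attribute n is declared as NUMBER and return
% the value as delivered under NumberMode (token or integer).
number_attribute(NumberMode, Value) :-
    Text = "<!DOCTYPE a [ <!ELEMENT a - - (#PCDATA)> \c
            <!ATTLIST a n NUMBER #IMPLIED> ]><a n='42'>x</a>",
    setup_call_cleanup(
        new_sgml_parser(Parser, []),
        setup_call_cleanup(
            open_string(Text, In),
            (   set_sgml_parser(Parser, number(NumberMode)),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)),
    Content = [element(a, Attrs, _)],
    memberchk(n=Value, Attrs).
```

Following the description above, number_attribute(token, V) should bind V to the atom '42', while number_attribute(integer, V) should bind V to the integer 42.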
encoding(Encoding)
Set the initial encoding. The default initial encoding for XML documents is UTF-8 and for SGML documents ISO-8859-1. XML documents may change the encoding using the encoding= attribute in the header. Explicit use of this option is only required to parse non-conforming documents. Currently accepted values are iso-8859-1 and utf-8.
doctype(Element)
Defines the toplevel element expected. If a <!DOCTYPE declaration has been parsed, the default is the defined doctype. The parser can be instructed to accept the first element encountered as the toplevel using doctype(_). This feature is especially useful when parsing part of a document (see the parse option to sgml_parse/2).
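A sketch of parsing a document fragment against a DTD whose doctype differs from the fragment's outermost element. It assumes that dtd/2 from the same library can locate the HTML DTD shipped with the package; the predicate name parse_html_fragment/2 is hypothetical.

```prolog
:- use_module(library(sgml)).

% parse_html_fragment(+Text, -Content)
% Parse an HTML fragment using the HTML DTD, accepting whatever
% element comes first as the toplevel element.
parse_html_fragment(Text, Content) :-
    dtd(html, DTD),                       % assumed: cached HTML DTD object
    setup_call_cleanup(
        new_sgml_parser(Parser, [dtd(DTD)]),
        setup_call_cleanup(
            open_string(Text, In),
            (   set_sgml_parser(Parser, doctype(_)),
                sgml_parse(Parser, [source(In), document(Content)])
            ),
            close(In)),
        free_sgml_parser(Parser)).
```

For example, parse_html_fragment("<p>Hello <b>world</b></p>", Content) is expected to return the p element as the toplevel of Content, rather than wrapping it in the html and body elements implied by the DTD.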