SWI-Prolog Semantic Web Library 3.0

9 Managing RDF input files

Complex projects require RDF resources from many locations and typically need to load these in different combinations, for example a small subset of the data for debugging or a different set of files for experimentation. The library library(semweb/rdf_library.pl) manages sets of RDF files spread over different locations, including file and network locations. The original version of this library supported metadata about collections of RDF sources in an RDF file called Manifest. The current version supports both the VoID format and the original format. VoID files (typically named void.ttl) can use elements from the RDF Manifest vocabulary to support features that are not supported by VoID.

9.1 The Manifest file

A manifest file is an RDF file, often in Turtle format, that provides meta-data about RDF resources. Often, a manifest will describe RDF files in the current directory, but it can also describe RDF resources at arbitrary URL locations. The RDF schema for RDF library meta-data can be found in rdf_library.ttl. The namespace for the RDF library format is defined as http://www.swi-prolog.org/rdf/library/ and abbreviated as lib.

The schema defines three root classes: lib:Namespace, lib:Ontology and lib:Virtual, which we describe below.

lib:Ontology
This is a subclass of owl:Ontology. It has two subclasses, lib:Schema and lib:Instances. These three classes are currently processed equally. The following properties are recognised on lib:Ontology:
    dc:title
    Title of the ontology. Displayed by rdf_list_library/0.
    owl:versionInfo
    Version of the ontology. Displayed by rdf_list_library/0.
    owl:imports
    Ontologies imported. If rdf_load_library/2 is used to load this ontology, the ontologies referenced here are loaded as well. There are two subProperties: lib:schema and lib:instances with the obvious meaning.
    lib:source
    Defines the named graph into which the resource is loaded. If this ends in a /, the basename of each loaded file is appended to the given source. Defaults to the URL the RDF is loaded from.
    lib:baseURI
    Defines the base for processing the RDF data. If not provided this defaults to the named graph, which in turn defaults to the URL the RDF is loaded from.
lib:Virtual
Virtual ontologies do not refer to an RDF resource themselves. They only import other resources. For example the W3C WordNet manifest defines wn-basic and wn-full as virtual resources. The lib:Virtual resource is used as a second rdf:type:
<wn-basic>
        a lib:Ontology ;
        a lib:Virtual ;
        ...
lib:CloudNode
Used by ClioPatria to combine this ontology and all data it imports into a node in the automatically generated datacloud.
lib:Namespace
Defines a URL to be a namespace. The definition provides the preferred mnemonic and can be referenced in the lib:providesNamespace and lib:usesNamespace properties. The rdf_load_library/2 predicate registers encountered namespace mnemonics with rdf-db using rdf_register_ns/2. Typically, namespace declarations use @prefix declarations, e.g.:
@prefix     lib: <http://www.swi-prolog.org/rdf/library/> .
@prefix    rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

[ a lib:Namespace ;
  lib:mnemonic "rdfs" ;
  lib:namespace rdfs:
] .
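
The effect of registering a namespace mnemonic can be pictured as a call to rdf_register_ns/2, after which prefixed names expand to full URLs. The following is a minimal sketch, not the library's own code. The prefix ex, the URL http://www.example.org/ns# and the predicate example_resource/1 are assumptions made for illustration only.

```prolog
:- use_module(library(semweb/rdf_db)).

% Hypothetical mnemonic and URL, chosen only for this illustration.
:- rdf_register_ns(ex, 'http://www.example.org/ns#').

% example_resource(-Resource)
% Expand the prefixed name ex:thing through the registered mnemonic.
example_resource(Resource) :-
    rdf_global_id(ex:thing, Resource).
```

Calling example_resource(R) binds R to the full URL formed by appending thing to the registered namespace.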

9.1.1 Support for the VoID and VANN vocabularies

VoID aims at resolving the same problem as the Manifest files described here. In addition, the VANN vocabulary provides information about preferred namespace prefixes. The RDF library manager can deal with VoID files. The following relations apply:

  • VoID Dataset and Linkset are similar to lib:Ontology, but a VoID resource is always Virtual. I.e., the VoID URI itself never refers to an RDF document.

  • The owl:imports and its lib specializations are replaced by void:subset (referring to another VoID dataset) and void:dataDump (referring to a concrete document).

  • A description of the dataset is given using dcterms:description rather than rdfs:comment.

  • The RDF library recognises lib:source, lib:baseURI and lib:CloudNode, which have no equivalent in VoID.

  • The RDF library recognises vann:preferredNamespacePrefix and vann:preferredNamespaceUri as alternatives to its proprietary way for defining prefixes. The domain of these predicates is unclear. The library recognises them regardless of the domain. Note that the range of vann:preferredNamespaceUri is a literal. A disadvantage of that is that the Turtle prefix declaration cannot be reused.

Currently, the RDF metadata is not stored in the RDF database. It is processed by low-level primitives that do not perform RDFS reasoning. In particular, this means that rdfs:subPropertyOf and rdfs:subClassOf cannot be used to specialise the RDF meta vocabulary.

9.1.2 Finding manifest files

The initial metadata file(s) are loaded into the system using rdf_attach_library/1.

rdf_attach_library(+FileOrDirectory)
Load meta-data on RDF repositories from FileOrDirectory. If the argument is a directory, this directory is processed recursively and, for each directory, a file named void.ttl, Manifest.ttl or Manifest.rdf is loaded (in this order of preference).

Declared namespaces are added to the rdf-db namespace list. Encountered ontologies are added to a private database of rdf_library.pl. Each ontology is given an identifier, derived from the basename of the URL without the extension. Thus, using the declaration below, the identifier of the declared ontology is wn-basic.

<wn-basic>
        a void:Dataset ;
        dcterms:title "Basic WordNet" ;
        ...
rdf_list_library
List the available resources in the library. Currently only lists resources that have a dcterms:title property. See section 9.2 for an example.
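
Attaching a directory and listing what was found can be combined in one helper. This is a minimal sketch: the directory name rdf and the predicate attach_and_list/0 are hypothetical and stand in for whatever location holds your metadata files.

```prolog
:- use_module(library(semweb/rdf_library)).

% 'rdf' is a hypothetical directory name; replace it with the
% directory that holds your void.ttl or Manifest.ttl files.
attach_and_list :-
    rdf_attach_library(rdf),   % scan the directory for metadata files
    rdf_list_library.          % print the identifiers that have a title
```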

It is possible for the initial set of manifests to refer to RDF files that are not covered by a manifest. If such a reference is encountered while loading or listing a library, the library manager will look for a manifest file in the directory holding the referenced RDF file and load this manifest. If a manifest is found that covers the referenced file, the directives found in the manifest will be followed. Otherwise the RDF resource is simply loaded using the current defaults.

Further exploration of the library is achieved using rdf_list_library/1 or rdf_list_library/2:

rdf_list_library(+Id)
Same as rdf_list_library(Id,[]).
rdf_list_library(+Id, +Options)
Lists the resources that will be loaded if Id is handed to rdf_load_library/2. See rdf_attach_library/1 for how ontology identifiers are generated. In addition it checks the existence of each resource to help debugging library dependencies. Before doing its work, rdf_list_library/2 reloads manifests that have changed since they were loaded the last time. For HTTP resources it uses the HEAD method to verify existence and last modification time of resources.
rdf_load_library(+Id, +Options)
Load the given library. First, rdf_load_library/2 establishes which resources need to be loaded and whether all resources exist. Then it loads the resources.
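
Because rdf_list_library/2 checks the existence of each resource, it can be useful to list a library before loading it. The sketch below assumes that metadata has already been attached with rdf_attach_library/1. The predicate name list_then_load/1 is hypothetical, and both calls pass an empty option list.

```prolog
:- use_module(library(semweb/rdf_library)).

% list_then_load(+Id)
% Print the resources behind library Id, then load them.
% Id is an ontology identifier as generated by rdf_attach_library/1.
list_then_load(Id) :-
    rdf_list_library(Id, []),
    rdf_load_library(Id, []).
```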

9.2 Usage scenarios

Typically, a project will use a single file using the same format as a manifest file that defines alternative configurations that can be loaded. This file is loaded at program startup using rdf_attach_library/1. Users can now list the available libraries using rdf_list_library/0 and rdf_list_library/1:

1 ?- rdf_list_library.
ec-core-vocabularies E-Culture core vocabularies
ec-all-vocabularies All E-Culture vocabularies
ec-hacks            Specific hacks
ec-mappings         E-Culture ontology mappings
ec-core-collections E-Culture core collections
ec-all-collections  E-Culture all collections
ec-medium           E-Culture medium sized data (artchive+aria)
ec-all              E-Culture all data

Now we can list a specific category using rdf_list_library/1. Note that this loads two additional manifests referenced by resources encountered in ec-mappings. If a resource does not exist, it is flagged using [NOT FOUND].

2 ?- rdf_list_library('ec-mappings').
% Loaded RDF manifest /home/jan/src/eculture/vocabularies/mappings/Manifest.ttl
% Loaded RDF manifest /home/jan/src/eculture/collections/aul/Manifest.ttl
<file:///home/jan/src/eculture/src/server/ec-mappings>
. <file:///home/jan/src/eculture/vocabularies/mappings/mappings>
. . <file:///home/jan/src/eculture/vocabularies/mappings/interface>
. . . file:///home/jan/src/eculture/vocabularies/mappings/interface_class_mapping.ttl
. . . file:///home/jan/src/eculture/vocabularies/mappings/interface_property_mapping.ttl
. . <file:///home/jan/src/eculture/vocabularies/mappings/properties>
. . . file:///home/jan/src/eculture/vocabularies/mappings/ethnographic_property_mapping.ttl
. . . file:///home/jan/src/eculture/vocabularies/mappings/eculture_properties.ttl
. . . file:///home/jan/src/eculture/vocabularies/mappings/eculture_property_semantics.ttl
. . <file:///home/jan/src/eculture/vocabularies/mappings/situations>
. . . file:///home/jan/src/eculture/vocabularies/mappings/eculture_situations.ttl
. <file:///home/jan/src/eculture/collections/aul/aul>
. . file:///home/jan/src/eculture/collections/aul/aul.rdfs
. . file:///home/jan/src/eculture/collections/aul/aul.rdf
. . file:///home/jan/src/eculture/collections/aul/aul9styles.rdf
. . file:///home/jan/src/eculture/collections/aul/extractedperiods.rdf
. . file:///home/jan/src/eculture/collections/aul/manual-periods.rdf

9.2.1 Referencing resources

Resources and manifests are located either on the local filesystem or on a network resource. The initial manifest can also be loaded from a file or a URL. This defines the initial base URL of the document. The base URL can be overruled using the Turtle @base directive. Other documents can be referenced relative to this base URL by exploiting Turtle's URI expansion rules. Turtle resources can be specified in three ways: as absolute URLs (e.g. <http://www.example.com/rdf/ontology.rdf>), as URLs relative to the base (e.g. <../rdf/ontology.rdf>) or following a prefix (e.g. prefix:ontology).

The prefix notation is powerful as we can define several prefixes and define resources relative to them. Unfortunately, prefixes can only be defined as absolute URLs or URLs relative to the base URL. Notably, they cannot be defined relative to other prefixes. In addition, a prefix can only be followed by a Qname, which excludes . and /.

Easily relocatable manifests must define all resources relative to the base URL. Relocation is automatic if the manifest remains in the same hierarchy as the resources it references. If the manifest is copied elsewhere (e.g. for creating a local version) it can use @base to refer to the resource hierarchy. We can point to directories holding manifest files using @prefix declarations. There, we can reference Virtual resources using prefix:name. Here is an example, where we first give some lines from the initial manifest followed by the definition of the virtual RDFS resource.

@base <http://gollem.science.uva.nl/e-culture/rdf/> .

@prefix base:           <base_ontologies/> .

<ec-core-vocabularies>
        a lib:Ontology ;
        a lib:Virtual ;
        dc:title "E-Culture core vocabularies" ;
        owl:imports
                base:rdfs ,
                base:owl ,
                base:dc ,
                base:vra ,
                ...
<rdfs>
        a lib:Schema ;
        a lib:Virtual ;
        rdfs:comment "RDF Schema" ;
        lib:source rdfs: ;
        lib:schema <rdfs.rdfs> .

9.3 Putting it all together

In this section we provide skeleton code for filling the RDF database from a password protected HTTP repository. The first line loads the application. Next we include modules that enable us to manage the RDF library, RDF database caching and HTTP connections. Then we set up the HTTP authentication, enable caching of processed RDF files and load the initial manifest. Finally, load_data/0 loads all our RDF data.

:- use_module(server).

:- use_module(library(http/http_open)).
:- use_module(library(semweb/rdf_library)).
:- use_module(library(semweb/rdf_cache)).

:- http_set_authorization('http://www.example.org/rdf',
                          basic(john, secret)).

:- rdf_set_cache_options([ global_directory('RDF-Cache'),
                           create_global_directory(true)
                         ]).


:- rdf_attach_library('http://www.example.org/rdf/Manifest.ttl').

%%      load_data
%
%       Load our RDF data

load_data :-
        rdf_load_library('all').
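
A loader of this kind can also report how much data ended up in the database. The following is a minimal sketch and not part of the skeleton above: the predicate load_and_report/1 is hypothetical, and it assumes the metadata has already been attached so that the identifier passed in is known to the library manager.

```prolog
:- use_module(library(semweb/rdf_db)).
:- use_module(library(semweb/rdf_library)).

% load_and_report(+Id)
% Load library Id and print the total number of triples
% that the RDF database holds afterwards.
load_and_report(Id) :-
    rdf_load_library(Id, []),
    rdf_statistics(triples(Count)),
    format("Loaded ~w; the database now holds ~D triples~n", [Id, Count]).
```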

9.4 Example: A metadata file for W3C WordNet

The VoID metadata below allows for loading WordNet in the two predefined versions using one of the following queries:

?- rdf_load_library('wn20-basic', []).
?- rdf_load_library('wn20-full', []).

@prefix    void: <http://rdfs.org/ns/void#> .
@prefix    vann: <http://purl.org/vocab/vann/> .
@prefix     lib: <http://www.swi-prolog.org/rdf/library/> .
@prefix     owl: <http://www.w3.org/2002/07/owl#> .
@prefix     rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix    rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix     xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix      dc: <http://purl.org/dc/terms/> .
@prefix   wn20s: <http://www.w3.org/2006/03/wn/wn20/schema/> .
@prefix   wn20i: <http://www.w3.org/2006/03/wn/wn20/instances/> .

[ vann:preferredNamespacePrefix "wn20i" ;
  vann:preferredNamespaceUri "http://www.w3.org/2006/03/wn/wn20/instances/"
] .

[ vann:preferredNamespacePrefix "wn20s" ;
  vann:preferredNamespaceUri "http://www.w3.org/2006/03/wn/wn20/schema/"
] .

<wn20-common>
        a void:Dataset ;
        dc:description "Common files between full and basic version" ;
        lib:source wn20i: ;
        void:dataDump
                <wordnet-attribute.rdf.gz> ,
                <wordnet-causes.rdf.gz> ,
                <wordnet-classifiedby.rdf.gz> ,
                <wordnet-entailment.rdf.gz> ,
                <wordnet-glossary.rdf.gz> ,
                <wordnet-hyponym.rdf.gz> ,
                <wordnet-membermeronym.rdf.gz> ,
                <wordnet-partmeronym.rdf.gz> ,
                <wordnet-sameverbgroupas.rdf.gz> ,
                <wordnet-similarity.rdf.gz> ,
                <wordnet-synset.rdf.gz> ,
                <wordnet-substancemeronym.rdf.gz> ,
                <wordnet-senselabels.rdf.gz> .

<wn20-skos>
        a void:Dataset ;
        void:subset <wnskosmap> ;
        void:dataDump <wnSkosInScheme.ttl.gz> .

<wnskosmap>
        a lib:Schema ;
        lib:source wn20s: ;
        void:dataDump
                <wnskosmap.rdfs> .

<wnbasic-schema>
        a void:Dataset ;
        lib:source wn20s: ;
        void:dataDump
                <wnbasic.rdfs> .

<wn20-basic>
        a void:Dataset ;
        a lib:CloudNode ;
        dc:title "Basic WordNet" ;
        dc:description "Light version of W3C WordNet" ;
        owl:versionInfo "2.0" ;
        lib:source wn20i: ;
        void:subset
                <wnbasic-schema> ,
                <wn20-skos> ,
                <wn20-common> .

<wnfull-schema>
        a void:Dataset ;
        lib:source wn20s: ;
        void:dataDump
                <wnfull.rdfs> .

<wn20-full>
        a void:Dataset ;
        a lib:CloudNode ;
        dc:title "Full WordNet" ;
        dc:description "Full version of W3C WordNet" ;
        owl:versionInfo "2.0" ;
        lib:source wn20i: ;
        void:subset
                <wnfull-schema> ,
                <wn20-skos> ,
                <wn20-common> ;
        void:dataDump
                <wordnet-antonym.rdf.gz> ,
                <wordnet-derivationallyrelated.rdf.gz> ,
                <wordnet-participleof.rdf.gz> ,
                <wordnet-pertainsto.rdf.gz> ,
                <wordnet-seealso.rdf.gz> ,
                <wordnet-wordsensesandwords.rdf.gz> ,
                <wordnet-frame.rdf.gz> .