Easily maintain RDF metadata models

Use scripts developed with the Jena RDF API to manage Resource Description Framework models

Just as XML became a widely adopted standard for data exchange among software vendors, Resource Description Framework (RDF) is heading in the same direction for describing and interchanging metadata. XML describes data using a document type definition (DTD) or an XML Schema Definition (XSD). RDF uses XML syntax and RDF Schema (RDFS) to describe the metadata as a data model.

This article explains how to use custom utilities developed with the Jena RDF API for managing RDF models stored in either a relational database or a file. Developed by HP Labs, the Jena framework is an open source implementation of RDF, RDFS, and OWL (Web Ontology Language) and includes a rule-based inference engine. It provides a Java API for creating and manipulating RDF models. In this article, I introduce SemanticRDFUtils.bat, a script developed with Jena that includes several tasks for maintaining Jena RDF metadata models stored in a relational database or a flat file. This article also explains how to use Protégé for creating semantic RDF files that include the schema (.rdfs) and the data file (.rdf).

Software installation

The following software must be installed before using SemanticRDFUtils.bat. Links to these tools are included in Resources.

  • J2SE 1.3 or a more recent version
  • Jena 2.0
  • Oracle 9.2.0.1.0
  • Apache Ant 1.5.4 or a more recent version
  • Protégé 2.1

A quick look at RDF and RDFS files

The following XML listings show the RDF and RDFS files for a sample alphabet cross reference model. They were created using the Protégé 2.1 GUI tool. The RDF file can be used as input while running the scripts and RDF query tool. The RDFS file is useful when you work with Protégé to add more data to the RDF file.

Listing 1. RDFTest1.rdf

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rdf:RDF [
   <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <!ENTITY rdfs 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#'>
   <!ENTITY Maana 'http://www.vvasam.com/Maana#'>
]>
<rdf:RDF xmlns:rdf="&rdf;"
   xmlns:Maana="&Maana;"
   xmlns:rdfs="&rdfs;">
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_0"
   Maana:Name="A"
   Maana:value="65"
   rdfs:label="A:65">
   <Maana:system rdf:resource="&Maana;RDFTest_Instance_2"/>
</Maana:ASCII>
<Maana:System rdf:about="&Maana;RDFTest_Instance_1"
   Maana:Name="lowercase"
   rdfs:label="lowercase"/>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_10000"
   Maana:Name="b"
   Maana:value="98"
   rdfs:label="b:98">
   <Maana:system rdf:resource="&Maana;RDFTest_Instance_1"/>
</Maana:ASCII>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_10001"
   Maana:Name="B"
   Maana:value="66"
   rdfs:label="B:66">
   <Maana:system rdf:resource="&Maana;RDFTest_Instance_2"/>
</Maana:ASCII>
<Maana:AscXRef rdf:about="&Maana;RDFTest_Instance_10002"
   rdfs:label="b:98:B:66">
   <Maana:keyName rdf:resource="&Maana;RDFTest_Instance_10000"/>
   <Maana:keyValue rdf:resource="&Maana;RDFTest_Instance_10001"/>
</Maana:AscXRef>
<Maana:AscXRef rdf:about="&Maana;RDFTest_Instance_10005"
   rdfs:label="a:97:A:65">
   <Maana:keyValue rdf:resource="&Maana;RDFTest_Instance_0"/>
   <Maana:keyName rdf:resource="&Maana;RDFTest_Instance_8"/>
</Maana:AscXRef>
<Maana:System rdf:about="&Maana;RDFTest_Instance_2"
   Maana:Name="uppercase"
   rdfs:label="uppercase"/>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_8"
   Maana:Name="a"
   Maana:value="97"
   rdfs:label="a:97">
   <Maana:system rdf:resource="&Maana;RDFTest_Instance_1"/>
</Maana:ASCII>
</rdf:RDF>
                   

Listing 2. RDFTest1.rdfs

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rdf:RDF [
   <!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
   <!ENTITY system 'http://protege.stanford.edu/system#'>
   <!ENTITY Maana 'http://www.vvasam.com/Maana#'>
   <!ENTITY rdfs 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#'>
]>
<rdf:RDF xmlns:rdf="&rdf;"
   xmlns:system="&system;"
   xmlns:rdfs="&rdfs;"
   xmlns:Maana="&Maana;">
<rdf:Property rdf:about="&system;maxCardinality"
   rdfs:label="system:maxCardinality"/>
<rdf:Property rdf:about="&system;minCardinality"
   rdfs:label="system:minCardinality"/>
<rdf:Property rdf:about="&system;range"
   rdfs:label="system:range"/>
<rdfs:Class rdf:about="&Maana;ASCII"
   rdfs:label="ASCII">
   <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&Maana;AscXRef"
   rdfs:label="AscXRef">
   <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&Maana;Name"
   rdfs:label="Name">
   <rdfs:domain rdf:resource="&Maana;ASCII"/>
   <rdfs:domain rdf:resource="&Maana;System"/>
   <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;RDFTest_Slot_10003"
   rdfs:label="RDFTest_Slot_10003">
   <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdfs:Class rdf:about="&Maana;System"
   rdfs:label="System">
   <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&Maana;keyName"
   rdfs:label="keyName">
   <rdfs:range rdf:resource="&Maana;ASCII"/>
   <rdfs:domain rdf:resource="&Maana;AscXRef"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;keyValue"
   rdfs:label="keyValue">
   <rdfs:range rdf:resource="&Maana;ASCII"/>
   <rdfs:domain rdf:resource="&Maana;AscXRef"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;system"
   rdfs:label="system">
   <rdfs:domain rdf:resource="&Maana;ASCII"/>
   <rdfs:range rdf:resource="&Maana;System"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;value"
   rdfs:label="value">
   <rdfs:domain rdf:resource="&Maana;ASCII"/>
   <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
</rdf:RDF>
                   

Overview of Jena and Protégé

The following sections give a high-level overview of Jena and Protégé. You can get more detailed information on both products from the links in Resources. For this article's purposes, I expect you already have a good understanding of Jena and Protégé.

Jena RDF and RDQL

The RDF data model is a collection of statements, each consisting of three parts: a resource, a property, and a value. A resource is anything that can be identified by a URI, and it can have properties. Each property has a value. A property, a value, or even a whole statement can itself be a resource with its own properties and values.

Jena persists the RDF model in either a database or a file. RDQL is a query language for querying the RDF model. RDF provides a graph with directed edges where the nodes can be either resources or literals. RDQL offers a way of specifying a graph pattern that is matched against the graph to yield a set of matches. Figure 1 shows the RDF graph representation of the files shown in Listings 1 and 2.

Figure 1. RDF graph representation for the sample RDF file

An ellipse represents a resource, and a rectangle represents a literal. A resource (the subject) is linked to another resource or a literal (the object, or value) through an arc, or arrow (the predicate, or property). Together, the subject, predicate, and object form a triple, which is called a statement.
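To make the mapping from Listing 1 to statements concrete, here is a minimal sketch that uses only the JDK's DOM parser, not Jena. It reads one resource modeled on Listing 1 (with the namespace URIs written out in full) and prints one subject, predicate, object line per statement. The class name ListingToStatements and its statements method are hypothetical, and the sketch handles only the two constructs that appear in the listing: property attributes on a typed node and child elements carrying rdf:resource. A real RDF parser such as Jena's covers far more of the RDF/XML syntax.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; it is not part of Jena or the article's scripts.
class ListingToStatements {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String XMLNS = "http://www.w3.org/2000/xmlns/";

    // One resource modeled on Listing 1, with namespace URIs written out in full.
    static final String SAMPLE =
          "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:Maana='http://www.vvasam.com/Maana#'>"
        + "<Maana:ASCII rdf:about='http://www.vvasam.com/Maana#RDFTest_Instance_8'"
        + " Maana:Name='a' Maana:value='97'>"
        + "<Maana:system rdf:resource='http://www.vvasam.com/Maana#RDFTest_Instance_1'/>"
        + "</Maana:ASCII></rdf:RDF>";

    // Returns one "subject predicate object" line per statement found in the fragment.
    public static List<String> statements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> out = new ArrayList<String>();
            NodeList nodes = root.getChildNodes();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (nodes.item(i).getNodeType() != Node.ELEMENT_NODE) continue;
                Element node = (Element) nodes.item(i);
                String subject = node.getAttributeNS(RDF, "about");
                // The element name of a typed node supplies its rdf:type.
                out.add(subject + " " + RDF + "type "
                        + node.getNamespaceURI() + node.getLocalName());
                // Each non-RDF attribute is a property with a literal value.
                NamedNodeMap attrs = node.getAttributes();
                for (int a = 0; a < attrs.getLength(); a++) {
                    Attr attr = (Attr) attrs.item(a);
                    String ns = attr.getNamespaceURI();
                    if (ns == null || ns.equals(RDF) || ns.equals(XMLNS)) continue;
                    out.add(subject + " " + ns + attr.getLocalName()
                            + " \"" + attr.getValue() + "\"");
                }
                // Each child element is a property whose value is another resource.
                NodeList children = node.getChildNodes();
                for (int c = 0; c < children.getLength(); c++) {
                    if (children.item(c).getNodeType() != Node.ELEMENT_NODE) continue;
                    Element child = (Element) children.item(c);
                    out.add(subject + " " + child.getNamespaceURI() + child.getLocalName()
                            + " " + child.getAttributeNS(RDF, "resource"));
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the RDF/XML fragment", e);
        }
    }

    public static void main(String[] args) {
        for (String statement : statements(SAMPLE)) {
            System.out.println(statement);
        }
    }
}
```

For the sample resource, the sketch yields four statements: one for the type (Maana:ASCII), two with literal values (Name and value), and one pointing at the lowercase System resource.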

The query below is an example RDQL query. The triple (?x <http://www.vvasam.com/Maana#value> "97") in the query is a statement pattern. Here, ?x is a variable that binds to a resource; http://www.vvasam.com/Maana#value is a property named value; and 97 is the value of that property.

SELECT ?x WHERE (?x <http://www.vvasam.com/Maana#value> "97")

The Jena toolkit provides a Java class (jena.rdfquery) that can be executed from a command line to carry out RDQL queries. The following shows the results of executing the query shown above after saving it as test1.rdql.

java jena.rdfquery --data RDFTest1.rdf --query test1.rdql
x
================================================
http://www.vvasam.com/Maana#RDFTest_Instance_8
                   

Note: Browse through the links in Resources for more information on RDF and RDQL.
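The matching behind the query above can be sketched in plain Java without Jena. The following is only an illustration of the idea, not how Jena's RDQL engine is implemented: the TriplePatternSketch class, its Triple type, and the select method are hypothetical names, and the graph holds just a handful of the statements from Listing 1. The pattern fixes the predicate and the object, and every subject that fits is returned as a binding for ?x.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of single-triple pattern matching; not Jena's RDQL engine.
class TriplePatternSketch {
    /** One RDF statement: subject, predicate, and object kept as plain strings. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    static final String NS = "http://www.vvasam.com/Maana#";

    /** A few of the statements from Listing 1. */
    public static List<Triple> sampleGraph() {
        return Arrays.asList(
            new Triple(NS + "RDFTest_Instance_0", NS + "value", "65"),
            new Triple(NS + "RDFTest_Instance_10000", NS + "value", "98"),
            new Triple(NS + "RDFTest_Instance_10001", NS + "value", "66"),
            new Triple(NS + "RDFTest_Instance_8", NS + "value", "97"),
            new Triple(NS + "RDFTest_Instance_8", NS + "Name", "a"));
    }

    /** Returns every binding of ?x for the pattern (?x <predicate> "object"). */
    public static List<String> select(List<Triple> graph, String predicate, String object) {
        List<String> bindings = new ArrayList<String>();
        for (Triple t : graph) {
            if (t.predicate.equals(predicate) && t.object.equals(object)) {
                bindings.add(t.subject);
            }
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Equivalent in spirit to: SELECT ?x WHERE (?x <http://www.vvasam.com/Maana#value> "97")
        for (String x : select(sampleGraph(), NS + "value", "97")) {
            System.out.println(x);
        }
    }
}
```

Run against these sample statements, the sketch prints the same resource that jena.rdfquery reports, http://www.vvasam.com/Maana#RDFTest_Instance_8.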

RDF using Protégé

Protégé is a GUI tool for creating and editing ontologies and knowledge bases. Protégé can create and store data in RDF format. To create an RDF model in Protégé, the RDF Schema format must be selected when creating a project, as shown in Figure 2.

Figure 2. RDF Schema project

The Select Format box appears when New is selected from Protégé's Project menu. After clicking the OK button, the window in Figure 3 appears.

Figure 3. Default Protégé project screen

As you can see from Figure 3, Protégé has several tabs. In this article, I briefly discuss the Classes, Instances, and Algernon tabs.

Figure 4 shows Protégé's Save dialog box. It includes entries for typing the names of the project, the classes file, instances file, and namespace. As shown in Figure 4, Classes File Name contains RDF Schema information, and Instances File Name contains the RDF data. Namespace identifies the RDF model with a unique URI.

Figure 4. Protégé's Save dialog

Figures 5 and 6 show the Protégé Classes and Instances tabs, respectively, for the .rdf and .rdfs files shown in Listings 1 and 2. The files were created using Protégé's RDF Schema format.

Figure 5. Protégé class tab
Figure 6. Protégé instances tab

Algernon queries in Protégé

Protégé's Algernon query tab provides a UI for running Algernon queries and viewing the results. Algernon is a triple-based query language that returns resources based on the traversal path, as shown in Figure 7. By default, the Algernon query tab doesn't appear in the view. To display it, select it from the Configure submenu in the Project menu.

Figure 7. Algernon tab

Terminology mapping between Jena and Protégé

Jena and Protégé are two separate open source technologies, so they use different RDF terminology. The following table maps the terminology to make things easier when using them for creating and manipulating RDF documents.

Table 1. Jena and Protégé terminology comparison

Jena | Protégé | Comments
Resource | Class | The values of a resource's properties can be seen in the instances tab of Protégé.
Property | Slot |
RDF Model | RDF Project | Jena can manipulate .rdf files without a .rdfs file. But Protégé requires both .rdf and .rdfs files.

Jena semantic RDF utilities

This section explains the utility scripts for maintaining Jena database and file models. The script files are in the SemanticRDFUtils-scripts-files.zip file, which you can download from Resources. The list below describes the tasks you can perform using the scripts. When you execute the SemanticRDFUtils batch file without a task ID as a command-line parameter, the following appears on your console:

C:\RDF\SemanticRDFUtils
Usage: SemanticRDFUtils taskid
   Where taskid should be any one of the following:
    1 --> To create and initialize the Jena system tables with a system model name as JenaRDFSystem
    2 --> To create a database model
    3 --> To remove a database model.
    4 --> To list the contents of a given model.
    5 --> To import RDF/XML file to a database.
    6 --> To list existing database model names
    7 --> To export a database model to a RDF/XML file
    8 --> To delete all the contents of a database model
    9 --> To create a union(RDF/XML file) of RDF/XML file models
    10 --> To create an intersection(RDF/XML file) of RDF/XML file models
    11 --> To create a difference(RDF/XML file) of RDF/XML file models
    12 --> To get the size of the given model
    13 --> Export the RDF query results as RDF/XML file.
    14 --> Delete the resource(s) from a model based on RDF query.
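Tasks 9, 10, and 11 treat each RDF model as a set of statements. The sketch below shows what union, intersection, and difference mean in that sense, using plain Java sets of statement strings instead of Jena models. The ModelSetOpsSketch class and its methods are hypothetical and are not taken from the SemanticRDFUtils scripts; Jena's own Model interface offers union, intersection, and difference methods, but whether the scripts call them is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: each model is a set of "subject predicate object" strings.
class ModelSetOpsSketch {
    /** Statements that appear in either model. */
    public static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<String>(a);
        result.addAll(b);
        return result;
    }

    /** Statements that appear in both models. */
    public static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<String>(a);
        result.retainAll(b);
        return result;
    }

    /** Statements that appear in the first model but not in the second. */
    public static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<String>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> first = new LinkedHashSet<String>(Arrays.asList(
            "Maana:RDFTest_Instance_8 Maana:value \"97\"",
            "Maana:RDFTest_Instance_8 Maana:Name \"a\""));
        Set<String> second = new LinkedHashSet<String>(Arrays.asList(
            "Maana:RDFTest_Instance_8 Maana:Name \"a\"",
            "Maana:RDFTest_Instance_0 Maana:value \"65\""));

        System.out.println("union:        " + union(first, second));
        System.out.println("intersection: " + intersection(first, second));
        System.out.println("difference:   " + difference(first, second));
    }
}
```

Note that difference is not symmetric: swapping the two arguments gives the statements found only in the second model.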
                   

The SemanticRDFUtils script uses the SemanticRDFUtils.properties file to get configuration information. The following table shows all the properties you can configure in that file.

Table 2. Property configurations
