Homework #5: OWL Constraint Checker

Due: Thursday, Oct. 16

In this assignment, you will write a tool that takes as input a simple OWL ontology and an OWL data document, and does some basic checks to see if the data document obeys the constraints of the ontology. In order to keep this task manageable, we will only be concerned with checking the following constraints:

Furthermore, the input ontologies we use will not have:

When checking constraints, the tool should make both the unique names assumption and the closed world assumption. That is, you can assume that two resources with different URIs are distinct, and that all relevant data is specified in the data file. Therefore, if a property has a minCardinality of 2 and an instance has only one value for that property, that is a constraint violation that the tool should report. A particular class may have multiple constraints, and each should be checked for every instance of the class. (Note: although you should not usually make the unique names assumption and the closed world assumption with respect to the Semantic Web, it is okay for a particular application with known ontologies and data to do so.)
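
As a rough illustration of how the two assumptions affect a minCardinality check, here is a minimal sketch. It assumes the values of one property on one instance have already been extracted as strings (URIs or literal text); the class and method names are hypothetical and are not part of the provided files. It is written for Java 9 or later.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of OwlOnt.java, OwlClass.java, or OwlProperty.java.
final class CardinalityCheck {

    // Unique names assumption: different URIs denote different resources,
    // so the number of distinct strings is the number of distinct values.
    static int distinctValues(List<String> valueIds) {
        Set<String> distinct = new HashSet<>(valueIds);
        return distinct.size();
    }

    // Closed world assumption: a value that is not in the data file does not
    // exist, so having fewer than `min` values is reported as a violation.
    static boolean violatesMinCardinality(List<String> valueIds, int min) {
        return distinctValues(valueIds) < min;
    }
}
```

With this sketch, an instance that lists a single value for a property with a minCardinality of 2 is flagged, while one that lists two different URIs is not.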

Your program should be run as:

java Checker ontfile datafile

where ontfile is the pathname of an OWL ontology file, and datafile is the pathname of a file with OWL instance data.

The output of your program should be:

No constraints violated.

or a list of specific error messages. In the case of an instance that is of an undefined type, report an error message of the form:

ERROR: Undefined Type - Instance instance_id member of class class_id
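
One possible way to produce these messages is sketched below. It assumes the instance types have already been read from the data file into a map from instance ID to class IDs, and that the set of class IDs defined in the ontology is available; these data structures and all names are assumptions made for illustration (Java 9 or later).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the map and set stand in for whatever your parser builds.
final class TypeCheck {

    // Returns one error message for each (instance, type) pair whose type
    // is not a class defined in the ontology.
    static List<String> undefinedTypeErrors(Map<String, List<String>> typesByInstance,
                                            Set<String> definedClasses) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : typesByInstance.entrySet()) {
            for (String classId : entry.getValue()) {
                if (!definedClasses.contains(classId)) {
                    errors.add("ERROR: Undefined Type - Instance " + entry.getKey()
                            + " member of class " + classId);
                }
            }
        }
        return errors;
    }
}
```

Untyped instances never appear as keys in the map, so they are ignored by this check.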

In the case of the violation of a restriction, the message should report the ID of the instance in which the error occurs, the type of constraint that is violated, the class in which this constraint is specified, and the property on which the constraint is placed. For example:

ERROR: Property Restriction Violation
       Instance: band1
       Class: Band
       Property: hasMember
       Constraint: minCardinality = 2
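
A small formatting helper along the following lines could build this report. It is only a sketch: the class and method names are hypothetical, and the seven-space indent is copied from the sample above.

```java
import java.util.List;

// Hypothetical report helper; names are illustrative only.
final class ViolationReport {

    // Seven spaces, matching the indentation of the sample message.
    private static final String INDENT = "       ";

    // Builds the multi-line message for one violated restriction.
    static String format(String instanceId, String classId,
                         String propertyId, String constraint) {
        return String.join("\n",
                "ERROR: Property Restriction Violation",
                INDENT + "Instance: " + instanceId,
                INDENT + "Class: " + classId,
                INDENT + "Property: " + propertyId,
                INDENT + "Constraint: " + constraint);
    }

    // Returns the text to print once all checks are done: the success line
    // when nothing was reported, otherwise the collected messages.
    static String summarize(List<String> errors) {
        return errors.isEmpty() ? "No constraints violated." : String.join("\n", errors);
    }
}
```

Collecting messages in a list and printing them at the end makes it easy to decide between the error list and the "No constraints violated." line.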

Use of Jena

You must use Jena to parse the input files. You will need to use classes from the com.hp.hpl.jena.rdf.model package, and may find the com.hp.hpl.jena.vocabulary package useful as well. Do not directly use any other packages from the Jena distribution without my permission. In particular, you are forbidden to use Jena's ontology or reasoner components. Note that for Jena to run correctly, you will need to include the jena.jar, log4j-1.2.7.jar, and icu4j.jar files in your classpath.

Design Hints

Even with the simplifications specified above, this task can be challenging, so here is a suggested approach to solving this problem:

  1. Create a set of Java classes that can parse an ontology and store basic OWL class information, including superclasses and property restrictions. In order to help you out, I have provided three unfinished classes that you may use (OwlOnt.java, OwlClass.java, and OwlProperty.java). These classes provide basic data structures and some simple access methods. However, you will have to implement the methods for parsing an ontology from an OWL file, and eventually for testing if one OWL class is a subclass of another class (see Step 4 for the latter). You may modify these Java classes in any way you wish, and may also choose not to use them at all. In any case, be careful when trying to determine the superclasses of an OWL class. The property rdfs:subClassOf can be used either with a named class or with an anonymous owl:Restriction. These two forms should be treated differently by your application.
  2. Write code that reads a data file, determines what the instances are, and verifies that all types correspond to a Class in the ontology file. Print an error message for each instance that fails this test. You can ignore instances that are untyped.
  3. Since the cardinality and hasValue constraints should be the easiest to check, you should next write code that checks whether each typed instance identified in Step 2 obeys these constraints, and reports any violations (e.g., an instance has a property that doesn't have enough values, or has a property that doesn't include the value specified by an owl:hasValue restriction). Note that this must be done after Steps 1 and 2, because it requires the program to know the type of the instance and what constraints are applicable for that type (as specified in the ontology). Since the constraints of any superclasses of the type should also apply, be sure your program checks these as well.
  4. Write code to test if one class is a subClassOf another, whether implicitly or explicitly. Because we are restricting ourselves to very simple ontologies, we only need to look at the explicit rdfs:subClassOf relations and any transitive inferences that result (e.g., if A is a subclass of B, and B is a subclass of C, then we can conclude that A is also a subclass of C). Note that more complex ontologies would require sophisticated reasoning methods to determine all of the implicit subClassOf relations.
  5. Write code to check the owl:allValuesFrom and owl:someValuesFrom constraints. Note that you must take extra care when checking these kinds of constraints. An instance of a class is also an instance of all of its superclasses, so be sure to take this into account when you try to determine if the value of a property is of the type specified by the constraint.
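
The transitive subclass test described in Step 4 can be sketched as a simple graph search. This is one possible approach, not a required design: it assumes the explicit rdfs:subClassOf links to named classes have already been collected into a map from class ID to direct superclass IDs, treats every class as a subclass of itself, and uses hypothetical names throughout (Java 9 or later).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; `direct` stands in for the superclass data your parser stores.
final class SubclassCheck {

    // Returns true if `sub` equals `sup` or reaches it by following
    // explicit rdfs:subClassOf links any number of times.
    static boolean isSubClassOf(Map<String, Set<String>> direct, String sub, String sup) {
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(sub);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(sup)) {
                return true;
            }
            if (!visited.add(current)) {
                continue; // already expanded; this also guards against cycles
            }
            for (String parent : direct.getOrDefault(current, Set.of())) {
                pending.push(parent);
            }
        }
        return false;
    }
}
```

With A a direct subclass of B and B a direct subclass of C, this reports A as a subclass of C but not C as a subclass of A.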

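For Step 5, the two value-type checks might look like the sketch below. It assumes that, for each property value, the set of all classes it belongs to (its declared types plus all of their superclasses) has already been computed; that map and every name here are hypothetical. Under the closed world assumption, a value with no known types is treated as not belonging to the required class (Java 9 or later).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; `typesOf` maps a value ID to its declared types
// plus all superclasses of those types.
final class ValueTypeCheck {

    static boolean hasType(Map<String, Set<String>> typesOf, String valueId, String classId) {
        return typesOf.getOrDefault(valueId, Set.of()).contains(classId);
    }

    // owl:allValuesFrom: every value must belong to the class.
    // An instance with no values for the property satisfies this trivially.
    static boolean satisfiesAllValuesFrom(List<String> values, String classId,
                                          Map<String, Set<String>> typesOf) {
        for (String value : values) {
            if (!hasType(typesOf, value, classId)) {
                return false;
            }
        }
        return true;
    }

    // owl:someValuesFrom: at least one value must belong to the class.
    // An instance with no values for the property fails this check.
    static boolean satisfiesSomeValuesFrom(List<String> values, String classId,
                                           Map<String, Set<String>> typesOf) {
        for (String value : values) {
            if (hasType(typesOf, value, classId)) {
                return true;
            }
        }
        return false;
    }
}
```

Note the difference on an empty value list: allValuesFrom is satisfied, while someValuesFrom is violated.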

Submission Instructions

This assignment is due by the beginning of class on Thursday, Oct. 16. Create a zip or tar file that contains both your source code (.java) and compiled (.class) files (but do not include any of the Jena files in it). If you used the three files I provided, make sure they are included. Also, if you use a .BAT file or some other form of script to compile and run your program, please include this as well. Send the combined file to heflin@cse.lehigh.edu with subject line "CSE 497 - Homework #5". Also print out your .java files, and turn them in during class. All of your files should be reasonably commented, including an initial comment that identifies you as the author and descriptive comments for each class and method.