Homework #2

Due: Thursday, Sept. 18

In your previous homework, you were asked to define a DTD and produce an XML document for a fake bookstore. Even though different bookstores should be very similar, the flexibility of XML allowed you to create very different DTDs. Assume now that you are a bookstore customer, and you want to compare bookstores. In order to compare them, you need to transform the data into a common format.

In this assignment, you are to write a program that takes as input the XML files of two different bookstores, and outputs the combined data in the format of a third DTD.

The two input files are:

You may assume that these files are valid with respect to their DTDs.

The output should be an XML file that is valid with respect to the following DTD:

Don't forget to reference this DTD as the external subset in the output file's DOCTYPE declaration, and use an absolute URL to identify its location.
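As a rough illustration of the DOCTYPE requirement, the sketch below serializes a small DOM tree through the standard JAXP identity transformer and sets `OutputKeys.DOCTYPE_SYSTEM` so that the output begins with a DOCTYPE declaration pointing at an absolute URL. The root element name `books` and the URL `http://www.example.org/dtds/combined.dtd` are placeholders invented for this example; substitute the root element and location of the actual output DTD. This is only one possible way to produce the output, since the assignment leaves that choice to you.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Sketch: write a DOM tree with a DOCTYPE that names an absolute DTD URL. */
class DoctypeOutputSketch {

    // Hypothetical values: the real root element name and DTD URL
    // must be taken from the output DTD given in the assignment.
    static final String ROOT_NAME = "books";
    static final String DTD_URL = "http://www.example.org/dtds/combined.dtd";

    /** Builds a placeholder document and returns it as XML text. */
    static String serialize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement(ROOT_NAME);
        doc.appendChild(root);

        // The identity transformer copies the DOM tree to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Setting a system identifier makes the serializer emit a DOCTYPE declaration.
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, DTD_URL);
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize());
    }
}
```

If you write the output with plain print statements instead, the same declaration has to be emitted by hand before the root element.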

Your program must use either the Xerces 2.5.0 SAX parser or the Xerces 2.5.0 DOM parser to read the input files. How you produce the output file is up to you. Your program must be capable of converting other XML files that use the same DTDs as the sample input files. However, you can assume that these files follow a consistent style. In particular, you can assume that no entity references are used in any of the files.

Note that before you write your program, you will have to determine which elements and attributes in the input files correspond to which elements and attributes in the output file. Sometimes an input file may not explicitly contain the required information, but it can be inferred from the file's structure or comments.
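To make the reading and mapping step more concrete, here is a minimal SAX sketch. It assumes a made-up input vocabulary (`item`, `name`, `cost`) that does not come from either input DTD, and it maps each `item` to a simple `Book` record holding a title and a price. The parser is obtained through the standard JAXP `SAXParserFactory`; with the Xerces jar on the classpath this will typically resolve to the Xerces parser, but that depends on your setup, and you can instead instantiate `org.apache.xerces.parsers.SAXParser` directly to be certain.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: read one bookstore format with SAX and map it to a common record. */
class BookSaxSketch {

    /** Common in-memory form for one book, independent of the input DTD. */
    static class Book {
        String title = "";
        String price = "";
    }

    /** Collects books from the hypothetical item/name/cost vocabulary. */
    static class ItemHandler extends DefaultHandler {
        final List<Book> books = new ArrayList<Book>();
        private Book current;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            text.setLength(0);               // start collecting fresh character data
            if ("item".equals(qName)) {
                current = new Book();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return;                      // not inside an item
            }
            if ("name".equals(qName)) {
                current.title = text.toString().trim();
            } else if ("cost".equals(qName)) {
                current.price = text.toString().trim();
            } else if ("item".equals(qName)) {
                books.add(current);
                current = null;
            }
        }
    }

    /** Parses the given XML text and returns the books found in it. */
    static List<Book> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ItemHandler handler = new ItemHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.books;
    }

    public static void main(String[] args) throws Exception {
        String sample = "<store><item><name>XML Basics</name>"
                + "<cost>29.95</cost></item></store>";
        for (Book b : parse(sample)) {
            System.out.println(b.title + " costs " + b.price);
        }
    }
}
```

One handler per input DTD, each filling the same `Book` record, keeps the format-specific mapping separate from the code that writes the combined output.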

Submission Instructions

This assignment is due by the beginning of class on Thursday, Sept. 18. Create a zip or tar file that contains both your source code (.java) and compiled (.class) files (but do not include any of the Xerces files in it). Send this file to heflin@cse.lehigh.edu with subject line "CSE 497 - Homework #2". Also print out your .java files, and turn them in during class. All of your files should be reasonably commented, including at least a descriptive comment for each class and method. Additionally, each file should have an initial comment that identifies you as its author.