10. Assignment - XML Technologies - Winter Term 2015 (Release date: Jan 14 - Due date: Jan 20, 8:00 am)

In this assignment we take a closer look at two of the XML encodings presented in the lecture. The document doc.xml serves as the input for several of the exercises.

<A>
 <B><C/><D/><E><F/><G/></E></B>
 <H><I/><J/></H>
</A>
      
1. Task - Pre / Dist / Size
  1. Create the Pre/Dist/Size table for doc.xml.
  2. Show the state of the table after the application of the following XQuery Update statement: insert node <X1><X2/></X1> into //C
  3. If an update changes the number of nodes in the table, the Pre, Dist and Size values of certain tuples have to be adjusted. Take the inserted node <X1/> as the context node c. Using XPath axes relative to c, describe each of the three sets of tuples: those whose Pre value, those whose Size value and those whose Dist value changed as a result of the insertion of <X1/>.
  4. Node E is now deleted from the document that results from step 2. For which tuples do the Pre, Size and Dist values have to be adjusted as a consequence of the deletion?
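
As a way to check hand-built tables, the following is a minimal sketch of how a Pre/Dist/Size table could be computed. It rests on several assumptions that may differ from the lecture's conventions: only element nodes are encoded (no document, text or attribute nodes), Pre starts at 0, Dist is the difference between a node's Pre value and its parent's Pre value (0 for the root), and Size counts the node itself plus all its descendants. The function name pre_dist_size and the sample document are made up for illustration and are not doc.xml.

```python
import xml.etree.ElementTree as ET


def pre_dist_size(xml_text):
    """Return a list of (pre, dist, size, name) tuples in document order.

    Assumed conventions (not necessarily those of the lecture):
    elements only, pre starts at 0, root has dist 0,
    size = 1 + number of descendants.
    """
    rows = []

    def visit(elem, parent_pre):
        pre = len(rows)  # next free position in document order
        dist = 0 if parent_pre is None else pre - parent_pre
        rows.append([pre, dist, 1, elem.tag])  # size is fixed up below
        for child in elem:
            visit(child, pre)
        rows[pre][2] = len(rows) - pre  # node itself plus its descendants

    visit(ET.fromstring(xml_text), None)
    return [tuple(row) for row in rows]


# Hypothetical sample document, deliberately different from doc.xml.
table = pre_dist_size("<r><a><b/></a><c/></r>")
for row in table:
    print(row)
```

With these conventions, the parent of a tuple is found at Pre - Dist and the descendants of a tuple occupy the Pre range directly after it, which is why an insertion or deletion can force adjustments in other tuples.
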
2. Task - ORDPATH

The ORDPATH encoding (O’Neil et al., "ORDPATHs: Insert-Friendly XML Node Labels", SIGMOD 2004) is a hierarchical, dynamic prefix-labeling scheme inspired by the Dewey Decimal Classification. In contrast to Pre/Dist/Size, ORDPATH requires no adjustment of existing labels after the insertion or deletion of nodes.
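
To make the prefix-labeling idea concrete, here is a small sketch of an initial ORDPATH labeling. It assumes the usual convention from the ORDPATH paper that the root receives the label 1 and that the children of a node receive the odd ordinals 1, 3, 5, ... appended to the parent's label; even ordinals are reserved for later insertions and do not count as a level. Only element nodes are labeled, and the function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def initial_ordpath_labels(xml_text):
    """Return (name, label) pairs in document order; labels are int tuples."""
    labels = []

    def visit(elem, label):
        labels.append((elem.tag, label))
        for i, child in enumerate(elem):
            # Initial load uses odd ordinals only: 1, 3, 5, ...
            visit(child, label + (2 * i + 1,))

    visit(ET.fromstring(xml_text), (1,))  # assumed root label: 1
    return labels


def parent_label(label):
    """Derive the parent's label from a node label alone."""
    cut = len(label) - 1  # drop the node's own (odd) ordinal
    # Even components are insertion "carets", not levels: drop them too.
    while cut > 0 and label[cut - 1] % 2 == 0:
        cut -= 1
    return label[:cut]


def fmt(label):
    """Render a label tuple in dotted notation, e.g. (1, 3) -> '1.3'."""
    return ".".join(str(part) for part in label)


# Hypothetical sample document, deliberately different from doc.xml.
labels = initial_ordpath_labels("<r><a><b/></a><c/></r>")
for name, label in labels:
    print(name, fmt(label))
```

Because every label starts with the label of its parent, ancestor relationships can be decided from the labels alone, without consulting any other tuple.
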

  1. Create the initial labels for doc.xml.
  2. Apply the following XQuery Update statements sequentially and show the final state of the ORDPATH labels:
    • insert node <X3/> before //G
    • insert node <X1/> after //F
    • insert node <X2/> after //X1
  3. Based on the prefix tree on slide 14 of the XML Encoding lecture:
    • Determine the compressed bit pattern for the label 1.-353.21.-25
    • Determine the hierarchical label for the bit pattern: 0100111000000010100111100010000000101100001
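
The insertion behaviour exercised above can be sketched as follows. This is one simplified strategy for choosing a label for a new sibling, not necessarily the exact procedure from the paper or the lecture: it first tries an odd ordinal next to one of the neighbours and only falls back to "careting in" with an even ordinal when no such ordinal fits. Labels are represented as Python integer tuples, whose built-in comparison happens to coincide with document order; the function name label_between is made up for illustration.

```python
def label_between(parent, left, right):
    """Pick a label for a new child of `parent` between two adjacent siblings.

    `left` / `right` are the labels of the neighbouring siblings (tuples of
    ints) or None if the new node becomes the first / last child.
    Existing labels are never modified.
    """
    if left is None and right is None:
        return parent + (1,)  # first child of an empty parent
    if left is None:
        return right[:-1] + (right[-1] - 2,)  # new first child
    if right is None:
        return left[:-1] + (left[-1] + 2,)  # new last child

    # Try the next odd ordinal after the left neighbour.
    after_left = left[:-1] + (left[-1] + 2,)
    if after_left < right:
        return after_left
    # Try the previous odd ordinal before the right neighbour.
    before_right = right[:-1] + (right[-1] - 2,)
    if before_right > left:
        return before_right
    # No odd ordinal fits: caret in with the even ordinal in between.
    return left[:-1] + (left[-1] + 1, 1)


# Hypothetical example: a parent labeled 1 with children 1.1 and 1.3.
a, b = (1, 1), (1, 3)
x = label_between((1,), a, b)     # between 1.1 and 1.3
y = label_between((1,), a, x)     # between 1.1 and the node just inserted
z = label_between((1,), b, None)  # after the last child
w = label_between((1,), None, a)  # before the first child
print(x, y, z, w)
```

In this sketch the new labels sort correctly among the existing ones while 1.1 and 1.3 stay untouched, which illustrates the claim that insertions require no relabeling.
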
3. Task - Comparison

ORDPATH and Pre/Dist/Size obviously have very different qualities. The decision to use one or the other as the foundation for an XML database depends heavily on the use case and has a number of consequences. Based on your experience so far, give two advantages for each of the encodings.
