3. Assignment - XML Technologies - Winter Term 2015 (Release date: Nov 5 - Date due: Nov 11, 8:00 am)
Please CC your solutions to christian.grün@uni-konstanz.de, as Alex will be away this week.
1. Task - Unicode
The following binary octets represent a string encoded in UTF-8:
01010011 01101101 11000011 10110110 01110010 01100111 11000011 10100101 01110011
01100010 01101111 01110010 01100100 00100000 11110000 10011101 10000100 10011110
  1. Decode the string into a sequence of Unicode codepoints.
  2. Encode the codepoints back into octets using the UTF-16 encoding.
  3. Which string is encoded?
    Remark: Make sure you understand the UTF-8 and UTF-16 encoding schemes and are able to solve tasks 1. and 2. on paper without the help of a converter tool.
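The two encoding schemes named in the remark can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names utf8_decode and utf16_encode and the sample bit string are assumptions chosen for this sketch (the sample is deliberately not the string from the task), and checks for overlong forms and surrogate codepoints are omitted.

```python
def utf8_decode(octets):
    """Decode a list of byte values (0..255) into Unicode codepoints."""
    codepoints = []
    i = 0
    while i < len(octets):
        lead = octets[i]
        if lead < 0x80:                # 0xxxxxxx: 1 octet, 7 payload bits
            length, cp = 1, lead
        elif lead >> 5 == 0b110:       # 110xxxxx: 2 octets, 5 payload bits
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:      # 1110xxxx: 3 octets, 4 payload bits
            length, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:     # 11110xxx: 4 octets, 3 payload bits
            length, cp = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead octet at position {i}")
        tail = octets[i + 1:i + length]
        if len(tail) != length - 1:
            raise ValueError("truncated sequence")
        for cont in tail:
            if cont >> 6 != 0b10:      # continuation octets are 10xxxxxx
                raise ValueError("invalid continuation octet")
            cp = (cp << 6) | (cont & 0x3F)   # append 6 payload bits
        codepoints.append(cp)
        i += length
    return codepoints


def utf16_encode(codepoints):
    """Encode codepoints as a list of 16-bit code units (no BOM)."""
    units = []
    for cp in codepoints:
        if cp < 0x10000:               # BMP: one code unit, value unchanged
            units.append(cp)
        else:                          # supplementary plane: surrogate pair
            v = cp - 0x10000           # 20-bit value
            units.append(0xD800 | (v >> 10))    # high surrogate: top 10 bits
            units.append(0xDC00 | (v & 0x3FF))  # low surrogate: bottom 10 bits
    return units


# Hypothetical sample input, written as binary octets like in the task.
sample_bits = ("01000001 11000011 10101001 11100010 10000010 10101100 "
               "11110000 10011111 10011000 10000000")
sample_octets = [int(b, 2) for b in sample_bits.split()]

sample_codepoints = utf8_decode(sample_octets)
sample_units = utf16_encode(sample_codepoints)

print(" ".join(f"U+{cp:04X}" for cp in sample_codepoints))
print(" ".join(f"{u:04X}" for u in sample_units))
```

The sample covers one sequence of each length (1, 2, 3 and 4 octets), so the last codepoint lies outside the Basic Multilingual Plane and needs a surrogate pair in UTF-16.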
2. Task - XPath Rewritings
To speed up query execution, XPath location paths can be rewritten to drop redundant location steps or to favour operations that a specific implementation supports more efficiently. Find rewritings for the following queries that are equivalent to the original versions but only use forward axes (i.e., self, child, attribute, descendant, descendant-or-self, following-sibling, following) and attributes. Basing rewrites solely on abbreviations is not enough! (see lecture on XPath, slide 13)
Example: //jack[ancestor::john]//john/descendant::jack
  1. ./child::jamie/child::jerry/child::jeremie[text() = "joe"]/../..
  2. /descendant-or-self::node()/child::james
  3. ./child::node()/parent::node()
  4. ./child::jim/preceding-sibling::jack
  5. Why is it not possible to rewrite the following query?
    ./descendant::jason/preceding::jasper
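A rewriting can be checked experimentally by running both versions against a small test document and comparing the results. The sketch below assumes Python's xml.etree.ElementTree, which supports only a limited XPath subset (including the parent step ".." and simple child predicates); the document and the element names a, b, c are made up for illustration and are not taken from the queries above. Agreement on one document does not prove equivalence, but a mismatch disproves it.

```python
import xml.etree.ElementTree as ET

# Hypothetical test document: only some <a> elements have a <b> child.
doc = ET.fromstring(
    "<root>"
    "<a id='1'><b/><b/></a>"
    "<a id='2'><c/></a>"
    "<a id='3'><b/></a>"
    "</root>"
)

# Version with a reverse step: go down to b, then back up to its parent.
with_parent_step = doc.findall("./a/b/..")

# Forward-only version: the parent step becomes a predicate on a.
forward_only = doc.findall("./a[b]")

ids_reverse = [e.get("id") for e in with_parent_step]
ids_forward = [e.get("id") for e in forward_only]
print(ids_reverse, ids_forward)
```

The general idea is that a step which climbs back up after a downward step can often be expressed as a condition (predicate) on the node that is finally selected.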
3. Task - XQuery Semantics
Evaluate the following XQuery expressions and explain the results:
  1. <A>100</A> + <B><C>1</C><C>{ max((1,2)) }</C></B>
  2. (1, 2) > (3, 2, 1)
  3. ( ) != 1
  4. /descendant-or-self::a
  5. (0,2,1,2)[.]
  6. "foo"["bar"]
  7. <n/> eq <n> <n/> </n>
  8. ('a','bc') ! string-length(.) ! (. = 1)
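Several of the expressions above use general comparisons (=, !=, <, >, ...), which are existentially quantified over both operand sequences. A rough Python model of that rule is sketched below. The helper name general_compare is an assumption for this sketch, and it ignores atomization and type promotion, which a real XQuery processor performs first; the operands are deliberately different from those in the task.

```python
import operator


def general_compare(op, left, right):
    """True if at least one pair (l, r) from left x right satisfies op."""
    return any(op(l, r) for l in left for r in right)


# (1, 2) = (2, 3): true, because the pair (2, 2) matches.
eq_result = general_compare(operator.eq, [1, 2], [2, 3])

# (1, 2) != (1, 2): also true, because the pair (1, 2) differs.
ne_result = general_compare(operator.ne, [1, 2], [1, 2])

# () = (): false, because there is no pair at all.
empty_result = general_compare(operator.eq, [], [])

print(eq_result, ne_result, empty_result)
```

Note that under this rule = and != are not negations of each other: both can be true for the same operands, and both are false as soon as one operand is the empty sequence.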
Discussion of 3. Assignment - XML Technologies - Winter Term 2015