Querying XML
From Suhrid.net Wiki
Revision as of 22:15, 8 February 2014
Intro
- Not as mature as querying relational databases
- No underlying algebra
- XPath : Path expressions and conditions
- XSLT : XPath + Transformations, output processing
- XQuery : XPath + full-featured query language
- XLink, XPointer : Use XPath as a component
XPath
- Think of XML as a tree
- Expressions in XPath as navigations down/across the tree with conditions
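- / - selects from the root and separates path steps; an element name matches child elements with that tag, * is a wildcard, @ selects an attribute
- // - any descendant of the current element, including the element itself
- Conditions go in square brackets, e.g. [price < 50]. A number in brackets, e.g. [2], selects by position, like array access (positions start at 1).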
- Many built-in functions : e.g. contains(s1, s2) : true/false. name() : returns element tag name
- Navigation axes : e.g. parent, following-sibling, descendant
- XPath queries operate on and return a sequence of elements, whether the input is an XML document or an XML stream
- Sometimes result of XPath query can be expressed in XML, but not always
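The path constructs above can be tried out in Python. This is only a rough sketch: the standard library's xml.etree.ElementTree implements a small subset of XPath (no numeric comparisons, functions or unions), paths are written relative to the root element rather than through doc(...), and the inline document is made-up data that reuses the element names from these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the bookstore data; element and attribute names
# follow the queries in these notes, the values are made up.
XML = """
<Bookstore>
  <Book ISBN="111" Price="85">
    <Title>First Course</Title>
    <Authors>
      <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
      <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
    </Authors>
  </Book>
  <Magazine Month="January"><Title>Weekly</Title></Magazine>
</Bookstore>
"""
root = ET.fromstring(XML)  # root is the <Bookstore> element

# Child steps separated by "/": Book/Title below the root.
book_titles = [t.text for t in root.findall("./Book/Title")]

# "*" wildcard: Title under any child of Bookstore.
all_titles = [t.text for t in root.findall("./*/Title")]

# "//": matching descendants at any depth.
last_names = [n.text for n in root.findall(".//Last_Name")]

# Attribute existence condition in square brackets, then read the attribute.
isbns = [b.get("ISBN") for b in root.findall("./Book[@ISBN]")]

# Positional predicate: the second Author of each Authors element (1-based).
second_authors = [a.findtext("Last_Name") for a in root.findall(".//Authors/Author[2]")]

print(book_titles, all_titles, last_names, isbns, second_authors)
```

Conditions such as [@Price < 90] are outside this subset, so they would have to be written as ordinary Python filters or run in a full XPath engine.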
Sample queries
 
- doc("Bookstore.xml")/Bookstore/Book/Title - returns titles of all books
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)/Title - titles of all books or magazines
- doc("Bookstore.xml")/Bookstore/*/Title - wildcard: titles under any child of Bookstore
- doc("Bookstore.xml")//Title - any Title element anywhere in the tree - Double slash
- doc("Bookstore.xml")//* - matches every element, so the output is the whole tree for the root, then the subtree of each child, and so on
- doc("Bookstore.xml")/Bookstore/Book/data(@ISBN) - data() extracts the attribute value; a bare attribute cannot be serialized as a result on its own
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90] - Condition, price < 90 : Will print the whole book
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90]/Title - Above, but return only title.
- doc("Bookstore.xml")/Bookstore/Book[Remark]/Title - Existence condition, Book must have a remark element
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman" and Authors/Author/First_Name = "Jennifer"]/Title : Bigger condition. Each author comparison is its own "there exists", so the last name and the first name may match different authors. The AND is not applied to a single author.
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author[Last_Name = "Ullman" and First_Name = "Jennifer"]]/Title : This is the correct one. Both comparisons are tested against the same Author element.
- doc("Bookstore.xml")//Authors/Author[2] - Return the second author element of each Authors subelement
- doc("Bookstore.xml")/Bookstore/Book[contains(Remark, "Great")]/Title : contains function
- doc("Bookstore.xml")//Magazine[Title=doc("Bookstore.xml")//Book/Title] : Self-join. Condition is satisfied if there is SOME element that meets it. Implicit existential quantification.
- doc("Bookstore.xml")/Bookstore//*[name(parent::*) != 'Bookstore' and name(parent::*) != 'Book'] : All elements whose parent element is neither Bookstore nor Book. * after parent:: says match any tag of the parent.
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)[Title = following-sibling::*/Title] : All books and magazines that have a non-unique title. Similarly, preceding-sibling.
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)[Title = following-sibling::Book/Title] : Instead of a star in the axis step, we specify an element name.
- doc("Bookstore.xml")//Book[count(Authors/Author[contains(First_Name, 'J')]) = count(Authors/Author/First_Name)] - Universal quantification (for all). Every author's First_Name contains 'J'.
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman" and count(Authors/Author[First_Name = "Jennifer"]) = 0] : Similar trick, simulating "and no author has First_Name 'Jennifer'". A plain First_Name != "Jennifer" would only require SOME author not named Jennifer.
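The difference between the two "Ullman and Jennifer" conditions, and the counting trick for "for all", can be spelled out in Python. This is a sketch on made-up data, not how an XPath engine works internally: ElementTree is used only to walk the tree, and the quantifiers are written out with any() and all().

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the two authors are "Jeffrey Ullman" and
# "Jennifer Widom", so no single author is "Jennifer Ullman".
XML = """
<Bookstore>
  <Book Price="85">
    <Title>First Course</Title>
    <Authors>
      <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
      <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
    </Authors>
  </Book>
</Bookstore>
"""
books = ET.fromstring(XML).findall("./Book")

def authors(book):
    return book.findall("./Authors/Author")

# Book[Authors/Author/Last_Name = "Ullman" and Authors/Author/First_Name = "Jennifer"]
# Each comparison is existential on its own, so different authors may satisfy them.
loose = [
    b.findtext("Title") for b in books
    if any(a.findtext("Last_Name") == "Ullman" for a in authors(b))
    and any(a.findtext("First_Name") == "Jennifer" for a in authors(b))
]

# Book[Authors/Author[Last_Name = "Ullman" and First_Name = "Jennifer"]]
# Both comparisons are tested against the same Author element.
strict = [
    b.findtext("Title") for b in books
    if any(
        a.findtext("Last_Name") == "Ullman" and a.findtext("First_Name") == "Jennifer"
        for a in authors(b)
    )
]

# The count(...) = count(...) trick amounts to a "for all" over First_Name.
every_j = [
    b.findtext("Title") for b in books
    if all("J" in (fn.text or "") for fn in b.findall("./Authors/Author/First_Name"))
]

print(loose, strict, every_j)
```

On this data the loose condition accepts the book while the strict one rejects it, which is exactly the pitfall described in the notes.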
XQuery
- XQuery is an expression language, also called a compositional language.
- Like relational algebra: an expression applied to a type of data returns an answer of the same type, so expressions can be composed.
- In relational model, type of data is relations. In XML, the type is "sequence of elements".
- Sequence can come from XML Document, XML Stream.
- XQuery uses XPath. Every XPath expression is an XQuery expression.
- A commonly used XQuery expression is the FLWOR expression (the keywords are lowercase in actual queries):
For $var in expr
Let $var := expr
Where condition
Order By expr
Return expr
- Every clause is optional except return, although the XQuery grammar also requires at least one for or let clause
- The for and let clauses can be repeated multiple times and interleaved.
- Possible to mix query language with hardcoded XML that we want in the result.
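As a rough analogy only, the clauses of a FLWOR expression map onto an ordinary Python loop. The sketch below runs on made-up data, and the XQuery fragments in the comments are illustrative rather than a query taken from these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical data reusing the element names from these notes.
XML = """
<Bookstore>
  <Book Price="100"><Title>Complete Book</Title></Book>
  <Book Price="85"><Title>First Course</Title></Book>
  <Book Price="50"><Title>Hands-On Guide</Title></Book>
</Bookstore>
"""
root = ET.fromstring(XML)

results = []
for b in root.findall("./Book"):       # for $b in .../Bookstore/Book
    price = float(b.get("Price"))      # let $p := $b/@Price
    if price < 90:                     # where $p < 90
        results.append((price, b.findtext("Title")))
results.sort()                         # order by $p

# return <Title>{ ... }</Title> : hardcoded XML mixed with computed values
titles = [f"<Title>{title}</Title>" for _, title in results]
print(titles)
```

The last step mirrors the point about mixing the query language with hardcoded XML: the tags are literal text and only the part that would sit in curly brackets is computed.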
XQuery examples
- Variable $b is bound to each of the Book elements in turn, as in a loop.
for $b in doc("BookstoreQ.xml")/Bookstore/Book
where $b/@Price < 90 and $b/Authors/Author/Last_Name = "Ullman"
return $b/Title
- The for clause is an iterator; the let clause is an assignment. This query finds all Price attributes in the database and assigns them to $plist as a list.
<Average>
  { let $plist := doc("BookstoreQ.xml")/Bookstore/Book/@Price
    return avg($plist) }
</Average>
- If we want something in the return block to be evaluated, then we need to put it in curly brackets - {$n}
for $n in distinct-values(doc("BookstoreQ.xml")//Last_Name)
return <Last_Name> {$n} </Last_Name>
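For comparison, the let/avg query and the distinct-values query can be approximated in Python. This is a sketch on made-up data. One assumption to flag: the Python version keeps names in first-seen order, whereas the order in which distinct-values returns its items may depend on the XQuery implementation.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical data reusing the element names from these notes.
XML = """
<Bookstore>
  <Book Price="100">
    <Authors>
      <Author><Last_Name>Ullman</Last_Name></Author>
      <Author><Last_Name>Widom</Last_Name></Author>
    </Authors>
  </Book>
  <Book Price="80">
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors>
  </Book>
</Bookstore>
"""
root = ET.fromstring(XML)

# let $plist := .../Bookstore/Book/@Price  return avg($plist)
plist = [float(b.get("Price")) for b in root.findall("./Book")]
average = f"<Average>{mean(plist)}</Average>"

# for $n in distinct-values(...//Last_Name) return <Last_Name>{$n}</Last_Name>
distinct = dict.fromkeys(n.text for n in root.iter("Last_Name"))  # drops duplicates
last_names = [f"<Last_Name>{n}</Last_Name>" for n in distinct]

print(average, last_names)
```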
- Existential quantification :
for $b in doc("BookstoreQ.xml")/Bookstore/Book
where some $fn in $b/Authors/Author/First_Name
         satisfies contains($b/Title, $fn)
return <Book>
          { $b/Title }
          { $b/Authors/Author/First_Name }
       </Book>
- Universal quantification :
for $b in doc("BookstoreQ.xml")/Bookstore/Book
where every $fn in $b/Authors/Author/First_Name
         satisfies contains($fn, "J")
return $b
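The some and every quantifiers correspond closely to Python's any() and all(). The sketch below uses made-up data and includes a book with no authors to show that, as with every over an empty sequence in XQuery, all() over nothing is true.

```python
import xml.etree.ElementTree as ET

# Hypothetical data reusing the element names from these notes.
XML = """
<Bookstore>
  <Book><Title>Jeffrey on Data</Title>
    <Authors><Author><First_Name>Jeffrey</First_Name></Author></Authors></Book>
  <Book><Title>Database Systems</Title>
    <Authors>
      <Author><First_Name>Hector</First_Name></Author>
      <Author><First_Name>Jennifer</First_Name></Author>
    </Authors></Book>
  <Book><Title>Anonymous Guide</Title><Authors/></Book>
</Bookstore>
"""
books = ET.fromstring(XML).findall("./Book")

def first_names(book):
    return [fn.text for fn in book.findall("./Authors/Author/First_Name")]

# where some $fn in $b/Authors/Author/First_Name satisfies contains($b/Title, $fn)
some_match = [
    b.findtext("Title") for b in books
    if any(fn in b.findtext("Title") for fn in first_names(b))
]

# where every $fn in $b/Authors/Author/First_Name satisfies contains($fn, "J")
every_j = [
    b.findtext("Title") for b in books
    if all("J" in fn for fn in first_names(b))
]

print(some_match, every_j)
```

The book without authors passes the universal test vacuously, so a query meant to exclude such books would need an extra existence condition.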
- Self-join
for $b1 in doc("BookstoreQ.xml")/Bookstore/Book
for $b2 in doc("BookstoreQ.xml")/Bookstore/Book
where $b1/Authors/Author/Last_Name = $b2/Authors/Author/Last_Name (: existential quantification :)
return
   <BookPair>
      <Title1> { data($b1/Title) } </Title1>
      <Title2> { data($b2/Title) } </Title2>
   </BookPair>
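The self-join can be mimicked with two nested loops in Python. This sketch, on made-up data, treats the = comparison between two sequences of last names as "some name appears in both". It also shows that the join as written pairs every book with itself and returns each pair in both orders. The extra title comparison used to drop those rows is an illustrative refinement, not part of the query above.

```python
import xml.etree.ElementTree as ET

# Hypothetical data reusing the element names from these notes.
XML = """
<Bookstore>
  <Book><Title>Complete Book</Title>
    <Authors>
      <Author><Last_Name>Ullman</Last_Name></Author>
      <Author><Last_Name>Widom</Last_Name></Author>
    </Authors></Book>
  <Book><Title>First Course</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors></Book>
  <Book><Title>Hands-On Guide</Title>
    <Authors><Author><Last_Name>Smith</Last_Name></Author></Authors></Book>
</Bookstore>
"""
books = ET.fromstring(XML).findall("./Book")

def last_names(book):
    return {n.text for n in book.findall("./Authors/Author/Last_Name")}

# for $b1 ... for $b2 ... where $b1/.../Last_Name = $b2/.../Last_Name
# A non-empty intersection means SOME last name occurs in both books.
pairs = [
    (b1.findtext("Title"), b2.findtext("Title"))
    for b1 in books
    for b2 in books
    if last_names(b1) & last_names(b2)
]

# Illustrative refinement: keep each unordered pair once and drop self-pairs.
distinct_pairs = [(t1, t2) for t1, t2 in pairs if t1 < t2]

print(len(pairs), distinct_pairs)
```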
