Difference between revisions of "Querying XML"
From Suhrid.net Wiki
Revision as of 02:08, 8 February 2014
Intro
- Not as mature as querying relational databases
- No underlying algebra
- XPath : Path expressions and conditions
- XSLT : XPath + Transformations, output processing
- XQuery : XPath + full featured query language
- XLink, XPointer : Use XPath as a component
XPath
- Think of XML as a tree
- Expressions in XPath as navigations down/across the tree with conditions
- / - root element when leading, otherwise the separator between steps. An element name matches elements with that tag, * is a wildcard, @ selects an attribute
- // - any descendant of the current element including self
- Conditions go in square brackets, e.g. [price < 50]. [] is also used for positional (array-style) access, with positions starting at 1.
- Many built-in functions : e.g. contains(s1, s2) : true/false. name() : returns element tag name
- Navigation axes : e.g. parent, following-sibling, descendant
- XPath queries operate on and return a sequence of elements, whether the input is an XML document or an XML stream
- Sometimes result of XPath query can be expressed in XML, but not always
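The navigation ideas above can be tried out with Python's standard-library xml.etree.ElementTree, which supports only a small XPath subset (child steps, *, //, [tag], [position]). This is a rough sketch, not a full XPath engine: the sample document is a hypothetical stand-in for Bookstore.xml, and the numeric condition is applied in plain Python because ElementTree cannot evaluate comparisons such as [@Price < 90].

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Bookstore.xml (the real file is not shown in these notes).
BOOKSTORE = """
<Bookstore>
  <Book ISBN="isbn-1" Price="85">
    <Title>Intro to Databases</Title>
    <Authors>
      <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
      <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
    </Authors>
  </Book>
  <Book ISBN="isbn-2" Price="100">
    <Title>Advanced Databases</Title>
    <Authors>
      <Author><First_Name>Hector</First_Name><Last_Name>Garcia-Molina</Last_Name></Author>
    </Authors>
    <Remark>Great reference</Remark>
  </Book>
  <Magazine Month="January">
    <Title>Database Monthly</Title>
  </Magazine>
</Bookstore>
"""

root = ET.fromstring(BOOKSTORE)  # root is the <Bookstore> element

# /Bookstore/Book/Title : child steps separated by "/"
book_titles = [t.text for t in root.findall("Book/Title")]

# /Bookstore/*/Title : "*" matches any element name
child_titles = [t.text for t in root.findall("*/Title")]

# //Title : Title elements at any depth
all_titles = [t.text for t in root.findall(".//Title")]

# /Bookstore/Book[@Price < 90]/Title : comparison done in Python (assumed workaround)
cheap_titles = [b.findtext("Title") for b in root.findall("Book")
                if float(b.get("Price")) < 90]

# /Bookstore/Book[Remark]/Title : existence condition
remarked_titles = [t.text for t in root.findall("Book[Remark]/Title")]

# //Authors/Author[2] : positions are 1-based
second_authors = [a.findtext("Last_Name") for a in root.findall(".//Authors/Author[2]")]

print(book_titles, child_titles, cheap_titles, remarked_titles, second_authors)
```

For the full XPath feature set (axes, contains(), count()), a real XPath or XQuery processor is needed; the subset here is only enough to see the tree navigation at work.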
Sample queries
- doc("Bookstore.xml")/Bookstore/Book/Title - returns titles of all books
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)/Title - titles of all books or magazines
- doc("Bookstore.xml")/Bookstore/*/Title - wildcard
- doc("Bookstore.xml")//Title - any Title element anywhere in the tree - Double slash
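- doc("Bookstore.xml")//* - Returns every element in the document, each with its full subtree: the whole tree for the root, then the subtree of each child, and so on
- doc("Bookstore.xml")/Bookstore/Book/data(@ISBN) - data() is needed to get the attribute values; a bare @ISBN yields attribute nodes, which cannot be output on their own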
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90] - Condition, price < 90 : Will print the whole book
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90]/Title - Above, but return only title.
- doc("Bookstore.xml")/Bookstore/Book[Remark]/Title - Existence condition, Book must have a remark element
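- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman" and Authors/Author/First_Name = "Jennifer"]/Title : Bigger condition. Each path comparison is its own "there exists", so this also matches a book where one author is named Ullman and a different author is named Jennifer. The AND is not applied to a single author.
- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author[Last_Name = "Ullman" and First_Name = "Jennifer"]]/Title : This is the correct one. Both tests sit inside one Author predicate, so they must hold for the same author.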
- doc("Bookstore.xml")//Authors/Author[2] - Return the second author element of each Authors subelement
- doc("Bookstore.xml")/Bookstore/Book[contains(Remark, "Great")]/Title : contains function
- doc("Bookstore.xml")//Magazine[Title=doc("Bookstore.xml")//Book/Title] : Self-join. Condition is satisfied if there is SOME element that meets it. Implicit existential quantification.
- doc("Bookstore.xml")/Bookstore//*[name(parent::*) != 'Bookstore' and name(parent::*) != 'Book'] : All elements whose parent element is neither Bookstore nor Book. * after parent:: says match any tag of the parent.
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)[Title = following-sibling::*/Title] : Books and magazines whose title also appears on a later sibling. To catch every item with a non-unique title, also compare against preceding-sibling.
- doc("Bookstore.xml")/Bookstore/(Book | Magazine)[Title = following-sibling::Book/Title] : Instead of star in the axes, we specify an element.
- doc("Bookstore.xml")//Book[count(Authors/Author[contains(First_Name, 'J')]) = count(Authors/Author/First_Name)] - Universal quantification (for all). Every author's First_Name contains 'J'.
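- doc("Bookstore.xml")/Bookstore/Book[@Price < 90 and Authors/Author[Last_Name = "Ullman"] and count(Authors/Author[First_Name = "Jennifer"]) = 0] : Similar trick, simulating "and first_name != 'Jennifer'". The count = 0 test requires that no author is named Jennifer; a plain != would be satisfied by any one author with a different first name.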
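The quantification pitfalls in the sample queries are easier to see when spelled out with any() and counts. The sketch below mimics the predicate semantics by hand in Python (it does not run XPath), on a made-up three-book document, and it leaves out the @Price condition to keep the focus on the author tests. All helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up data: Book A has Ullman and Jennifer as two different authors.
DOC = """
<Bookstore>
  <Book><Title>Book A</Title><Authors>
    <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
    <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
  </Authors></Book>
  <Book><Title>Book B</Title><Authors>
    <Author><First_Name>Jennifer</First_Name><Last_Name>Ullman</Last_Name></Author>
  </Authors></Book>
  <Book><Title>Book C</Title><Authors>
    <Author><First_Name>Hector</First_Name><Last_Name>Garcia-Molina</Last_Name></Author>
    <Author><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
  </Authors></Book>
</Bookstore>
"""
root = ET.fromstring(DOC)


def authors(book):
    return book.findall("Authors/Author")


def separate_paths(book):
    # Authors/Author/Last_Name = "Ullman" and Authors/Author/First_Name = "Jennifer"
    # Each comparison is existential on its own, possibly over different authors.
    return (any(a.findtext("Last_Name") == "Ullman" for a in authors(book))
            and any(a.findtext("First_Name") == "Jennifer" for a in authors(book)))


def same_author(book):
    # Authors/Author[Last_Name = "Ullman" and First_Name = "Jennifer"]
    return any(a.findtext("Last_Name") == "Ullman"
               and a.findtext("First_Name") == "Jennifer" for a in authors(book))


def every_first_name_has_j(book):
    # count(Authors/Author[contains(First_Name, 'J')]) = count(Authors/Author/First_Name)
    with_j = sum(1 for a in authors(book) if "J" in (a.findtext("First_Name") or ""))
    return with_j == len(book.findall("Authors/Author/First_Name"))


def ullman_without_jennifer(book):
    # Authors/Author[Last_Name = "Ullman"] and count(Authors/Author[First_Name = "Jennifer"]) = 0
    jennifers = sum(1 for a in authors(book) if a.findtext("First_Name") == "Jennifer")
    return any(a.findtext("Last_Name") == "Ullman" for a in authors(book)) and jennifers == 0


def titles(pred):
    return [b.findtext("Title") for b in root.findall("Book") if pred(b)]


print(titles(separate_paths))           # Book A slips in via two different authors
print(titles(same_author))
print(titles(every_first_name_has_j))
print(titles(ullman_without_jennifer))
```

The only difference between separate_paths and same_author is whether the two tests share one any(), which is exactly the difference between the two Ullman/Jennifer queries.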
XQuery
- XQuery is an expression language, also known as a compositional language.
- Like relational algebra : an expression over one type of data produces an answer of the same type of data.
- In relational model, type of data is relations. In XML, the type is "sequence of elements".
- The sequence can come from an XML document or an XML stream.
- XQuery uses XPath. Every XPath expression is an XQuery expression.
- Commonly used XQuery expression is the FLWOR expression :
  for $var in expr
  let $var := expr
  where condition
  order by expr
  return expr
- Everything is optional except the return clause
- The for and let clauses can be repeated multiple times and interleaved.
- Hardcoded XML can be mixed with query expressions, so literal elements that we want in the result can wrap the queried data.
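To get a feel for how the FLWOR clauses fit together, here is a loose Python analogue of one small query. This is only an illustration of the clause order, not how an XQuery processor works: the data, the CheapBook wrapper element, and the cheap_books helper are all made up, and the query in the comment is a sketch under those assumptions.

```python
import copy
import xml.etree.ElementTree as ET

# Sketch of the query being imitated (hypothetical, not from the notes):
#   for $b in doc("Bookstore.xml")/Bookstore/Book
#   let $t := $b/Title
#   where $b/@Price < 90
#   order by $t
#   return <CheapBook>{ $t }</CheapBook>

# Made-up stand-in for Bookstore.xml.
DOC = """
<Bookstore>
  <Book Price="85"><Title>Query Basics</Title></Book>
  <Book Price="120"><Title>Big Reference</Title></Book>
  <Book Price="40"><Title>Algebra Notes</Title></Book>
</Bookstore>
"""


def cheap_books(root, limit=90):
    selected = []
    for b in root.findall("Book"):              # for: iterate over the sequence
        t = b.find("Title")                     # let: bind a helper variable
        if float(b.get("Price")) < limit:       # where: filter
            selected.append(t)
    selected.sort(key=lambda t: t.text)         # order by
    results = []
    for t in selected:                          # return: hardcoded XML around query data
        wrapper = ET.Element("CheapBook")
        title_copy = copy.deepcopy(t)
        title_copy.tail = None                  # drop stray whitespace from the source
        wrapper.append(title_copy)
        results.append(ET.tostring(wrapper, encoding="unicode"))
    return results


result = cheap_books(ET.fromstring(DOC))
for line in result:
    print(line)
```

The CheapBook wrapper plays the role of the hardcoded XML in the return clause; dropping the filter or the sort corresponds to omitting the optional where or order by clause.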
