Difference between revisions of "Databases"
From Suhrid.net Wiki
Jump to navigationJump to searchLine 97: | Line 97: | ||
* Query (expression) on a set of results produces relation as a result. | * Query (expression) on a set of results produces relation as a result. | ||
* Simplest query in RA is the name of the relation e.g. Student which produce the contents of the Student relation | * Simplest query in RA is the name of the relation e.g. Student which produce the contents of the Student relation | ||
+ | * Operators to filter, slice and combine relations. | ||
+ | * Select operator : Sigma (cond) Relation name. Condition is a filter and the result will be a subset of the relation | ||
+ | * Project operator : Select picks rows, project picks columns. Pi (col name) Relation Name. | ||
+ | * To pick both rows and cols, combine the select and project operator. e.g. Project(sid,name) (Select(gpa > 3.7) student) | ||
+ | * Cross Product : By default RA is based on sets and doesnt consider duplicates. But SQL uses multisets/bags and includes duplicates. | ||
+ | Cross product between two relations combines all attributes (columns) and every combination of tuples between the relation. m X n attrs and a X B tuples. |
Revision as of 06:08, 16 January 2014
Contents
Relational Databases
- Set of named relations (tables)
- Each relation has a set of named attributes (columns)
- Each tuple (row) has a value for each attribute
- Each attribute has a type (or domain)
- All queries return relations (tables), so it can said to be compositional or closed.
XML Data
- Alternative to relational model to store data
- Standard for data representation and exchange
- Document format similar to HTML, however tags represent content instead of formatting.
- XML consists of tagged elements which can be nested, attributes of elements and the text (values/data)
- XML document can be thought of as a tree with values/text forming the leaves.
Relational vs XML
Relational | XML | |
---|---|---|
Structure | Tables | Hierarchical/Tree |
Schema | Fixed in advance | Flexible, self-describing. e.g. all elements need not have same attributes. (In relational, we would need to have NULL values). Inconsistencies in the structure is NOT a problem. |
Queries | Relatively simple | Complicated compared to Relational / SQL. |
Ordering | Fundamentally UNORDERED. e.g. Order by clause is required in SQL in case order is desired. | Ordering is implied - e.g. the way the text is laid out in the XML document or in the Stream. |
Implementation | Mature - native implementation | Add - on. (Underlying implementation is relational) |
Well formed XML
- Single root element
- XML parser is used to validate XML.
- DOM parser - treats XML as a document, SAX parser - treats XML as a stream.
- Matched tags, proper nesting
- Unique attributes within each element
Displaying XML
- Since XML occurs a lot on the internet, a convenient way is required to display XML.
- Use rule based languages such as CSS, XSL to translate XML to HTML.
- XML document is fed to a CSS/XSL interpreter alongwith rules and a HTML document is generated.
Valid XML
- Valid XML should adhere to requirements for well formed XML
- Also should adhere to content-specific specifications : DTD/XSD
- DTD is a grammar-like language for specifying elements, attributes, nesting, ordering, #occurrences
- DTD/XSD offers the benefits of "typing" - can serve as a specification. Useful for data exchange.
- Having no DTD offers flexibility of data.
- DTD also can specify pointer types - an element can have an ID and another element can refer to it using an IDREF.
XSD
- Like DTD, but more comprehensive.
- XSD is written in XML itself.
- Everything in DTD is string, in XSD e.g., we can specify that an attribute should be of a specific type.
- Pointers can be typed, so we can enforce that a reference is made to a particular type and not just global like in DTD's
- Also specify min/max constraints on elements
JSON
- Like XML, JSON can be thought of as a data model - as an alternative to the relational data model.
- Useful for semi-structured data
- Designed originally for serializing data objects
- JSON is human readable - often used for data interchange
- JSON consists of base values, objects (label-value pairs), and arrays to hold either. Structure is recursive.
- JSON vs Relational - JSON differs in how data is stored, schema is self-describing, ordering is present, query support is minimal unlike XQuery, XPath, XSLT, and is implemented typically in prog languages
- JSON vs XML - XML more verbose/complex than JSON, like DTD, XSD, JSON has JSON schema.
- XML model dont match with programming languages. Interface is clunky. This is called impedence mismatch - similar to Relational and prog languages.
- JSON has a more direct mapping with prog languages.
Valid JSON
- Adheres to basic structural requirements of JSON. (label-value pairs, arrays of values)
- Can use JSON schema for semantic validation.
Relational Algebra
- Relational algebra is the formal language that is the underpinning for implemented languages like SQL
- Query (expression) on a set of results produces relation as a result.
- Simplest query in RA is the name of the relation e.g. Student which produce the contents of the Student relation
- Operators to filter, slice and combine relations.
- Select operator : Sigma (cond) Relation name. Condition is a filter and the result will be a subset of the relation
- Project operator : Select picks rows, project picks columns. Pi (col name) Relation Name.
- To pick both rows and cols, combine the select and project operator. e.g. Project(sid,name) (Select(gpa > 3.7) student)
- Cross Product : By default RA is based on sets and doesnt consider duplicates. But SQL uses multisets/bags and includes duplicates.
Cross product between two relations combines all attributes (columns) and every combination of tuples between the relation. m X n attrs and a X B tuples.