Relational Database


  A relational database is a database that conforms to the relational model, and refers to a database's data and schema (the database's structure of how that data is arranged). Common usage of the term "Relational database management system" technically refers to the software used to create a relational database, but sometimes mistakenly refers to a relational database.

Relations or tables

Main articles: Relation (mathematics) and Table (database)

A relation is defined as a set of tuples that all have the same attributes. This is usually represented by a table, which is organized into rows and columns. In a relational database, all of the data stored in a column should be in the same domain. In practice this means that values stored in a single column must all be of the same data type and conform to the same constraints.

The relational model specifies that the tuples of a relation should have no specific order and that the tuples, in turn, should impose no order on the attributes. In the relational model specific tuples are retrieved by using the projection operator over the relation. The relational selection operation is equivalent to the SQL SELECT query, possibly with a WHERE clause to limit results. In the relational model attributes must be referenced explicitly by name in all operations, while the SQL standard allows both unnamed columns in result sets and the shorthand asterisk (*) notation in queries.

The SQL standard requires columns to have a defined order. All data stored in a computer must have an order, as the memory of a computer is linear. Also, when the data is returned, there must be an order in which the data is returned because transfer protocols are also linear. Note, however, that in SQL the order of columns and rows returned in a result set is never guaranteed unless explicitly specified by the user.


Relvars

Main articles: Relvar and View (database)

A relvar is a "relation variable". In a relational database, all data are stored and accessed via relations. The data that are actually stored in the database are stored as relations. These relations are sometimes called "base relvars". This is equivalent to a "table". Other relvars do not have their data stored in them, but are a result of applying relational operations, to other relvars. These relvars are sometimes called "derived relvars", meaning that their information is derived from other sources. These are equivalent to "views" or "queries". Derived relvars are convenient in that though they may grab information from several relvars, it is presented externally as a single relvar for a simpler perspective. Also, it can be used as an abstraction layer.

Derived relvars are not always considered part of a relational database, partially because they are not essential to the functioning of the database.




Main article: data domain

A data domain (or usually just domain), is the set of possible values for a given attribute. Because it does constrain the values the data can hold, it could be considered a constraint, but because attributes must specify a domain, it could just be considered part of the relation's definition. Mathematically, a domain can be expressed as "all values for this attribute must be an element of the specified set."

In relational database implementations every data type has an associated data domain. The character data value 'ABC', for instance, is not part of the data domain for an integer attribute. The integer value '123', however, is part of the data domain for an integer attribute


Constraints



Main article: Constraint

Constraints are a way of providing restrictions on the kinds of data that can be stored in the relations. These are usually defined (formally) in the form of expressions that result in a boolean value, indicating whether or not the constraint holds. Constraints are a way of implementing business rules into the database.

Under the strictest sense, constraints are not considered part of the relational database, but because of the integral role that they play in organizing data, they are usually considered part of the database.

In practice, constraints provide users the ability to further restrict and refine the data domain for a given attribute. For instance, constraints can be used to restrict the data domain of an integer attribute to the values '1' to '10'.


Keys



Main article: Superkey

A tuple usually represents some object and its associated data, whether that object is a physical object or a concept. A key is a kind of constraint that requires that the object, or critical information about the object, is not duplicated. For example, a family might like to have a constraint such that no two people in the immediate family have the same name. If information about family members were stored in a database, a key could be placed over the family member's name. In a University, they have no such luxury. Each student is typically assigned a Student ID, which are used as keys for individual students stored in the school database. Keys can have more than one column, for example, a nation may impose a restriction that a province can't have two cities by the same name. So, when cities are stored in a relation, there would be a key defined over province and city name. This would allow for two different provinces to have a town called Springfield (because their province would be different), but not two cities with the same name in the same province. A key over more than one attribute is called a compound key. Theoretically, a key can even be over zero attributes. This would enforce that there cannot be more than one tuple in the relation.


Foreign keys

Main article: Foreign key

A foreign key is not a key by the previous definition. Rather, a foreign key is a reference to a key in another table, meaning that the referencing tuple has, as part of its attributes, the values of a key in the referenced tuple that corresponds to the relationship.

A foreign key could be described formally as "For all tuples in the referencing relation projected over the referencing attributes, there must exist a tuple in the referenced relation projected over those same attributes such that the values in each of the referencing attributes match the corresponding values in the referenced attributes".

Transition constraints

Main article: Transition constraint

A transition constraint is a way of enforcing that the data doesn't enter an impossible state because of a previous state. For example, it shouldn't be possible for a person to change from being "married" to being "single, never married". The only valid states after "married" might be "divorced", "widowed", or "deceased".


Other constraints

Other constraints of various different kinds can be created to enforce various kinds of business rules. They can be as simple as "the number of cars an individual owns must be non-negative" or complex patterns like "If the work that an employee performs is 'Hazardous Materials Transport' then that employee's age must be at least 18 years, and the employee's certifications must include 'Hazmat endorsement', and company insurance for that employee must include life insurance."

Stored procedures

Main article: Stored procedure

A stored procedure is executable code that is associated with the database. Stored procedures usually store how to perform common operations, like inserting a tuple into a relation, or gathering statistical information about usage patterns. Frequently they are used as an application programming interface (API) for security or simplicity. These are usually written as Imperative programming code extending the Data Definition Language and/or the Data Manipulation Language for the DBMS.

Stored procedures are not always considered part of a relational database, partly because they are not essential to the functioning of the database.


Indices

Main article: Index (database)

An index is a way of providing quicker access to the data in a relational database. Indices can be created on any combination of attributes on a relation. Then when tuples in a relation need to be looked up, similar to how a book's index works, the index can be accessed. Rather than having to check all of the tuples, the index tells the DBMS where the tuple is. Indices are usually implemented via B+ trees.

Indices are usually not considered part of the database, as they are considered an implementation detail, though indices are usually maintained by the same group that maintains the other parts of the database.


Relational operations

Main article: Relational algebra

Queries made against the relational database, and the derived relvars in the database are expressed in a relational calculus or a relational algebra. In his original relational algebra, Dr. Codd introduced eight relational operators in two groups of four operators each. The first four operators were based on the traditional mathematical set operations:
The union operator combines the tuples of two relations and removes all duplicate tuples from the result. The relational union operator is equivalent to the SQL UNION operator.
The intersection operator produces the set of tuples that two relations share in common. Intersection is implemented in SQL in the form of the INTERSECT operator.
The difference operator acts on two relations and produces the set of tuples from the first relation that do not exist in the second relation. Difference is implemented in SQL in the form of the EXCEPT or MINUS operator.
The cartesian product of two relations is a join that is not restricted by any criteria, resulting in every tuple of the first relation being matched with every tuple of the second relation. The cartesian product is implemented in SQL as the CROSS JOIN join operator.

The remaining operators proposed by Dr. Codd involve special operations specific to relational
The selection, or restriction, operation retrieves tuples from a relation, limiting the results to only those that meet a specific criteria, i.e. a subset of terms of set theory. The SQL equivalent of selection is the SELECT query statement with a WHERE clause.
The projection operation is essentially a selection operation in which duplicate tuples are removed from the result. The SQL GROUP BY clause, or the DISTINCT keyword implemented by some SQL dialects, can be used to remove duplicates from a result set.
The join operation defined for relational databases is often referred to as a natural join. In this type of join, two relations are connected by their common attributes. SQL's approximation of a natural join is the INNER JOIN join operator.
The relational division operation is slightly more complex operation, which involves essentially using the tuples of one relation (the dividend) to partition a second relation (the divisor). The relational division operator is effectively the opposite of the cartesian product operator (hence the name).

Other operators have been introduced or proposed since Dr. Codd's introduction of the original eight including relational comparison operators and extensions that offer support for nesting and hierarchical data, among others.

No comments: