Encyclopedia > Relational model

  Article Content

Relational model

The relational model for database management is a data model based on predicate logic and set theory. Other models are the Hierarchical model and Network model, which predate the relational model in practice but not theory: they were defined a posteriori to describe existing databases after the relational model was defined. Some systems using these older architectures are still in use today in data centers with high data volume needs or where existing systems are so complex it would be cost prohibitive to migrate to systems employing the relational model; also of note are newer so-called object-oriented databases, even if many of them are rather DBMS-construction kits than DBMSs proper.

The relational model was invented by Ted Codd and subsequently maintained and developed by Chris Date and Hugh Darwen, as a general model of data. More recently they have published The Third Manifesto in which they show how the relational model can be extended with object oriented features without compromising its fundamental principles.

The standard language for relational databases, SQL, is only vaguely reminiscent of the mathematical model. Usually it is adopted, despite its restrictions, because it is far and away more popular than any other database language.

The fundamental assumption of the relational models is that all data is represented as mathematical relations, i.e., a subset of the Cartesian product of n sets. In the mathematical model (unlike SQL), reasoning about such data is done in two-valued predicate logic (that is, without a null value), meaning there are two possible evaluations for each proposition: either true or false. The data is operated upon by means of a relational calculus and algebra.

The relational data model permits the designer to create a consistent logical model of the information to be stored. This logical model can be refined through a process of database normalization. A database built on the pure relational model would be entirely normalized. The access plans and other implementation and operation details are handled by the DBMS engine, and should not be reflected in the logical model. This contrasts with common practice for SQL DBMSs in which performance tuning often requires changes to the logical model.

The basic relational building block is the domain, or data type. A tuple is a set of attributes, which are ordered pairs of domain and value. A relvar (relation variable) is an unordered set of ordered pairs of domain and name, which serves as the header for a relation. A relation is an unordered set of tuples. Although these relational concepts are mathematically defined, they correspond loosely to traditional database concepts. A relation is similar to the traditional concept of table. A tuple is similar to the concept of row.

The basic principle of the relational model is the Information Principle: all information is represented by data values. Thus, the relvars are not related to each other at design time: rather, designers use the same domain in several relvars, and if one attribute is dependent on another, this dependency is enforced through referential integrity.

Table of contents
1 Example: Customer Relation

Example Database

An idealized, very simple example of a description of some relvars and their attributes:

Customer(Customer ID, Tax ID, Name, Address, City, State, Zip, Phone)

Order(Order No,Customer ID, Invoice No,Date Placed, Date Promised, Terms, Status)

Order Line(Order No,Order Line No,Product Code, Qty)

Invoice(Invoice No,Customer ID, Order No, Date, Status)

Invoice Line(Invoice No,Line No,Product Code, Qty Shipped)

In this design we have five relvars: Customer, Order, Order Line, Invoice, and Invoice Line. The bold, underlined attributes are candidate keys. The non-bold, underlined attributes are foreign keys.

Usually one candidate key is arbitrarily chosen to be called the primary key and used in preference over the other candidate keys, which are then called alternate keys.

A candidate key is a unique identifier enforcing that no tuple will be duplicated; this would make the relation into something else, namely a bag, by violating the basic definition of a set. A key can be composite, that is, can be composed of several attributes. Below is a tabular depiction of a relation of our example Customer relvar; a relation can be thought of as a value that can be attributed to a relvar.

Example: Customer Relation

Customer ID     Tax ID              Name                 Adresss                 [More fields....]
==================================================================================================
1234567890      555-5512222         Jo Lee               323 Broadway            ...
2223344556      555-5523232         Dorothy Red          1200 Main Street        ...
3334445563      555-5533322         Linda de la Cruz     871 1st Street          ...
4232342432      555-5325523         E. F. Codd           123 It Way              ...

If we attempted to insert a new customer with the ID 1234567890, this would violate the design of the relvar since Customer ID is a primary key and we already have a customer 1234567890. The DBMS must reject a transaction such as this that would render the database inconsistent by a violation of an integrity constraint.

Foreign keys are integrity constraints enforcing that the value of the attribute set is drawn from a candidate key in another relation, for example in the Order relation the attribute Customer ID is a foreign key. A join is the operation that draws on information from several relations at once. By joining relvars from the example above we could query the database for all of the Customers, Orders, and Invoices. If we only wanted the tuples for a specific customer, we would specify this using a restriction condition.

If we wanted to retrieve all of the Orders for Customer 1234567890, we could query[?] the database to return every row in the Order table with Customer ID 1234567890 and join the Order table to the Order Line table based on Order No.

There is a flaw in our database design above. The Invoice relvar contains an Order No attribute. So, each tuple in the Invoice relvar will have one Order No, which implies that there is precisely one Order for each Invoice. But in reality an invoice can be created against many orders, or indeed for no particular order. Additionally the Order relvar contains an Invoice No attribute, implying that each Order has a corresponding Invoice. But again this is not always true in the real world. An order is sometimes paid through several invoices, and sometimes paid without an invoice. In other words there can be many Invoices per Order and many Orders per Invoice. This is a many-to-many relationship between Order and Invoice (also called a non-specific relationship). To represent this relationship in the database a new relvar should be introduced whose role is to specify the correspondence between Orders and Invoices:

OrderInvoice(Order No,Invoice No)

Now, the Order relvar has a one-to-many relationship to the OrderInvoice table, as does the Customer relvar. If we want to retrieve every Invoice for a particular Order, we can query for all orders where Order No in the Order relation equals the Order No in OrderInvoice, and where Invoice No in OrderInvoice equals the Invoice No in Invoice.

Database normalization is usually performed when designing a relational database, to improve the logical consistency of the database design.

There are two commonly used systems of diagramming to aid in the visual representation of the relational model: the entity-relationship diagram (ERD), which is very deficient (but more popular), and the IDEF diagram used in the IDEF1X method created by the U.S. Air Force based on ERDs.

References

  • Communications of the ACM, Vol. 13, No. 6, June 1970, pp. 377-387.
  • Introduction to Database Systems. Date, C. J. 7th ed. 1999.

External Links



All Wikipedia text is available under the terms of the GNU Free Documentation License

 
  Search Encyclopedia

Search over one million articles, find something about almost anything!
 
 
  
  Featured Article
North Lindenhurst, New York

... are 91.6 males. The median income for a household in the town is $59,022, and the median income for a family is $63,642. Males have a median income of $42,403 versus ...

 
 
 
This page was created in 21.9 ms