In the last article, we drew blueprints of approximate data with ERD. Now, it is time to make a frame to put the data on the DBMS. The third step in data modeling, logical data modeling, begins by creating relational models.
■ Relations
What comes to your mind when you say ‘handle data’? Many would think of Microsoft Excel or a table.
A relation is a database term that refers to a two-dimensional table where data is stored. It’s very similar to the format we usually use, but some names are different, so let’s take a look.
1. Column
That is what we commonly call ‘fever’. Another term is fields or attributes.
2. Record
Often referred to as a ‘row’ is a record (or tuple) in a database.
3. Instance
The actual data values stored in Relation are instances.
4. Schema
The attribute ‘header’, which represents a column, is a schema.
■ Features of the relay
1. Atomicity of Columns
Column must have different names. In addition, each column value must have only the value defined in the domain, all of which must be a single value. In other words, multiple values should not be included in the Name column, such as ‘David, Thomas’.
2. No Ordering of Columns
There is no particular meaning in the order of the columns.
3. Uniqueness of Records
Records must not be duplicated within the Relation. For example, some customer’s information is as follows: Customer ID: 2 / Name: Lucy Hwang / Phone number: 010-2222-2222 / Address: Rheinstreet 555 There must be only one row with this information.
4. No Ordering of Records
There is no particular meaning in the order of records.
■ Key
The key is the only value that can identify the record in the relation. Depending on uniqueness and irreducibility, the key is divided into four types:
1. Super Key
Super Key satisfies uniqueness. In other words, you can find a certain record by using a single key value. Customer ID is an example Super Key. You can even combine two attributes, such as a name and address, to make it the Super Key.
2. Candidate Key
Candidate key satisfies both uniqueness and irreducibility. Irreducibility means having minimal attributes. Let’s compare [Name+Address] with [Customer ID]. Because there is only one attribute in [Customer ID], it is a Candidate Key which satisfied with both uniqueness and irreducibility.
3. Primary Key (PK)
The primary key is selected from the candidate keys. Is is the most important key. Thus, you can choose only one Primary Key for the relation.
Because it has to distinguish records, NULL values and duplicate values are of course not allowed. Note that NULL means a value that has never been entered or is not applicable. NULL is different from ‘0’ or ‘n/a’. They are simple string here in the example.
4. Alternate Key
If there are more than one candidate key, the remaining candidate keys that are not selected as the primary key are called alternative keys.
5. Foreign Key (FK)
FK is a PK of other referenced relation. In order to query the data easier, we need to link two different relation. FK impose some kind of constraint on the reference of data from other relations so that they cannot refer to strange values or non-existent values. See the image below:
For example, Let’s link customer information with order information. If the customer ID, the primary key of the customer relation, is imported as an foreign key, we can match the information of the customer relation and the order relation.
All fields in a foreign key have the same kind and range of values as the primary key being referenced. They either have the same value as the primary key or null values.