In my childhood, before I was familiar with the Internet, we had an Encyclopedia Britannica at home. It was a series of encyclopedias with a total of 27 volumes, each one a thick book with hundreds of pages. I opened it whenever I did my homework, and back then it was like my own little Google.
The more I studied databases, the more I remembered the first time I learned how to look something up in a dictionary. In just 10 years after the days of paper dictionaries, keyword search became the norm, and now we can even search by voice. We live in an era full of data, and finding and using data has become a natural part of our daily lives. Now, let's look at databases, which are collections of this data, and at database management systems.
■ Want to find/use data faster? It is time to do ‘data management’.
Imagine looking for clothes in a messy closet. As you dig here and there, you turn up laundry, or clothes you are saving to wear after losing weight. With everything mixed together, it takes a long time to find the clothes you actually want.
The same goes for data. The more data you have, and the less structured it is, the more time it takes to find the data you need. Data management systems are designed to address this issue. A data management system is a supporting tool that stores and manages data and quickly retrieves the data you need.
Data management evolved rapidly in the 1970s, when IBM developed SQL, and again in the mid-2000s, when cloud computing took off.
■ The System Before the Database Era: File Processing System
The structure of the file processing system is shown below. It is easy to understand if you think of Windows folders: each application manages its own data in its own folder.
Because each application directly manages its own data, the following problems arise:
■ Issues with File Processing Systems
1. Data Dependency Issue
The data is tied to the application that created it, so it can only be used by particular users, hardware, and software. In the figure above, Application 2 cannot access File 1 or File 3 because of these physical and logical dependencies.
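To make the dependency concrete, here is a minimal sketch in Python. The record layouts and the function name `read_customer_old` are hypothetical and exist only for illustration. It shows a reader that has one file layout hard-coded, and what happens when the application that owns the file changes that layout.

```python
import struct

# Hypothetical layouts, for illustration only.
# Old layout: 4-byte id + 10-byte name.
OLD_LAYOUT = struct.Struct("<I10s")
# New layout: the owning application added a 20-byte email field.
NEW_LAYOUT = struct.Struct("<I10s20s")


def read_customer_old(raw):
    """Reader that has the old physical layout hard-coded."""
    customer_id, name = OLD_LAYOUT.unpack(raw)
    return customer_id, name.rstrip(b"\x00").decode("ascii")


old_record = OLD_LAYOUT.pack(42, b"Kim")
new_record = NEW_LAYOUT.pack(42, b"Kim", b"kim@example.com")

# The reader works as long as the layout never changes.
ok = read_customer_old(old_record)

# After the layout changes, the same reader can no longer use the data.
try:
    read_customer_old(new_record)
    broke = False
except struct.error:
    broke = True

print(ok, broke)
```

Every program that touches the file has to know its exact physical layout, so a change in one place forces changes everywhere else.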
2. Data Redundancy Issue
Since each application has to create its own data, the same data can end up being stored multiple times. The figure shows that File 1 contains the same gray data as File 2.
Duplicate data leads to inconsistency. When you modify a piece of data, every scattered copy of it must be modified too, which makes the data hard to maintain. Duplicates also waste storage space, which is less economical, and it is hard to apply the same level of security to every copy, which makes the data less secure.
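The inconsistency problem can be sketched in a few lines of Python. The two "files" below are plain dictionaries, and the names `billing_file`, `shipping_file`, and the customer record are made up for illustration.

```python
# Hypothetical data: two applications each keep their own copy
# of the same customer record.
billing_file = {"customer_42": {"name": "Kim", "phone": "010-1111-2222"}}
shipping_file = {"customer_42": {"name": "Kim", "phone": "010-1111-2222"}}


def update_phone(app_file, customer_id, new_phone):
    """Each application can only update its own copy."""
    app_file[customer_id]["phone"] = new_phone


def find_inconsistencies(file_a, file_b):
    """Return the ids whose records differ between the two files."""
    return [cid for cid in file_a if cid in file_b and file_a[cid] != file_b[cid]]


# The billing application changes the phone number,
# but nobody remembers to update the shipping copy.
update_phone(billing_file, "customer_42", "010-9999-0000")

inconsistent = find_inconsistencies(billing_file, shipping_file)
print(inconsistent)
```

After the update, the two copies disagree, and nothing in the system itself tells you which one is correct.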
3. Integrity Issue
The integrity of data may sound a little difficult. In short, it means the data must be accurate. That may sound too obvious to be worth saying, but let's think about it.
The data we use should be ‘constrained’. A constraint defines the range of allowed values, such as ‘Data in this file cannot exceed 10 digits’.
The more redundant data there is, the harder it is to keep checking every individual copy, and the greater the chance that the integrity of the data will be compromised.
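For contrast, here is a rough sketch of how a constraint like "cannot exceed 10 digits" can be declared once and enforced in one place. It uses Python's built-in `sqlite3` module as a small stand-in for a database. The table name `member` and the column `phone` are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the 10-character limit is declared once, with the data,
# instead of being re-implemented inside every application.
conn.execute(
    """
    CREATE TABLE member (
        id    INTEGER PRIMARY KEY,
        phone TEXT NOT NULL CHECK (length(phone) <= 10)
    )
    """
)

# A 10-character value satisfies the constraint.
conn.execute("INSERT INTO member (phone) VALUES (?)", ("0123456789",))

# An 11-character value violates it and is refused.
try:
    conn.execute("INSERT INTO member (phone) VALUES (?)", ("01234567890",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]
print(rejected, row_count)
```

In a file processing system, every application would have to repeat this check on its own copy of the data, and a single application that forgets it is enough to break integrity.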
4. Concurrent Sharing Issue
In a file processing system, simultaneous access by multiple users leads to inconsistent data. For example, suppose user A and user B access and work on data 1 at the same time. Each user's changes should be reflected and visible to the other, but the file processing system does not properly provide this capability.
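A simple way to picture this is the "lost update" scenario, sketched below. To keep the example deterministic, it does not use real threads. It only replays the order of reads and writes that two users could produce. The file name `data1.json` and the `stock` field are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical shared file standing in for "data 1".
path = os.path.join(tempfile.mkdtemp(), "data1.json")


def write_stock(value):
    with open(path, "w") as f:
        json.dump({"stock": value}, f)


def read_stock():
    with open(path) as f:
        return json.load(f)["stock"]


write_stock(10)

# User A and user B both read the file before either one writes.
seen_by_a = read_stock()
seen_by_b = read_stock()

# A records a sale of 3 items, then B records a sale of 2 items.
write_stock(seen_by_a - 3)
write_stock(seen_by_b - 2)  # overwrites A's result without noticing it

final_stock = read_stock()
print(final_stock)  # 5 items were sold in total, but A's update was lost
```

Nothing in the plain file coordinates the two users, so the second write silently discards the first.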
■ Structure of Database Management System
A database management system (DBMS) is a software package for organizing, storing, managing, and using the data in a database. Its biggest feature is that the use of data and the management of data are clearly separated.
A DBMS has a three-level structure: the external level, consisting of views (external schemas); the conceptual level, consisting of the conceptual schema; and the internal level, consisting of the internal schema.
1. External Level: The views that individual users see.
Logical independence of data is gained through the external-conceptual mapping.
2. Conceptual Level: The logical structure of the entire database, such as the data items, their sizes, and how they are organized.
Physical independence of data is gained through the conceptual-internal mapping.
3. Internal Level: Physical elements, such as which disk the data is stored on, where on that disk, and how much space it takes.
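The three levels can be loosely illustrated with Python's built-in `sqlite3` module. This is only a sketch under some assumptions: a SQL view plays the role of an external schema, the table definition plays the role of the conceptual schema, and creating an index is a rough stand-in for a change at the internal level. The names `employee` and `employee_directory` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level (assumed example): the logical structure of the data.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee (name, salary) VALUES ('Kim', 5000)")

# External level: a view exposing only what this group of users should see.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")


def directory():
    """The application only ever talks to the view."""
    return conn.execute(
        "SELECT id, name FROM employee_directory ORDER BY id"
    ).fetchall()


before = directory()

# Change at the conceptual level: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
# Change closer to the internal level: an index alters how data is accessed.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

after = directory()
print(before == after)
```

The application code that reads `employee_directory` did not have to change in either case, which is the practical meaning of logical and physical data independence.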
We have explored the structure and problems of file processing systems, and the structure of a DBMS. In the next article, we will learn about the strengths and weaknesses of a DBMS, and the language system architecture of the database.