What Are the Three Levels of Data Abstraction? Explain Each Level in Detail.
A data model is an abstraction of real-world events through which we create, capture, and save in the database the data that user applications require, leaving out unnecessary details. As mentioned earlier, during requirement determination we gather information about the various business processes and the data each process requires. In doing so, the database designer is likely to collect a lot of information, not all of which is needed at the start to model the data. We need to separate the business objects from the detailed information that describes them.
As we move from one level of the data model to the next (from high-level conceptualization down to implementation), we add more detail to the business objects. We follow these levels of data abstraction as we move from user-level data requirements to the physical representation of the data in the database. We call this the three-level architecture of database design.
Three Levels of Data Abstraction
- External Level
- Conceptual Level
- Internal Level
The physical implementation of the database, on the selected database management software and hardware system, follows these three levels, as shown in Figure 1 below.
The external level is also known as the external view. Identifying the data requirements of each user group means identifying that group's view of the data. We first represent what data each user group requires, and whose data is to be stored in the underlying database. The database design as a whole must meet the data requirements of the organization at large.
Here, a user group is a division or department of the organization. The data requirements of such a division or department may be just a small data set needed to meet its functional requirements. Each division or department may require its own distinct data set, yet there may be some overlap with other related divisions or departments. Let us understand this concept through an example.
Let us take a hypothetical university, Your Area Learning University, for which we want to design a database. We identify the user groups as follows:
- HRD View: HRD needs data on employees (support staff and academic staff), employee benefit information, and so on.
- Enrollment View: the Registrar Office needs data on student registration and students' exam scores.
- Accounts View: Student Accounts (a division of the Business Office) needs data on financial aid, scholarships, and student fee payments.
- Student Life View: Student Life needs data on the students who live on campus in the hostels.
- Athletic View: the Athletic Department needs data on the students who represent the university in one or more sports at regional, state, and national events.
Note that we have identified only a few user groups within the university; there may be more that are not covered in our discussion. We want to design a database that collectively meets the data requirements of all of the above groups.
Thus, we need to know which data needs the database we design must meet. Each user group looks only at what it needs, without regard to the others. In fact, each user group may think that we are designing the database for it alone, and so sees only its own view of the data. Each such user view of the data represents one external view. When we combine all the external views, the resulting design must meet the data needs of all the user groups.
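The idea that each user group sees only its own slice of one shared database can be sketched with SQL views. The following is a minimal illustration using Python's built-in sqlite3 module, not the university's actual design: the `student` table, its columns, the sample rows, and the view names `student_life_view` and `athletic_view` are all assumptions made up for this example.

```python
import sqlite3

# Hypothetical view names for two of the user groups described above.
KNOWN_VIEWS = {"student_life_view", "athletic_view"}


def build_demo_db():
    """Create an in-memory database with one shared table and two external views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Shared data; the columns are illustrative assumptions.
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            hostel     TEXT,   -- NULL when the student lives off campus
            sport      TEXT    -- NULL when the student plays no sport
        );
        INSERT INTO student VALUES (1, 'Asha', 'North Hall', NULL);
        INSERT INTO student VALUES (2, 'Ben', NULL, 'Hockey');
        INSERT INTO student VALUES (3, 'Chen', 'South Hall', 'Tennis');

        -- Student Life only cares about students living in hostels.
        CREATE VIEW student_life_view AS
            SELECT student_id, name, hostel FROM student
            WHERE hostel IS NOT NULL;

        -- The Athletic Department only cares about students who play a sport.
        CREATE VIEW athletic_view AS
            SELECT student_id, name, sport FROM student
            WHERE sport IS NOT NULL;
        """
    )
    return conn


def view_rows(conn, view_name):
    """Return all rows of one external view, ordered by student id."""
    if view_name not in KNOWN_VIEWS:
        raise ValueError(f"unknown view: {view_name}")
    query = f"SELECT * FROM {view_name} ORDER BY student_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for name in sorted(KNOWN_VIEWS):
        print(name, view_rows(conn, name))
```

Both views read from the same underlying table, yet each group sees only the columns and rows relevant to it, and one student can appear in more than one view.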
HRD External View:
HRD needs data on employees. Since the information collected from support staff is not the same as the information collected from academic staff, HRD has suggested that we store the information distinct to academic staff and to non-academic staff in separate tables. Only the common set of information is stored in a single table, Employee.
This is just one design choice; it is not mandatory to have the three tables Employee, Staff, and Professor. We will discuss table design in detail later, where we explain how to decide on storing information in different tables. With this remark, we have the following model for HRD.
- Employee table will hold employee information.
- Staff table will hold data on the special characteristics of support staff.
- Professor table will hold data on academic personnel.
- Job table will hold the different job classifications.
- Plan table will hold data on the various health and life insurance plans.
- Benefit table will hold data on the benefits opted for by each employee.
The external view for HRD is shown as an ERD in Chen's notation as follows:
Registrar Office External View:
The Registrar Office needs data on students, student registration in each semester, exam scores, and grades. The Student table will hold student information, the Enrollment table student registrations, the Course table courses, and the Class table the schedule of classes. The external view for the Registrar Office is shown as an ERD in Chen's notation as follows:
Student Account External View:
The Student Account Division is a part of the Business Office that deals with student fees, scholarships, and financial aid. Student Account needs data on students, student registration in each semester, fee assessments, scholarship awards, and financial aid. The external view for Student Account is shown as an ERD in Chen's notation as follows:
As you can see from the above scenarios, information overlaps among the different user groups. The database designer must combine the data requirements of all users before arriving at the final database design.
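As a rough sketch of this combining step, each external view can be treated as a set of entity names: the union of the sets gives the candidate entities for the combined design, and any entity that appears in more than one view is an overlap. The HRD and Registrar entity names below come from the descriptions above; the Student Accounts names `Fee`, `Scholarship`, and `FinancialAid` are assumptions, since the text names that group's data needs but not its tables.

```python
from collections import Counter

# Entities per external view. HRD and Registrar names follow the text above;
# the StudentAccounts entity names are hypothetical.
EXTERNAL_VIEWS = {
    "HRD": {"Employee", "Staff", "Professor", "Job", "Plan", "Benefit"},
    "Registrar": {"Student", "Enrollment", "Course", "Class"},
    "StudentAccounts": {"Student", "Enrollment", "Fee", "Scholarship", "FinancialAid"},
}


def combined_entities(views):
    """Union of all external views: the candidate entities of the combined design."""
    return set().union(*views.values())


def shared_entities(views):
    """Entities that are needed by more than one user group."""
    counts = Counter(entity for entities in views.values() for entity in entities)
    return {entity for entity, n in counts.items() if n > 1}


if __name__ == "__main__":
    print("combined:", sorted(combined_entities(EXTERNAL_VIEWS)))
    print("shared:  ", sorted(shared_entities(EXTERNAL_VIEWS)))
```

A shared entity is stored once in the combined design, and each group's external view exposes only the part of it that the group needs.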
The conceptual level is the level of abstraction that meets the organizational data requirements. We often use the word "conceptualization," which means forming an overall idea of the situation at hand; once a situation has been conceptualized, it is easier to reason about. In our database design, the conceptual level represents the organizational view of the data, which combines or integrates all external views into a single view.
It is an organization-wide representation of the data as viewed by high-level managers. It is at this level that we identify the main data objects and describe them in minimal detail. This is also where we look at the data in terms of the relationships that may exist among the data objects.
To get a better view of the data, we use the most common conceptual model: the Entity Relationship Model (ERM). Using the ERM, we create the conceptual schema that is used to design the database. The figure below demonstrates the use of the ERM to create a conceptual schema. This method has the advantage that it combines all the external views into the organizational data requirements and also depicts the relationships that exist among the data.
Note that the ER model we use to create the conceptual schema is independent of the database software (DBMS) we may use to create our database. It is also independent of the hardware on which we implement the model. The ER model is thus independent of both software and hardware platforms. This gives us flexibility in conceptual-level modeling, since a change in the hardware or the database management software has no effect on the conceptual level.
The internal model is specific to the selected DBMS. We adapt the conceptual model to that particular DBMS; essentially, we map the conceptual model to the characteristics and constraints of the selected model. The internal model is therefore DBMS dependent: a change in the DBMS software may require a change in the mapping of the ER model to meet the requirements of the new DBMS. The conceptual model itself is not affected. This is known as logical independence.
Suppose we decide to use a relational DBMS. Our conceptual model is then mapped to the internal model of the RDBMS, and in doing so our entities are mapped to tables. However, it does not matter which hardware platform we select to install the DBMS on. The internal model is therefore independent of the hardware, because it is not affected by the choice of computer on which we install the software.
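The mapping from a conceptual entity to a table in one specific DBMS can be sketched as follows. This is a simplified illustration, assuming SQLite (through Python's built-in sqlite3 module) as the selected relational DBMS; the `Entity` class, the abstract type names, the `SQLITE_TYPES` mapping, and the Student attributes are hypothetical names introduced only for this example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Entity:
    """A DBMS-independent description of one conceptual entity."""
    name: str
    attributes: dict  # attribute name -> abstract type ("integer", "text", ...)
    key: str          # name of the identifying attribute


# The DBMS-specific part of the mapping: abstract types to SQLite column types.
# Choosing another DBMS would mean swapping this mapping, not the Entity.
SQLITE_TYPES = {"integer": "INTEGER", "text": "TEXT", "decimal": "REAL"}


def to_sqlite_ddl(entity):
    """Map one conceptual entity to a SQLite CREATE TABLE statement."""
    columns = []
    for attribute, abstract_type in entity.attributes.items():
        column = f"{attribute} {SQLITE_TYPES[abstract_type]}"
        if attribute == entity.key:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"


# Hypothetical attributes for the Student entity.
STUDENT = Entity("Student", {"student_id": "integer", "name": "text"}, "student_id")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    ddl = to_sqlite_ddl(STUDENT)
    conn.execute(ddl)
    print(ddl)
```

The `Entity` description plays the role of the conceptual model and stays the same whichever DBMS is chosen; only the type mapping and the generated statement are specific to the selected software.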