Database design is the set of procedures and tasks used to plan and implement a database. The following points are critical to a good database design:
- Data consistency and integrity must be maintained.
- Redundancy should be kept low.
- Searching should be fast, typically through indexes.
- Integrity constraints should be enforced, and appropriate security measures should be taken to protect the data.
- Data should be stored in its most atomic form, with each field holding a single, indivisible value.
These criteria may change depending on specific requirements, but they are the most common properties of a good database design.
What Steps Can a Database Designer Take to Ensure Good Database Design?
Step 1: Determine the goal of your database, and communicate clearly with the stakeholders (if any). Understanding the purpose of the database helps you think through the use cases, where problems may arise, and how to prevent them.
Step 2: List all the entities that will be present in the database and the relationships that exist among them.
Step 3: Organize the information into tables so that there is little or no redundancy.
Step 4: Ensure uniqueness in every table. Unique records in every relation are a crucial part of database design because they help avoid redundancy. Identify the key attributes that uniquely identify every row. You can use key constraints to enforce uniqueness. Keep in mind that key attributes should take up as little space as possible and must not contain NULL values.
Step 5: After the tables are structured and the information is organized, apply the normal forms to identify anomalies that may arise and redundancy that can cause inconsistency in the database.
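Step 4 can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The `customer` table, its columns, and the `try_insert` helper are hypothetical names chosen for illustration. The sketch assumes a small integer primary key and treats `email` as a second candidate key.

```python
import sqlite3

# Hypothetical schema for illustration only; all names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- compact integer key
        email       TEXT NOT NULL UNIQUE,  -- candidate key, must be present and unique
        full_name   TEXT NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO customer (customer_id, email, full_name) "
    "VALUES (1, 'a@example.com', 'Asha')"
)


def try_insert(customer_id, email, full_name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO customer (customer_id, email, full_name) VALUES (?, ?, ?)",
            (customer_id, email, full_name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_id_accepted = try_insert(1, "b@example.com", "Bilal")    # same primary key
duplicate_email_accepted = try_insert(2, "a@example.com", "Chen")  # same email
null_email_accepted = try_insert(4, None, "Dara")                  # NULL in a key column
new_row_accepted = try_insert(3, "c@example.com", "Chen")          # valid row
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Only the valid row is added. The database itself rejects the duplicates and the NULL key, so the application does not have to check for them.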
Primary Terminologies Used in Database Design
The following are terms you should be familiar with before designing a database:
- Redundancy: Redundancy refers to the duplication of data. Depending on the use case, some redundancy may or may not be acceptable. For example, in a banking application we may need to strictly prevent redundancy in the database.
- Schema: A schema is a logical container that defines the structure and organization of the data stored in it. It describes the tables, their columns, and the data type of each column.
- Records/Tuples: A record and a tuple are the same thing: a single row of a table, which holds the values of one entry.
- Indexing: Indexing is a data structure technique for efficient retrieval of the data stored in the database.
- Data Integrity & Consistency: Data integrity refers to the accuracy and validity of the information stored in the database. Consistency means that the data satisfies all defined rules and constraints, so that different parts of the database do not contradict each other.
- Data Models: Data models provide visual modeling techniques for representing the data and the relationships that exist among the data. Ex: ER model, Network model, Object-Oriented model, Hierarchical model, etc.
- Functional Dependency: A functional dependency is a relationship between two attributes (or sets of attributes) of a table in which the value of one determines the value of the other. Ex: {A -> B}, where A and B are attributes and each value of A is associated with exactly one value of B.
- Transaction: A transaction is a single logical unit of work that reads or changes the database. A transaction must satisfy the ACID or BASE properties (depending on the type of database).
- Schedule: A schedule defines the order in which the operations of one or more transactions are executed.
- Concurrency: Concurrency refers to allowing multiple transactions to run at the same time without interfering with one another.
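As a small illustration of the functional dependency term above, the following Python sketch checks whether a dependency holds in a given set of rows. The function name `holds` and the sample `employees` data are hypothetical. Note that a check like this can only show that a dependency is violated by the current data. It cannot prove that the dependency holds for every future state of the table.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are tuples of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any different
        # value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
employees = [
    {"emp_id": 1, "dept": "Sales", "dept_floor": 2},
    {"emp_id": 2, "dept": "Sales", "dept_floor": 2},
    {"emp_id": 3, "dept": "HR", "dept_floor": 4},
]

dept_determines_floor = holds(employees, ("dept",), ("dept_floor",))
floor_determines_emp = holds(employees, ("dept_floor",), ("emp_id",))
```

Here `dept -> dept_floor` holds because every Sales row has the same floor. `dept_floor -> emp_id` does not hold because floor 2 is associated with two different employees.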
Database Design Lifecycle
The database design lifecycle consists of the following phases:
Lifecycle of Database Design
1. Requirement Analysis
It is crucial to understand the requirements of the application so that the design addresses the right problems and you can apply appropriate integrity constraints to maintain data integrity and consistency.
2. Logical & Physical Design
This is the actual design phase, which covers the steps taken while designing the database. It is further divided into two stages:
- Logical Data Model Design: This stage produces a high-level design of the database based on the initially gathered requirements, in order to structure and organize the data. A high-level overview of the database is made on paper without considering the physical design. The stage proceeds by identifying the kind of data to be stored and the relationships that will exist among the data. Its core tasks are identifying the entities and key attributes and deciding which constraints are to be implemented. It involves techniques such as data modeling to visualize the data and normalization to prevent redundancy.
- Physical Design of Data Model: This stage implements the logical design made in the previous stage. All the relationships among the data and the integrity constraints are implemented to maintain consistency and generate the actual database.
3. Data Insertion and testing for various integrity Constraints
Finally, after implementing the physical design of the database, we are ready to insert data and test the integrity constraints. This phase involves testing the database to see whether anything was left out or needs to be added, and then integrating it with the target application.
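The testing phase described above can be sketched as a small script that tries to insert deliberately invalid data and records whether the database rejects it. This is a minimal sketch using Python's built-in `sqlite3` module. The `customer` and `account` tables and the statements in `violations` are hypothetical examples, not part of any particular application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE account (
        account_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        balance     INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO customer VALUES (1, 'Asha');
    INSERT INTO account VALUES (10, 1, 500);
    """
)

# Each statement below should be rejected by one of the constraints.
violations = {
    "orphan account": "INSERT INTO account VALUES (11, 99, 100)",
    "negative balance": "INSERT INTO account VALUES (12, 1, -5)",
    "missing name": "INSERT INTO customer VALUES (2, NULL)",
    "duplicate key": "INSERT INTO customer VALUES (1, 'Bilal')",
    "delete referenced customer": "DELETE FROM customer WHERE customer_id = 1",
}

results = {}
for label, statement in violations.items():
    try:
        conn.execute(statement)
        results[label] = "accepted"  # would indicate a gap in the constraints
    except sqlite3.IntegrityError:
        results[label] = "rejected"

account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Any entry in `results` marked "accepted" would point to a constraint that was left out of the physical design.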
Logical Data Model Design
The logical data model design defines the structure of the data and the relationships that exist among the data. The following are the major components of the logical design:
1. Data Models: Data modeling is a visual modeling technique used to get a high-level overview of the database. Data models help us understand the needs and requirements of the database by describing its design through diagrams. Ex: ER model, Network model, Relational model, Object-Oriented data model.
Data Models
2. Entity: Entities are objects in the real world. They have certain properties, which are referred to as the attributes of that entity. There are two types of entities: strong and weak. A weak entity does not have a key attribute of its own. Its existence depends on one specific strong entity, and it participates fully in its relationship with that entity. A strong entity has a key attribute that uniquely identifies it.
Weak entity example: Loan. A loan is given to a customer (a customer need not have a loan), and the loan is identified by the customer_id of the customer to whom it is granted.
3. Relationships: A relationship defines how entities are logically related to each other. In simple words, it is the association of one entity with another.
A relationship can be further categorized as unary, binary, or ternary.
- Unary: The associating entity and the associated entity are of the same type. Ex: an employee manages other employees; a student appointed as monitor oversees other students.
- Binary: Two entities are involved. This is the most common relationship you will come across while designing a database. Ex: a student is enrolled in courses, an employee is managed by a manager, one student can be taught by many professors.
- Ternary: Three entities are involved in a single relationship. Ex: an employee works on a project for a client. Here the three entities are Employee, Project, and Client.
4. Attributes: Attributes are the properties that describe a specific entity. For example, an employee can have unique_id, name, age, date of birth (DOB), salary, department, manager, project id, etc.
5. Normalization: After all the entities are in place and the relationships among the data are defined, we need to look for loopholes or ambiguities that may arise as a result of CRUD operations, in order to prevent insertion, update, and deletion anomalies.
Data normalization is a standard procedure for databases that eliminates such anomalies and prevents redundancy.
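The update anomaly mentioned under normalization can be shown with a minimal sketch using Python's built-in `sqlite3` module. All table and column names are hypothetical. The sketch assumes that each department has exactly one phone number, so `dept_phone` depends on `dept_name` rather than on the employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Unnormalized: the department phone is repeated on every employee row.
    CREATE TABLE employee_flat (
        emp_id     INTEGER PRIMARY KEY,
        emp_name   TEXT NOT NULL,
        dept_name  TEXT NOT NULL,
        dept_phone TEXT NOT NULL
    );
    INSERT INTO employee_flat VALUES
        (1, 'Asha',  'Sales', '555-0100'),
        (2, 'Bilal', 'Sales', '555-0100'),
        (3, 'Chen',  'HR',    '555-0200');

    -- Update anomaly: changing the phone on one row leaves the other stale.
    UPDATE employee_flat SET dept_phone = '555-0199' WHERE emp_id = 1;

    -- Normalized: department facts are stored once and referenced by name.
    CREATE TABLE department (
        dept_name  TEXT PRIMARY KEY,
        dept_phone TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        emp_name  TEXT NOT NULL,
        dept_name TEXT NOT NULL REFERENCES department(dept_name)
    );
    INSERT INTO department VALUES ('Sales', '555-0100'), ('HR', '555-0200');
    INSERT INTO employee VALUES
        (1, 'Asha', 'Sales'), (2, 'Bilal', 'Sales'), (3, 'Chen', 'HR');

    -- A single-row update now changes the phone for every Sales employee.
    UPDATE department SET dept_phone = '555-0199' WHERE dept_name = 'Sales';
    """
)

# Number of distinct phone numbers recorded for Sales in each design.
phones_flat = conn.execute(
    "SELECT COUNT(DISTINCT dept_phone) FROM employee_flat WHERE dept_name = 'Sales'"
).fetchone()[0]
phones_normalized = conn.execute(
    """
    SELECT COUNT(DISTINCT d.dept_phone)
    FROM employee AS e
    JOIN department AS d ON d.dept_name = e.dept_name
    WHERE e.dept_name = 'Sales'
    """
).fetchone()[0]
```

In the flat table the Sales department ends up with two conflicting phone numbers, while the normalized tables still report exactly one.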
An Example of Logical Design
Logical Design Example
Physical Design
The main purpose of the physical design is to implement the logical design: to define the structure of the database with all its tables, columns, and data types, and to state clearly how the relations are related to each other.
The following steps are taken in physical design:
Step 1: Entities are converted into tables (relations) that consist of their properties (attributes).
Step 2: Apply integrity constraints: define foreign keys, unique keys, and composite keys to establish the relationships among the data, and apply any other required constraints.
Step 3: Entity names are converted into table names, property names are translated into attribute names, and so on.
Step 4: Apply normalization and modify the design as per the requirements.
Step 5: The final schemas are defined based on the entities and attributes derived in the logical design.
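The steps above can be sketched with Python's built-in `sqlite3` module. The entities used here (student, course, employee) echo the relationship examples given earlier, but the table names, column names, and data types are assumptions made for illustration. The sketch shows one possible mapping, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    -- Steps 1 and 3: entities become tables, properties become columns.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        email      TEXT NOT NULL UNIQUE              -- Step 2: unique key
    );
    CREATE TABLE course (
        course_id INTEGER PRIMARY KEY,
        title     TEXT NOT NULL
    );
    -- Binary many-to-many relationship "student is enrolled in course".
    CREATE TABLE enrollment (
        student_id  INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
        course_id   INTEGER NOT NULL REFERENCES course(course_id),    -- foreign key
        enrolled_on TEXT NOT NULL,
        PRIMARY KEY (student_id, course_id)          -- Step 2: composite key
    );
    -- Unary relationship "employee is managed by another employee".
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        manager_id INTEGER REFERENCES employee(emp_id)
    );
    """
)

# Step 5: inspect the resulting schema.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
enrollment_pk = [
    row[1] for row in conn.execute("PRAGMA table_info(enrollment)") if row[5] > 0
]
# PRAGMA foreign_key_list rows are (id, seq, table, from, to, ...).
enrollment_fk_targets = sorted(
    row[2] for row in conn.execute("PRAGMA foreign_key_list(enrollment)")
)
```

The many-to-many relationship gets its own table with a composite primary key, while the unary relationship is a foreign key that points back to the same table.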
Physical Design
Conclusion
In conclusion, a good database design is an essential foundation for any system built on a database management system (DBMS). It provides the basis for data governance, data storage, and data retrieval. The quality of the design has a direct impact on a system's overall performance and dependability. It is important to consider data organization, normalization, performance, integrity, and more when designing a database to meet the needs of your organization and your users.