Designing a database is a crucial step in developing any information system, such as a web application. But how do you approach it? How can you ensure that you and the client share the same vision? And how can you simplify one of the most challenging stages of development?
Entity-Relationship (ER) diagrams were created to address these challenges. An ER diagram is a platform-independent visual representation of a relational database's structure. It illustrates entities, relationships, and attributes, helping developers conceptualize the database before implementation.
Let’s explore how to create ER diagrams, understand their key concepts, and improve our database design skills while minimizing mistakes.
Levels of database design
Database design is typically structured into three levels, each serving a different purpose:
Conceptual (semantic) level: At this stage, the database structure is presented in a way that is clear and understandable for users, developers, and customers. This high-level model helps ensure that everyone involved shares a common understanding of the data and its relationships. This is far more effective than lengthy text descriptions, wouldn't you agree?
Logical level: Here, the conceptual model is transformed into a formal database structure, typically a relational model. This level remains independent of any specific Database Management System (DBMS), meaning the design can be implemented in MySQL, Oracle, PostgreSQL, or any other relational DBMS.
CASE (computer-aided software engineering) is a set of software engineering tools and techniques for software design. In many CASE tools, the conceptual and logical levels are usually merged into a single design phase.
Physical level: This is where the database model is implemented in a specific DBMS. It includes defining exact data types (e.g., a text field might be defined as
VARCHAR2in Oracle orVARCHARin MySQL); database-specific optimizations like partitioning and indexing strategies; and storage considerations.
Entity-Relationship (ER) diagrams primarily operate at the conceptual level, serving as a bridge between developers and users/customers to ensure a shared understanding before moving to implementation.
What is an ER diagram?
An Entity-Relationship (ER) diagram is a visual tool used to represent the structure of a database. It helps designers, developers, and stakeholders understand how data is organized and how different entities interact.
The main components of an ER diagram are:
Entities: Real-world objects that store data, such as Student, Employee, or Country.
Attributes: Characteristics or properties of an entity, such as name, age, or salary.
Relationships: Associations between entities that define how they are connected.
Types of keys
Before diving deeper, let's define two important types of keys:
Primary Key (PK) – It is a key that uniquely defines the entity. For each row in the table, it is different (and cannot be repeated). No two rows can have the same primary key.
Example: A national ID number uniquely identifies each citizen.
Foreign Key (FK) – A primary key from one entity that appears in another entity to establish a relationship. Unlike a primary key, it can be repeated.
Example: A group number is the primary key in a
Groupentity. AStudententity may reference this number as a foreign key, indicating which group the student belongs to.
Types of connections
In an ER diagram, entities are connected by relationships, which define how data is linked. There are three main types of relationships (or connections):
One-to-One (1-1): Each instance of one entity is related to at most one instance of another entity.
Example: A national ID uniquely identifies a citizen, and each citizen has only one national ID.
One-to-Many (1-M): One instance of an entity can be related to multiple instances of another entity, but each instance of the second entity is linked to only one instance of the first.
Example: A department can have many employees, but each employee belongs to only one department.
Many-to-Many (M-M): Multiple instances of one entity can be related to multiple instances of another entity.
Example: A student can enroll in multiple courses, and each course can have multiple students.
Mandatory and optional relationships
Relationships can also be mandatory or optional, which is represented using symbols in an ER diagram:
A circle (○) means the relationship is optional.
Example: A department may exist without any employees (i.e. 0, 1 or many employees can work in a department). Then, the Department → Employee (1-M) relationship will have a circle on the employee side, meaning employees are optional for a department.
A bar (|) means the relationship is mandatory.
Example: If every department must have at least one employee (i.e. 1 or many employees can work in a department). Then, the Department → Employee (1-M) relationship will have a bar on the employee side, indicating that a department cannot exist without employees.
Identifying and non-identifying relationships
In Entity-Relationship diagrams (ERDs), relationships define how entities are linked to each other, especially in terms of primary keys (PKs) and foreign keys (FKs).
Identifying (solid line) relationship:
An identifying relationship occurs when the primary key (PK) of one entity (main/owner) becomes part of the primary key (which is also a foreign key) of another entity (dependent).
The dependent entity cannot exist independently; it must always be linked to the main entity. So, the dependent entity is sometimes called a weak entity, and the main entity is called a strong entity.
In ERDs, this type of relationship is represented by a solid line.
Example: Order and OrderItem
Order has
order_idas its primary key.OrderItem stores both
order_idanditem_idas part of its composite primary key (order_id, item_id).An OrderItem cannot exist without an Order; it must belong to an Order.
Why is this identifying? Because
order_idis part of the primary key of OrderItem, not just a foreign key.
Non-identifying (dashed line) relationship:
A non-identifying relationship occurs when the primary key (PK) of one entity does not become part of the primary key of another entity.
Instead, the primary key (PK) of the one entity is just stored as a foreign key (FK) in the another entity, but is not part of its primary key.
Both entities can exist independently of each other.
In ERDs, this type of relationship is represented by a dashed line.
Example: Student and Group
Student has
student_idas its primary key.Group has
group_idas its primary key.A Student may be assigned to a Group, so
group_idis stored as a foreign key in Student. However, a Student can exist without being assigned to a Group.Why is this non-identifying? Because
group_idis not part of the primary key of Student; it is just a foreign key.
Feature | Identifying Relationship | Non-Identifying Relationship |
|---|---|---|
primary key (PK) | PK of main entity is part of the PK of dependent entity (and is also present as FK in there) | PK of one entity doesn't become part of PK of another entity (and is only present as FK there) |
independent existence of entities | dependent entity cannot exist without main entity | both entities can exist independently of each other |
ERD representation | solid line | dashed line |
Example | Order and OrderItem | Student and Group |
More Examples
By understanding relationship types, mandatory vs. optional connections, and identifying vs. non-identifying relationships, you can accurately model real-world scenarios in your database design.
Entity
Employeehas attributesemployee_id(primary key),name,email,department_id(foreign key). And entityDepartmenthas attributesdepartment_id(primary key),name,phone.The relationship between
DepartmentandEmployeecan be a one optional to many optional non-identifying relationship:A department can have zero, one, or many employees.
An employee may belong to zero or one department.
Entity
NationalIDCardhas attributesnational_id_number(primary key),issue_date,place_of_issue. And entityCitizenhas attributesnational_id_number(primary key and foreign key),name,date_of_birth.The relationship between
NationalIDCardandCitizenis a one mandatory to one mandatory identifying relationship:Each citizen must have exactly one National ID card, and each National ID card must belong to exactly one citizen.
The entity
Citizendepends on the entityNationalIDCard, forming an identifying relationship. Thenational_id_number(which is a primary key of entityNationalIDCard) is part of the primary key of entityCitizen, meaning an entityCitizencannot exist in this database model without an entityNationalIDCard.
Entity
Studenthas attributesstudent_id(primary key),name,age. And entityDisciplinehas attributesdiscipline_id(primary key),discipline_name.Since a student can study multiple disciplines, and each discipline can be studied by multiple students, this forms a many-to-many relationship.
Since there is no many-to-many relationship in relational databases (because it is impossible to implement at the physical level, in the source code), this relationship between entities
StudentandDisciplinewill need to be converted into two one-to-many relationships. An additional intermediate entity (also called an associative entity) will appear during the transformation. This associative entity in essence will have a composite primary key (also foreign key), which will be a combination of the primary keys of the two original entitiesStudentandDiscipline.This many-to-many relationship between the entities
StudentandDisciplinemust be converted into two one-to-many relationships using entityStudentDiscipline, which is an additional intermediate entity (associative entity).So, a student's relationship with a discipline is implemented through an intermediate entity
StudentDiscipline, in which additional attributes can be added (if needed). TheStudentDisciplineentity:has a composite primary key consisting of
student_idanddiscipline_id(both are also foreign keys).It is common practice nowadays to use a serial surrogate key (a single, auto-generated unique identifier) instead of a compound primary key in associative entities. This simplifies indexing and foreign key relationships.
A serial surrogate key is an artificially generated primary key, usually an auto-incrementing number (e.g.,
order_item_id). Unlike a compound primary key (which consists of multiple attributes), a surrogate key has no business meaning and is used purely for database management efficiency.can include additional attributes like
enrollment_date,completion_date,final_score, to store the details related to a student's discipline.is represented with rounded corners in the ER diagram, to indicate that it is a dependent (or weak) entity, since it depends on both entities
StudentandDiscipline.An associative entity is typically represented with rounded corners to indicate that it depends on both entities in a many-to-many relationship.
This transformation ensures that the relationship is properly structured for relational databases.
At the conceptual level, you can choose any implementation option.
It is important to use objects correctly and define relationships so that the database has the correct resilience, and is easy to modify and store data in.
Why use an ER model?
Properly defining entities and their relationships ensures:
A resilient database structure that is easy to modify.
Fewer issues when storing and managing data.
A clear visualization of the entire database before writing any code.
Better communication between developers, customers, and team members.
The ER-model will simplify your database design, which will allow early identification of structural issues, so that you can make changes to it at the design stage itself, if required. You do not have to make edits to the code and DBMS. This model will allow the customer to work in a simple and accessible form, since pictures are clearer than text, and will allow you to see the entire structure of the database as a whole. As a result, you can immediately identify shortcomings and misunderstandings between the developer and the customer, even before writing the first line of code. This model can be used to interact with the team, so that everyone knows what they are doing.
Conclusion
You have become familiar with the concept of ER-diagrams. We hope you understood how it can make your life and the lives of your customers a lot easier. Also, we hope you understood what entities, attributes, and relationships are.
Now you should be able to quickly pass the most difficult stage of working with a database: design stage. Designing a conceptual database template constitutes half of creating an information system.