Entity-Relationship Diagram (ERD)
An Entity-Relationship Diagram is a graphical representation used in Logical Data Modeling to define the structure of a Database in terms of entities, their attributes, and the relationships between them. ERDs serve as the bridge between Conceptual Data Modeling (often done with UML Class Diagrams) and the Physical Data Model that maps to an actual DBMS.
Notation Styles
Several ERD notations exist, each with different visual conventions:
- Chen Notation — The original notation by Peter Chen (1976). Uses rectangles for entities, ovals for attributes, and diamonds for relationships.
- Crow’s Foot (IE) Notation — The most widely used in industry. Uses lines with specific endpoint symbols to indicate multiplicity. Also called Information Engineering notation.
- IDEF1X — A standard notation used in government and defense projects.
- UML Class Diagram — Often used for Conceptual Data Modeling and can be adapted for logical modeling.
Core Elements
Entities
Rectangles representing a distinct object or concept in the domain (e.g., Student, Course, Enrollment). Each entity maps to a table in the Relational Model.
Attributes
Properties of an entity. Types include:
- Simple — Atomic, indivisible (e.g.,
first_name) - Composite — Can be subdivided (e.g.,
address→street,city,zip) - Multi-valued — Can have multiple values (e.g.,
phone_numbers) — resolved into a separate entity in Normalization - Derived — Computed from other attributes (e.g.,
agefromdate_of_birth) - Categorical — Finite set of values — often extracted into a Lookup Table
Keys
- Primary Key — Uniquely identifies each tuple in an entity
- Foreign Key — References the Primary Key of another entity, enforcing Referential Integrity
- Composite Key — A key consisting of two or more attributes
- Candidate Key — Any attribute or set of attributes that could serve as a Primary Key
- Alternate Key — A Candidate Key not chosen as the Primary Key
- Superkey — Any superset of a Candidate Key
- Surrogate Key — A system-generated key (e.g., auto-increment ID) with no business meaning
Relationships
Connections between entities with multiplicity constraints:
- One-to-One (1:1) — Each instance of Entity A relates to exactly one instance of Entity B
- One-to-Many (1:N) — One instance of Entity A relates to many instances of Entity B
- Many-to-Many (M:N) — Resolved via a Junction Table (also called associative entity or bridge table)
Crow’s Foot Notation Symbols
| Symbol | Meaning |
|---|---|
──||── | Exactly one (mandatory) |
──|O── | Zero or one (optional) |
──<── | Many |
──|<── | One or many (mandatory) |
──O<── | Zero or many (optional) |
The notation encodes both the lower bound (0 or 1, indicating optionality) and the upper bound (1 or many).
From Conceptual to Logical Model
The translation from a UML Class Diagram to an ERD involves:
- Classes → Entities with attributes mapped directly
- Associations → Relationships with Multiplicity preserved
- Aggregations/Compositions → Relationships with appropriate foreign keys and constraints
- Generalization hierarchies → Collapsed using one of three strategies:
- Single table with a discriminator column
- Separate table per subtype with shared Primary Key
- Separate tables with duplicated parent attributes
- Many-to-Many → Resolved with a Junction Table
- Multi-valued attributes → Extracted into separate entities
- Categorical attributes → Converted to lookup tables