Database Design and the E-R Model



  • In designing a database schema, we must ensure that we avoid two major pitfalls:


    1. Redundancy :


      1. A bad design may repeat information.


      2. The biggest problem with such redundant representation of information is that the copies of a piece of information can become inconsistent if the information is updated without taking precautions to update all copies of the information.


    2. Incompleteness :


      1. A bad design may make certain aspects of the enterprise difficult or impossible to model.





The Entity-Relationship Model



  • The entity-relationship (E-R) data model was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database.


  • The E-R data model employs three basic concepts:


    1. Entity Sets :


    2. Relationship Sets :


    3. Attributes :





Entity Sets


  • An entity is a “thing” or “object” in the real world that is distinguishable from all other objects.


  • An entity has a set of properties, and the values for some set of properties may uniquely identify an entity.


  • Example : The value 677-89-9011 for person id would uniquely identify one particular person in the university.


  • An entity set is a set of entities of the same type that share the same properties, or attributes.


  • An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set.


  • The designation of an attribute for an entity set expresses that the database stores similar information concerning each entity in the entity set; however, each entity may have its own value for each attribute.


  • Each entity has a value for each of its attributes. For instance, a particular instructor entity may have the value 12121 for ID, the value Shekhar for name, the value Finance for dept_name, and the value 90000 for salary.



Relationship Sets


  • Relationship is an association among several entities. For example, we can define a relationship advisor that associates instructor Katz with student Shankar. This relationship specifies that Katz is an advisor to student Shankar.


  • A relationship set is a set of relationships of the same type.


  • The association between entity sets is referred to as participation; that is, the entity sets E1, E2 , ... , En participate in relationship set R.





Attributes


  • For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute.


  • Example : , the domain of attribute semester might be strings from the set {Fall, Winter, Spring, Summer}.


  • An attribute, as used in the E-R model, can be characterized by the following attribute types.


    1. Simple and composite attributes : Simple attributes cannot be divided into subparts and composite attributes can be divided into subparts.


    2. Single-valued and multivalued attributes : Some attributes like Pin Code have a single value for a particular entity but some such as phone numbers may have zero, one, or several values. This type of attribute is said to be multivalued


    3. Derived attribute : The value for this type of attribute can be derived from the values of other related attributes or entities. For instance, the instructor entity set has an attribute students advised. We can derive the value for this attribute by counting the number of student entities associated with that instructor.






Constraints



  • An E-R enterprise schema may define certain constraints to which the contents of a database must conform


  • Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via a relationship set.


  • For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following:


    1. One-to-one : An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.


    2. One-to-many : An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A.


    3. Many-to-one : An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A.


    4. Many-to-many : An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.


mapping constraints



Participation Constraints


  • The participation of an entity set E in a relationship set R is said to be total if every entity in E participates in at least one relationship in R.


  • If only some entities in E participate in relationships in R, the participation of entity set E in relationship R is said to be partial.


  • For example, Every student entity is to be related to at least one instructor through the advisor relationship. Therefore the participation of student in the relationship set advisor is total. But an instructor need not advise any students. Hence, it is possible that only some of the instructor entities are related to the student entity set through the advisor relationship, and the participation of instructor in the advisor relationship set is therefore partial.





Keys


  • Keys help to identify relationships uniquely, and thus distinguish relationships from each other.


  • The primary key of an entity set allows us to distinguish among the various entities of the set.


  • The structure of the primary key for the relationship set depends on the mapping cardinality of the relationship set. For example,


    1. Consider the entity sets instructor and student, and the relationship set advisor. Suppose that the relationship set is many-to-many. Then the primary key of advisor consists of the union of the primary keys of instructor and student.


    2. If the relationship is many-to-one from student to instructor—that is, each student can have have at most one advisor—then the primary key of advisor is simply the primary key of student.


    3. if an instructor can advise only one student — that is, if the advisor relationship is many-to-one from instructor to student—then the primary key of advisor is simply the primary key of instructor.


    4. For one-to-one relationships either candidate key can be used as the primary key






Components of Entity-Relationship Diagrams



  • Rectangles divided into two parts represent entity sets.


  • Diamonds represent relationship sets


  • Undivided rectangles represent the attributes of a relationship set.


  • Lines link entity sets to relationship sets.


  • Dashed lines link attributes of a relationship set to the relationship set.


  • Double lines indicate total participation of an entity in a relationship set.


  • Double diamonds represent identifying relationship sets linked to weak entity sets


  • Components of Entity-Relationship Diagrams
  • Mapping Cardinality : The mapping is expressed via a directed or an undirected line. Directed line indicates one-to-one mapping and undirected line indicates many to many relationship. Example : Figure (b) indicates that an instructor may advise many students, but a student may have atmost one advisor.


  • Mapping Cardinality
  • Complex constraints :


    1. A line may have an associated minimum and maximum cardinality, shown in the form l ..h, where 'l' is the minimum and 'h' the maximum cardinality.


    2. A minimum value of 1 indicates each entity in the entity set occurs in at least one relationship in that relationship set.


    3. A maximum value of 1 indicates that the entity participates in at most one relationship, while a maximum value ∗ indicates no limit.


    4. Complex constraints
    5. The line between advisor and student has a cardinality constraint of 1..1, meaning the minimum and the maximum cardinality are both 1. That is, each student must have exactly one advisor.


    6. The limit 0..∗ on the line between advisor and instructor indicates that an instructor can have zero or more students.


    7. Thus, the relationship advisor is one-to-many from instructor to student, and further the participation of student in advisor is total, implying that a student must have an advisor.






Reduction to Relational Schemas



Reduction to Relational Schemas
  • Representation of Strong Entity Sets with Simple Attributes :


    1. Let E be a strong entity set with only simple descriptive attributes a1 , a2 , ... , an. We represent this entity by a schema called E with n distinct attributes. For the above ER diagram all the strong entity sets, except time slot, have only simple attributes. The schemas derived from these strong entity sets are:


    2. classroom (building, room number, capacity)
      department (dept_name, building, budget)
      course (course_id, title, credits)
      instructor (ID, name, salary)
      student (ID, name, tot_cred)

  • Representation of Strong Entity Sets with Complex Attributes :


    1. We handle composite attributes by creating a separate attribute for each of the component attributes; we do not create a separate attribute for the composite attribute itself.


    2. The relational schema derived from the version of entity set instructor with complex attributes, without including the multivalued attribute, is thus:


    3. instructor ( ID, first_name, middle_name, last_name,
      street_number, street_name, apt_number,
      city, state, zip_code, date_of_birth)

  • Representation of Multi-valued attributes :


    1. For a multivalued attribute M , we create a relation schema R with an attribute A that corresponds to M and attributes corresponding to the primary key of the entity set or relationship set of which M is an attribute.


    2. The entity set instructor, which includes the multivalued attribute phone number. The primary key of instructor is ID. For this multivalued attribute, we create a relation schema


    3. instructor_phone (ID, phone_number)

    4. We create a primary key of the relation schema consisting of all attributes of the schema. In the above example, the primary key consists of both attributes of the relation instructor_phone.


  • Representation of Weak Entity Sets :


    1. consider the weak entity set section in the E-R diagram. This entity set has the attributes: sec_id, semester,and year. The primary key of the course entity set, on which section depends, is course_id. Thus, we represent section by a schema with the following attributes:


    2. section (course_id, sec_id, semester, year)

    3. The primary key consists of the primary key of the entity set course , along with the discriminator of weak entity i.e.section, which is sec_id, semester , and year.We also create a foreign-key constraint on the section schema, with the attribute course_id referencing the primary key of the course schema.


  • Representation of Relationship Sets :


    1. For a relationship set 'T' having attributes {a, b, c} and with to participating entities R and S with primary keys R_ID and S_ID respectively. The relation schema of T will have attributes {R_ID, S_ID, a, b, c}. Primary key of this new relation will depend on following conditions :


      1. For a binary many-to-many relationship, the union of the primary-key attributes from the participating entity sets becomes the primary key.


      2. For a binary one-to-one relationship set, the primary key of either entity set can be chosen as the primary key.


      3. For a binary many-to-one or one-to-many relationship set, the primary key of the entity set on the “many” side of the relationship set serves as the primary key.


      4. For an n-ary relationship set without any arrows on its edges, the union of the primary key-attributes from the participating entity sets becomes the primary key.


      5. For an n-ary relationship set with an arrow on one of its edges, the primary keys of the entity sets not on the “arrow” side of the relationship set serve as the primary key for the schema.


      6. teaches (ID, course_id, sec_id, semester, year)
        takes (ID, course_id, sec_id, semester, year, grade)
        prereq (course_id, prereq_id)
        advisor (s_ID, i_ID)
        sec_course (course_id, sec_id, semester, year)
        sec_time_slot (course_id, sec_id, semester, year, time_slot_id)
        sec_class (course_id, sec_id, semester, year, building, room_number)
        inst_dept (ID, dept_name)
        stud_dept (ID, dept_name)
        course_dept (course_id, dept_name)



Redundancy of Schemas



  • The schema for the relationship set linking a weak entity set to its corresponding strong entity set is redundant and does not need to be present in a relational database design based upon an E-R diagram. Thus Sec_course is redundant in the ER diagram.






Combination of Schemas



  • Consider a many-to-one relationship set AB from entity set A to entity set B.


  • Suppose further that the participation of A in the relationship is total; that is, every entity 'a' in the entity set B must participate in the relationship AB. Then we can combine the schemas A and AB to form a single schema consisting of the union of attributes of both schemas.


  • The primary key of the combined schema is the primary key of the entity set into whose schema the relationship set schema was merged. For Example,


    1. inst_dept : The schemas instructor and department correspond to the entity sets A and B, respectively. Thus, the schema inst_dept can be combined with the instructor schema. The resulting instructor schema consists of the attributes {ID, name, dept name, salary}.