ISOM 3260 - Database Design and
Midterm Exam Review
To access the midterm practice questions, press here.
Section 1A -
Basic Terminology in Databases
Lecture 1 - Database Fundamentals
Data vs Information
- Data is any unorganised representation of fact or object.
- Structured Data are data which can be represented by alphanumeric characters, such as
text, number, and date.
- Unstructured Data are data which are hard or outright impossible to be represented by
alphanumeric characters, such as image, audio clip,
video, and a document.
- Nowadays, both structured and unstructured data can be put inside the same database.
- Information are collected and organised data which are useful for decision-making. This is
done by putting data into context or summarising data.
- In most forms of databases, we store data rather than information. This is because -
- we can select what data to select or summarise when we write programmes. Hence, storing information is
- information depends on context, and context can change.
Range of Database Technologies
- Flat File
- Data Warehousing
Traditional (Flat File) Approach vs Database Approach
- One way to reduce information storage is to digitize the documents - write programmes for each of the three
paper documents, storing the documents required as separate files.
- Although this approach is intuitive and relatively simple to implement, there are some siginificant
- Programme-Data Dependence
- Data Duplication
- Limited Data Sharing
- Lengthy Development Time
- Excessive Programme Maintenance
- The traditional flat file approach is, therefore, is insufficient for enterprises nowadays. Instead, a
database approach is generally adopted.
- Database is an organised collection of logically related data.
- There are several advantages of adopting a database approach -
- Data Sharing
- Minimal (/Planned) Data Redundancy
- Enforcement of Stanards
- Naming conventions, Data quality standards, Uniform procedures for accessing, updating, and
- Improved Data Quality
- Programme-Data Independence
- Improved Decision Support
Costs and Risks of the Database Approach
- Requires Specialized Personnel
- Installation and Management Costs
- Organizational Conflicts
- Ownership and Responsibility of Data
Range of Database Application
|Type of Database / Application
||Typical Number of Users
||Typical Size of Database
|Enterprise Resource Planning (ERP)
||Gigabytes - Terabytes
||Terabytes - Petabytes
Multitier Client-Server Architecture
Enterprise Database Application
- Enterprise Resource Planning(ERP)
- business management system that integrates all enterprise functions (e.g., manufacturing, finance,
sales, marketing, inventory, accounting, human resources)
- Data Warehouse
- an integrated decision support system derived from various operational databases
- data that describes the properties of data.
- attribute name, description, data type, default value, validation rules, relationship with other table,
physical location of the data, etc.
- Database Management System (DBMS)
- a software which handle the definition, creation, maintenance, and control of the database.
- e.g Oracle (Fusion), IBM, Microsoft, SAP
- Repository / Data Dictionary
- a “part” of the DBMS which stores the metadata of the database.
- For each query, the DBMS would find the corresponding data in the repository, and find the physical
location of the required data.
- Applicaiton Programme
- The “front-end” which is used by the actors of a system
- User Interface
- The visual design of the applciation programme
- Database Administrator
- personnel responsible for maintaining the database
- System Developer
- personnel responsible for the development and maintanenace of the application programme
- End Users
- Actors / People who interact with the application programme and the database
System Development Methodologies /
System Development Life Cycle (SDLC)
engineering, a software development process is the process of dividing software
development work into distinct phases to improve design, product management, and project
- To understand the business requirement of the system
- To thoroughly examine the business situation to determine (both functional and non-functional)
requirements of the system
- Conceptual Database Design
- To structure all information requirements; To develop all technical and organisational requirements of
- Selection of Database Model (Relational? No-SQL?)
- Logical Database Design
- Physical Database Design
- To write programmes, build database, test and install new system, train users, and finalizes
- To monitor the operation and usefulness of the system, and to repair and enhance the system
- Problems with traditional SDLC:
- Lengthy Development Time
- High Failure Rate
- Alternatives to SDLC:
- Rapid Application Development (RAD)
- Extreme Prototyping (XP)
Section 1B - Open Source Databases
Open Source Databases
Section 2 - Conceptual Database Design with Entity-Relationship Diagram
Lecture 2 -
Entity-Relationship (ER) Diagram
Business Rules and Data Modeling
- The primary focus in conceptual database design is to document business rules about data. The
entity-relationship diagram is a useful tool to do so.
- A business rule is “a statement that defines or constrains some aspect of the business. It is intended to
assert business structure or to control or influence the behaviour of the business… rules prevent, cause, or
suggest things to happen” (GUIDE Business Rules Project, 1997).
- A business rule includes names and definitions of data (including entity types, attributes, and
relationships), and constraints on data objects.
- Business rules are often expressed in terms familar to end users. Therefore, users can define and maintain
their own rules.
- Enforcement of business rules can be automated through the use of software that can interpret the rules and
enforce them using the integrity mechanisms of database management system (Moriarty, 2000).
- Not everything can be represented in E-R diagram; we focus on data storage and maintaineance, not business
Basic Terminology of Entity-Relationship (ER) Diagram
- An entity is a person, a place, an object, an event, or a concept in the user environment
about which the organization wishes to maintain data.
- Thus, an entity has a noun name.
Entity Type vs Entity Instance
- An entity type is a collection of entities that share common properties or characteristics
- An entity instance is a single occurrence of an entity type.
Entity Type vs System Input, Output, or User
- An entity type will have many possible instances, each with a distinguishable characteristic, as well as
one or more descriptive pieces of data.
- an entity type is different from a system input, because the latter refers to the process of loading
data into specified relations (entity types) of a database;
- an entity type is different from a system output, because the latter refers to the process of retrieving
data from specified relations (entity types) of a database, and;
- an entity type is not necessarily a system user (or in ISOM 3210 terms, an actor), because a system user
may not be useful (hence descriptive) in the envrionment.
Strong vs Weak Entity Types
- A strong entity type is one that exists independently of other entity types.
- Instances of a strong entity type always have a unique characteristic (called an identifier) -
that is, an attribute or a combination of attributes that uniquely distinguish each occurrence of that
- A weak entity type is an entity type whose existence depends on some other entity type. A
weak entity type has no business meaning in an ER diagram without an entity on which it depends.
- The entity type on which the weak entity type depends is called the identifying owner.
- A weak entity type has an attribute that serves as a partial identifier. Such an identifier is
partial because without the identifier of the owner, we cannot uniquely identify instances of the weak
- The relationship between a weak entity type and its identifying owner is called an identifying
- An attribute is a property or charcteristic of an entity type (or relationship) that is of
interest to the organization.
- Thus, an attribute has a noun name.
Types of Attribute
- A composite attribute is an attribute that has meaningful component parts which are more
- A simple (or atomic) attribute is an attribute that cannot be broken down into smaller
components that are meaningful for the organization.
- A multivalued attribute is an attribute that may take on more than one value for a given
entity (or relationship) instance.
- A derived attribute is an attribute whose values canbe calculated from related attribute
values (plus possibly data not in the database, such as todays date, the current time, or a security code
provided by a system user).
- An identifier is an attribute (or a combination thereof) whose value distinguishes
individual instances of an entity type.
- No two instances of the entity type may have the same value for the identifier attribute.
- To be a candidate identifier, each entity instance must have a single value for the attribute
(not null) and the attribute must be associated with the entity instance
- A composite identifier is an identifier that consists of a composite attribute.
- Criteria for selecting identifiers:
- Choose an identifier that will not change its value over the life of each instance of the entity
- Choose an identifier such that for each instance of the entitiy, the attribute is guaranteed to have
valid values and not be null (or unknown).
- Avoid the use of intelligent identifiers, whose structure indicates classifications, locations, and
so on. Such codes are often changed as conditions change, which renders the identifier values invalid.
- Consider subtituting single-attribute surrogate identifiers for large composite identifiers.
Relationship Type vs Relationship Instance
- A relationship is an association representing an interaction amongst the instances of one
or more entity types that is of interest to the organization.
- Thus, a relationship has a verb phrase naame.
- Relationships and their characteristics (degree and cardinality) represent business rules, and usually
relationships represent the most complex busienss rules shown in an ERD.
- A relationship type is a meaningful association between (or among) entity types.
- A relationship instance is an association between (or among) entity instances, where each
relationship instance associates exactly one entity instance from each participating entity type.
- Therefore, a relationship type does not denote a mandatory relationship between entity instances.
- Attributes may be associated with a many-to-many (or one-to-one) relationship.
- It is possible that entities can be related to one another in more than one way.
- An associative entity is an entity type that associates the instances of one or more entity
types and contains attributes that are peculiar to the relationship between those entity instances.
- Sufficient and necessary conditions to convert a relationship to an asssociative entity type:
- All the relationships for the participating entity types are “many” relationships.
- The resulting associative entity type has independent meaning to end users and, perferably, can be
identified with a single-attribute identifier.
- The associative entity has one or more attributes in addition to the identifier.
- The associative enetity participates in one or more relationships independent of the entities related in
the associated relationship.
Degree of a Relationship
- A cardinality constraint specifies the number of instances of an entity that can (or must)
be associated with each instance of another entity.
- The minimum cardinality of a relationship is the minimum number of instances of an entity
that may be associated with each instance of another entity.
- The maxmium cardinality of a relationship is the maximum number of instances of an entity
that may be associated with each instance of another entity.
Section 3 - Conceptual
Database Design with Enhanced Entity-Relationship Diagram
Notations in Enhanced ERD
Supertypes and Subtypes
- A subtype is a subgrouping of the entities in an entity type that is meaningful to the
organization and that shares common attributes or relationships distinct from other subgroupings.
- A supertype is a generic entity type that has a relationship with one or more subtypes.
- Attribute inheritance is the property by which subtype entities inherit values of all
attributes and instace of all relationships of the sypertype.
- Sufficient conditions of a supertype/subtype relationship:
- There are attributes that apply to some (but not all) instances of an entity type.
- The instances of a subtype participate in a relationship unique to that subtype.
- An instance of a subtype must also be an instance of a supertype. The reverse may not be true.
Generalization and Specialization
- Generalization is the process of defining a more general entity type from a set of more
specialized entity types. It is a bottom-up process.
- Specialization is the process of defining one or more subtypes of the supertype and forming
supertype/subtype relationships. It is a top-down process.
- Completeness constraint is a type of constraint that addresses whether an instance of a
supertype must also be a member of at least one subtype.
- Total specialization rule specifies that each entity instance of the supertype must be
a member of some subtype in the relationship.
- Partial specialization rule specifies that an entity instance of the supertype is
allowed not to belong to any subtype.
- Disjointness constraint is a constraint that addresses whether an instance of a supertype
may simultaneously be a member of two (or more) subtypes.
- Disjoint rule specifies that an instance of a supertype cannot be a member of two (or
- Overlap rule specifies that an instance of a supertype may simultaneously be a member
of two (or more) subtypes.
- A subtype discriminator is an attriubte of a supertype whose values determine the target
subtype or subtypes.
- Supertype/subtype hierarchy is a hierarchical arrangement of supertypes and subtypes, where
each subtype has only one supertype.