CS 325 - Week 5 Lecture 1 - 2016-09-20
* quick review of natural join/equi-join
* equi-join: combination of Cartesian prodoct
and selection,
selecting tuples from the Cartesian product
where a column in the 1st joined table has
the same value as a column in the 2nd joined table
* equi - equal! a join based on two things
being equal
* assumes there is a column in each joined
table that shares a domain (has the same
meaning)
* natural join: an equi-join with an additional
projection, projecting all but one of the
two columns that were joined upon;
* (there are a number of different kinds of
joins -- we are most interested in
equi-join and natural joins here,
to the point that if we say "join"
we mean one of these --
for the others, I will give their
"full" names...)
* several points to remember:
* IMPORTANT: the attributes being joined upon
in these joins SHOULD have a common domain!!
(or results won't make semantic "sense")
* don't worry, under the hood the actual algorithms
for these joins are more efficient than implied
above...
* CAN join on other conditions than equality,
BUT we won't be going there...
* that's a theta join,
when the join condition is not based on equals
* there's another type of join, OUTER JOIN, that
we *might* get around to --
in an outer join, the "unmatched" pairs are
included -- students with no enrollments are
included, enrollments involving no student (?!)
are included --
and the values for the "unmatched" columns are left
NULL (so result is not always a "true" relation, I think...)
* there are other types of joins as well...!
* we'll come back to the other set-theoretic operations --
union, intersection, and difference -- later on;
========================
* INTRO to ENTITY-RELATIONSHIP MODELING,
part 1!
* also called E/R modeling or E-R modeling...!
* entering the world of DATA MODELING!
* the process of creating an REPRESENTATION
of the USERS' VIEW of the data in their
scenario
* Kroenke says it's the MOST important task
in the development of effective database
APPLICATIONS;
* we are discussing a classic approach for expressing
a user's data model, in the form of an
ENTITY-RELATIONSHIP MODEL
(the result of entity-relationship modeling is
an entity-relationship model)
* *****************************************
VERY IMPORTANT: at THIS phase of a the
database life cycle,
YOU SHOULD NOT BE THINKING IN TERMS OF TABLES
(or relations!) YET!!!!!!!!!!!!!!!!
*****************************************
don't worry, there are GOOD approaches for
then TURNING an E/R model INTO a
set of appropriate relations FOR that
scenario being modeled;
* AVOID jumping to the table-design phase
too early!
* DURING the modeling phase,
WE WILL BE TALKING about ENTITY CLASSES
and ENTITIES, not tables/relations and
tuples/rows;
(we'll see that an entity class in an E/R
model will eventually turn into ONE OR MORE
relations in the eventual design;)
AND: WE WILL BE TALKING about RELATIONSHIPS
between entity classes and entities,
NOT foreign keys in tables/relations;
...there are NO foreign keys in the E/R models
we'll be coming up with;
* when we are happy with our E/R model for a scenario,
we will CONVERT it into a DATABASE SCHEMA or DATABASE DESIGN:
* mathematical definition of db schema or db design:
* relation structures plus constraints on allowable
data values
* slightly-more-pragmatic definition of db scheme or db design:
* (Kroenke's)
* its relation structures,
* relationships, <-- foreign keys
* domains,
* and business rules
* business rule: is a day-to-day rule of operation
WITHIN the scenario
In a library: a patron can only have 10 books
checked out at a time
In a hair-cutting business: all stylists/barbers
must provide
at least one working phone number for
reaching them
^ see how that can also be a desired constraint
on the allowble data in an eventual database?
CAN'T always enforce these in eventual
database -- but we do what we can...
* note that E/R models are going to be
DBMS-independent -- doesn't matter what DBMS you
will eventually use;
(we'll even see that eventual db designs/schemas
are often RDBMS-independent, able to be
implemented on any RDBMS that provides SQL...)
* There ARE variations in expressing both E/R models
and database schemas/designs --
I'll be letting you know which variations we'll be
using in this class;
==============================
* Peter Chen introduced the entity-relationship model in a 1976
paper;
* commonly abbreviate e-r or e/r or E/R or e-r
(I'll accept all of these!)
* the diagrams based on the E/R model are called
entity-relationship diagrams, or ERDs
* (beware, there are variations in ERD formatting,
you are required to use our class variation so
I can read it!)
* the elements in this model:
* KEY elements are entities (and entity classes),
relationships,
identifiers, and
attributes!
* ENTITIES and ENTITY CLASSES
* an entity is an "object" that exists and is
distinguishable from other "objects"
... it is something significant in the
users' scenario, that they may have some
interest in keeping track of;
* an entity CLASS is a set of entities of the
same "type" (same category?)
* for a credit union,
an entity might be the savings account #12467,
and an entity class might be the set of
all savings accoounts at that credit union;
for a university,
an entity might be the student Ann Carroll,
and an entity class might be the set of all
currently-enrolled students;
* in an E/R model,
we'll be teasing out the significant entity
classes in a scenario;
* in an ERD,
each entity class will be represented as (some sort of)
a rectangle