Five main database properties:
ACIDS (OK, usually 4, but we are adding the S)
* Atomicity
* Consistency
* Isolation
* Durability
* Serializabilty
* atomicity -- we want to be able to require that
either all parts (in terms of database effects)
of a transacton are done, or the effect is as
if NONE are done, as if it has not even been
started
* consistency - a transaction must transform the
database from one consistent state to another
consistent state
* responsibility of BOTH the DBMS and application
developers;
* isolation - data used during the execution of a transaction
cannot be used by a *second* transaction
until the first one is completed (at least,
the effect must be as if this were the case)
* durability - indicates the permanence of the database's consistent
state; it maintains its consistency BETWEEN transactions
getting into aspects such as proper backups,
and how to recover from various failures (power
outages, hardware failures, etc.)
* serializability - you have this property when the CONCURRENT
execution of transactions is equivalent to the
case where the transactions executed SERIALLY in some
ARBITRARY order
* ...we don't want database data issues JUST because
transactions are executed in parallel;
...we want a result that COULD be one of those
from A serial execution;
* an example of one way to support durability:
transaction logs
* transaction log - keeps track of all actions of transactions
that modify the database,
to be used for database recovery if a failure returns
* there are MULTIPLE cool algorithms for this --
this is just one relatively simple approach we are
discussing here;
* periodically make a complete copy of the database
(database save)
* after each such database save,
you LOG each change made by a transaction
(logging the change before actually making it!
....see why that could MATTER in this context?)
(you are keeping of track of which transaction made
which change)
* what happens when a failure occurs?
...you need to take action such that the
modifications from ALL *committed* transactions
are still reflected in the recovered database;
* there are different options here, also --
might roll forward, or roll back --
let's focus on ROLLFORWARD here:
* restore the database to the state
of the previous database save
* now go through the transaction log,
and ONLY redo the actions of transactions
that were COMMITTED at the time of
the failure;
(DON'T redo actions from transactions
that were rolled back;
DON'T redo actions from transactions
that were not committed yet at the
time of the failure;
* ROLLBACK - you would UNDO changes made by
not-yet-committed transactions;
(IF you have a database state to
roll back FROM)
* note that, in either case, you need the transaction
log to be implemented quite robustly itself,
on non-volatile memory, mirroring it, etc.
* a few more words on database recovery management
* recovery - restores a database to a previously consistent
state
* there are different levels of backup possible;
* there's a FULL backup (as we started with
in the transaction log approach)
* there is also the idea of a DIFFERENTIAL backup;
...only copy the modifications since the
previous full backup;
* faster for the backup,
* complicates/slows down any recovery (since
you have to go back to the previous full
save, and perform all the recorded
modifications since)
* concurrency control - management interleaved or concurrent
steps of transactions and still assure
isolation and serializability;
want to avoid:
* lost updates
* uncommitted data
* inconsistent retrievals
* and more...
* there are MANY possible algorithms for concurrency
control;
here are three of the classic broad categories:
* locks
* time stamping
* optimistic methods
^ see the next reading packet!!