CS 325 - Week 14 Lecture 1 - 2016-11-29

*   a few more locking-related comments:
    *   we know locking helps to achieve ISOLATION

    *   it can be shown that locking alone does not
        guarantee serializability

    *   ONE way to also achieve serializability with
        locking is to ADD a PROTOCOL,
	for example:

	TWO-PHASED LOCKING
        *   with this protocol/strategy,
	    a transaction obtains locks as desired,
	    EXCEPT!!! once it RELEASES any lock,
	    it cannot abtain ANY MORE

	    (and once you enforce this,
	    it results in a transaction having a
	    "growing phase", in which locks are obtained,
	    and a "shrinking phase", in which locks are released)

            ^ it can be shown that this can guarantee
	      you serializability;

        *   a more-restrictive BUT easier-to-implement
	    VARIATION/version of two-phased locking
	    (used by IBM's DB2, for example)
	    is to simply NOT release any of
	    a transaction's obtained locks until
	    a COMMIT or ROLLBACK is issued; ...!

*   remember (from our earlier discussion)
    that another potential issue when using
    locking approaches is DEADLOCK
    *   you have a cycle that is a deadly embrace,
        one or more transactions waiting for something
	the other(s) have, such that they are "stuck"

    *   there are several approaches/strategies for
        dealing with deadlock...
	*   timeouts - after a transaction has waited
	    a system-specified amount of time,
	    it is aborted and restarted in case a
	    deadlock MIGHT have occurred

        *   detection - allow deadlock to occur,
	    detect it (look for a cycle in the resource
	    graph, for example), and break the deadlock
	    if detected
	    
        *   prevention - preventing deadlock from
            occurring at all
	    *   a transaction requesting a lock is 
	        immediately aborted if there is a  
		POSSIBILITY that deadlock might occur
		
		...seek to avoid the conditions that lead
		to deadlock;

        *   avoidance - for example, require a transaction
	    to lock ALL of the resources it needs 
	    at once, before allowing it to execute at all

        *   and sometimes, NONE of the above...
	    (just let deadlock occur, and wait for some
	    human intervention to break the deadlock...!)

*   not all concurrency control approaches are lock-based;
    as an EXAMPLE of another way,
    consider this version of TIMESTAMPING

    *   timestamping is another class/category of
        algorithms for concurrency control;

    *   here's one example: 
        *   when a transaction is started, it is assigned
	    a time stamp

        *   time stamp: global, unique for each transaction,
            and is monotonic (ever-INCREASING)

        *   all db operations of a transaction are considered
	    as having the time stamp of that transaction

        *   the DBMS then ensures that any *CONFLICTING*
	    operations are executed in time stamp order -
	    this will ensure SERIALIZABILITY

            *   and IF a transaction is about to do a conflicting
	        operation in NON-timestamp order,
		the DBMS aborts it, rolls it back,
		and restarts it (giving it a new, bigger timestamp)

    *   consider: with each data item Q,
        you could associate two time stamps:

	*   W-timestamp(Q) - the largest time stamp of any
	        transaction SUCCESSFULLY writing Q
	*   R-timestamp(Q) - the largest time stamp of any
	      transaction SUCCESSFULLY reading Q


        *   say, then, that Ti wants to read(Q).
	    *   say that TS(Ti) is the time stamp of Ti

            *   if TS(Ti) < W-timestamp(Q),
	        that's a problem! Ti "should" (based on timestamp
		    ordering) have read Q before this write --

		    SO, the read is rejected,
		    abort Ti and restart it with a new time stamp;

           *   if TS(Ti) >= W-timestamp(Q),
	       that's fine!
	            read WILL be executed,
		    R-timestamp(Q) will be set to max(R-timestamp(Q), TS(Ti))


        *   say, then, that Ti wants to write(Q).
	    *   say that TS(Ti) is the time stamp of Ti

            *   if TS(Ti) < R-timestamp(Q),
	        that's a problem! Ti "should" (based on timestamp
		    ordering) have written Q before the earlier read --

		    SO, the write is rejected,
		    abort Ti and restart it with a new time stamp;

            *   if TS(Ti) < W-timestamp(Q),
	        that's a problem! Ti "should" (based on timestamp
		    ordering) have written Q before the earlier write --

		    SO, the write is rejected,
		    abort Ti and restart it with a new time stamp;

           *   if TS(Ti) >= W-timestamp(Q),
	       that's fine!
	            write WILL be executed,
		    W-timestamp(Q) will be set to TS(Ti)