Transactions - Gate CSE - UPSCFEVER

Transactions

Collections of operations that form a single logical unit of work are called Transactions.

A transaction is a unit of program execution that accesses and possibly updates various data items. The transaction consists of all operations executed between the begin transaction and end transaction.

Properties of Transactions : ACID properties

Atomicity : If a transaction begins to execute but fails for whatever reason, any changes to the database that the transaction may have made must be undone. This “all-or-none” property is referred to as atomicity.

Consistency : Execution of a transaction in isolation (that is, with no other transaction executing concurrently) preserves the consistency of the database.

Isolation : Database system must take special actions to ensure that transactions operate properly without interference from concurrently executing database statements.

Durability : If the system ensures correct execution of a transaction, this serves little purpose if the system subsequently crashes and, as a result, the system “forgets” about the transaction. Thus, a transaction’s actions must persist across crashes.

A Simple Transaction Model

We shall illustrate the transaction concept using a simple bank application consisting of several accounts and a set of transactions that access and update those accounts. Transactions access data using two operations:

read(X) : which transfers the data item X from the database to a variable, also called X, in a buffer in main memory belonging to the transaction that executed the read operation

write(X) : which transfers the value in the variable X in the main-memory buffer of the transaction that executed the write to the data item X in the database.

Let T_i be a transaction that transfers $50 from account A to account B. This transaction can be deﬁned as

T_i: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B)

Consistency : The consistency requirement here is that the sum of A and B be unchanged by the execution of the transaction.

Atomicity :

Suppose that, just before the execution of transaction T_i , the values of accounts A and B are $1000 and $2000, respectively. Now during the execution of transaction T_i, a failure occurs that prevents T_i from completing its execution.

The failure happened after the write(A) operation but before the write(B)operation. So the values of accounts A and B reﬂected in the database are $950 and $2000. The system destroyed $50 as a result of this failure. In particular, we note that the sum A + B is no longer preserved.

To prevent this inconsistency, we use atomicity property where the database system maintains logs and if the transaction fails then the recovery system of the database restores the old values.

Durability : The durability property guarantees that, once a transaction completes successfully, all the updates that it carried out on the database persist, even if there is a system failure after the transaction completes execution. The recovery system of the database is responsible for ensuring durability, in addition to ensuring atomicity.

Isolation :

The database is temporarily inconsistent while the transaction to transfer funds from A to B is executing, with the deducted total written to A (sender) and the increased total yet to be written to B(receiver).

If a second concurrently running transaction reads A and B at this intermediate point and computes A+B, it will observe an inconsistent value. Furthermore, if this second transaction then performs updates on A and B based on the inconsistent values that it read, the database may be left in an inconsistent state even after both transactions have completed

Ensuring the isolation property is the responsibility of a component of the database system called the concurrency-control system.

Although serial execution is desirable for preventing inconsistency by due to performance benefits we prefer parallel execution with safeguards.

A Simple abstract transaction model.

A transaction must be in one of the following states:

Active : the initial state; the transaction stays in this state while it is executing.

Partially committed : after the ﬁnal statement has been executed.

Failed : after the discovery that normal execution can no longer proceed.

Aborted : after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction.

Committed : after successful completion.

A Simple abstract transaction model

A transaction enters the failed state after the system determines that the transaction can no longer proceed with its normal execution (for example, because of hardware or logical errors). Such a transaction must be rolled back. Then, it enters the aborted state. At this point, the system has two options:

We restart the transaction if it was aborted because of hardware errors.

We kill the transaction if it was aborted due to internal logic problems as we would have to rewrite the code.

Serializability

Any schedules that are executed concurrently should have the same effect as a schedule that could have occurred without any concurrent execution. That is, the schedule should, in some sense, be equivalent to a serial schedule. Such schedules are called serializable schedules.

In a schedule when two transactions are having operations on a single data item, We say that the operations are in conflict if they are by different transactions on the same data item, and at least one of these instructions is a write operation.

If a schedule S which has conflicting operations can be transformed into a schedule S' which doesn't have conflicting operations by a series of swaps of nonconﬂicting instructions, we say that S and S are conﬂict equivalent.

We say that a schedule S is conﬂict serializable if it is conﬂict equivalent to a serial schedule.

Determining conﬂict serializability of a schedule

Consider a schedule S. We construct a directed graph, called a precedence graph, from S whose set of vertices consists of all the transactions participating in the schedule. The set of edges consists of all edges T_i → T_j for which one of three conditions holds :

T_i executes write(Q) before T_j executes read(Q).

T_i executes read(Q) before T_j executes read(Q).

T_i executes write(Q) before T_j executes write(Q).

If the precedence graph for S has a cycle, then schedule S is not conﬂict serializable. If the graph contains no cycles, then the schedule S is conﬂict serializable.

If an edge T_i → T_j is present in the graph then in any serial schedule S' equivalent to S, T_i must appear before T_j.

Note : It is possible to have two schedules that produce the same outcome , but that are not conﬂict equivalent

Such schedules are known as "schedule equivalent".

View equivalence is a type of schedule equivalence. View serializability is not used in practice due to its high degree of computational complexity.

Recoverable Schedules and Non-Recoverable Schedules

Recoverable Schedules and Non-Recoverable Schedules

In the above schedule T7 reads a value from T6 and commits immediately. However if T6 fails and has to be rolledback we cannot undo T7.

T7 has read a value that is now redundant but we cannot undo this as T7 has already committed. This is a nonrecoverable schedule.

A recoverable schedule is one where, for each pair of transactions T_i and T_j such that T_j reads a data item previously written by T_i , the commit operation of T_i appears before the commit operation of T_j.

Cascadeless Schedules

Cascadeless Schedules

As seen above T9 reads a value written by T8 and T10 reads a value written by T9. So both T9 and T10 are dependent on T8.

Now when T8 is rolled back a significant amount of work goes into rolling back T9 and T10. This undesirable phenomenon where a single abort causes multiple dependent transactions to rollback is called "Cascading rollback".

A cascadeless schedule is one where, for each pair of transactions T_i and T_j such that T_j reads a data item previously written by T_i. The commit operation of T_i appears before read of T_j.

It is easy to verify that every cascadeless schedule is also recoverable.

Transaction Isolation Levels

Protocols required to ensure serializability may allow too little concurrency for certain applications. In these cases, weaker levels of consistency are used.

SQL provides such features for the beneﬁt of long transactions whose results do not need to be precise. If these transactions were to execute in a serializable fashion, they could interfere with other transactions, causing the others’ execution to be delayed.

The isolation levels speciﬁed by the SQL standard are as follows:

Serializable : Ensures serializable execution. However some database systems implement this isolation level in a manner that may allow nonserializable executions.

Repeatable read : allows only committed data to be read and further requires that, between two reads of a data item by a transaction, no other transaction is allowed to update it.

Read committed : allows only committed data to be read, but does not require repeatable reads.

Read uncommitted : allows uncommitted data to be read. It is the lowest isolation level allowed by SQL.

All the isolation levels above additionally disallow dirty writes, that is, they disallow writes to a data item that has already been written by another transaction that has not yet committed or aborted.

upscfever.com