Interbase internals overview.

    Thanks to Darryl VanDorp, here's an explanation from the InterBase GOD:

-------------------------------Jim Starkey----------------------------

The basis for Interbase concurrency control is not record or page locks, but record version control. First, each transaction is assigned a serial number (transaction id) when it is started. Second, each record contains the transaction id of the transaction that created it, and optionally a pointer to a previous version of the record.

The basic idea is that the database engine wants to provide each transaction a view of the database that is consistent with the instant that that transaction was started. To do this, the engine determines the state of a record version relative to the fetching transaction. If the record was created by a transaction that committed before the fetching transaction started, the fetching transaction should see the record. If the record was created by a transaction that rolled back, the record can be garbage collected (replaced by the older version or deleted entirely). If the record was created by a transaction that was either active at the time that the fetching transaction was started, or was started after the fetching transaction, the engine either follows the record's back pointer or, if there is no pointer, ignores the record completely.
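The visibility rule above can be sketched in a few lines of Python. This is a toy model, not the engine's code: the names RecordVersion, State and visible_version are invented for illustration, the snapshot is assumed to be a plain dictionary of transaction states as they stood when the reader started, and the sketch leaves out garbage collection and a transaction's view of its own writes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"


@dataclass
class RecordVersion:
    created_by: int                              # id of the transaction that wrote this version
    data: str
    previous: Optional["RecordVersion"] = None   # back pointer to the older version, if any


def visible_version(newest: RecordVersion, reader_id: int,
                    snapshot: Dict[int, State]) -> Optional[RecordVersion]:
    """Return the version the reader should see, or None if it sees no record.

    snapshot holds the transaction states as they were when the reader
    started (hypothetical representation; transactions that started later
    are simply absent from it).
    """
    version = newest
    while version is not None:
        creator = version.created_by
        started_before_reader = creator < reader_id
        if started_before_reader and snapshot.get(creator) is State.COMMITTED:
            return version            # committed before the reader started
        version = version.previous    # otherwise follow the back pointer
    return None                       # no pointer left: ignore the record
```

A reader that started while transaction 3 was still active walks past the version written by 3 and lands on the older version written by a transaction that had already committed.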

The question revolves around the mechanism used by Interbase to maintain the absolute and relative states of transactions.

On disk, there is a set of pages called Transaction Inventory Pages that record the starts of all transactions. Each transaction may be in one of four states: active, committed, rolled back, and in limbo (between phases of a two phase commit). A transaction's state can be changed atomically by writing the appropriate Transaction Inventory Page.

To keep track of relative transaction states, each transaction must know the states of all other transactions at the time it started. This information can be obtained by copying the transaction inventory pages into the internal transaction block. As the number of transactions in a particular database grows, however, both the size of the internal transaction block and the number of transaction inventory pages that must be read increase, which drives up the cost of starting a transaction.

To keep the cost of starting a transaction under control, Interbase has a concept of "interesting" and (presumably) "boring" transactions. A boring transaction is one that is either known to have been committed or known to have been completely cleaned up. An interesting transaction is either active or may have some detritus somewhere in the database. A transaction, then, doesn't need to know the states of all transactions since the database was created, but only of the transactions after the "oldest interesting" transaction.

Sweeping is the house cleaning function that transforms dead but interesting transactions into boring transactions. [If anybody is still reading, please send me mail; I'm curious.]

A database sweep is nothing more than a sequential scan of all database tables. Since the very act of visiting a dead record will result in its garbage collection, a sweep, in effect, promotes all transactions that started before the "oldest active" transaction from interesting to boring transactions, reducing the cost of starting a transaction.

A sweep in itself is not a big deal. In most databases, the sweep does very little actual garbage collection -- the regular on-going database activity keeps that under control. Sweep just confirms that all garbage collection necessary to promote a batch of transactions to boring status has been performed. Note that the incremental cost for each transaction started is two bits (four states == two bits), so delaying a sweep is no big deal either.
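To make the "two bits per transaction" figure concrete, here is a small sketch of a state table packed four transactions to a byte. The class name Tip, its methods, and the particular numeric codes chosen for the four states are assumptions made for illustration; the text above only says that four states fit in two bits.

```python
# Illustrative two-bit codes for the four states (assumed, not taken from the text).
ACTIVE, LIMBO, ROLLED_BACK, COMMITTED = 0, 1, 2, 3


class Tip:
    """Toy transaction inventory: two bits of state per transaction id."""

    def __init__(self, capacity: int) -> None:
        # Four two-bit entries fit in each byte; round up.
        self.bits = bytearray((capacity + 3) // 4)

    def get(self, tid: int) -> int:
        byte, slot = divmod(tid, 4)
        return (self.bits[byte] >> (slot * 2)) & 0b11

    def set(self, tid: int, state: int) -> None:
        byte, slot = divmod(tid, 4)
        shift = slot * 2
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF   # zero the old entry
        self.bits[byte] = cleared | (state << shift)          # write the new one
```

Under this layout, Tip(20000) occupies 5,000 bytes, which is why letting a batch of transactions go unswept for a while costs so little space.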

A sweep is at most an annoyance, and then only on platforms that can't start a thread dedicated to the sweep. The sweep doesn't block anybody else and doesn't consume any scarce resources other than the process and the disk bandwidth (only that, he says).

The problem with sweeps is generally not the sweep itself, but the fact that somebody may have just wasted 20,000 transactions. Starting a transaction is moderately expensive (a disk write, a scan of the interesting transaction pages, allocation of a medium size internal block, and a lock manager lock), but starting and committing 20,000 is extremely expensive.

If you are going to store 20,000 records, I strongly suggest that you do so under one transaction rather than 20,000 separate transactions. In fact, if I were going to use somebody else's "helper" layer, I would make damn certain that the layer wasn't doing something brain-numbingly dumb on my behalf.
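The difference between the two approaches is easy to see in a toy model. ToyConnection below is a hypothetical stand-in that only counts transaction starts; it is not any real driver's API, and its begin/insert/commit methods are invented for this sketch.

```python
class ToyConnection:
    """Hypothetical stand-in for a database connection; counts transaction starts."""

    def __init__(self) -> None:
        self.transactions_started = 0
        self.rows = []
        self._pending = None

    def begin(self) -> None:
        self.transactions_started += 1   # each start has a real cost in the engine
        self._pending = []

    def insert(self, row) -> None:
        self._pending.append(row)

    def commit(self) -> None:
        self.rows.extend(self._pending)
        self._pending = None


def store_one_by_one(conn: ToyConnection, rows) -> None:
    """What a careless helper layer might do: one transaction per record."""
    for row in rows:
        conn.begin()
        conn.insert(row)
        conn.commit()


def store_in_one_transaction(conn: ToyConnection, rows) -> None:
    """The recommended shape: one transaction around the whole batch."""
    conn.begin()
    for row in rows:
        conn.insert(row)
    conn.commit()
```

Both functions store the same rows, but the first starts 20,000 transactions for 20,000 records and the second starts one.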

----------------------End of Jim Starkey's input----------------------------

Now if anyone is still reading:

Below is the technical paper from Borland's Interbase Web site explaining the difference between the Oldest Interesting Transaction and the Oldest Active Transaction, and how this is important to the GFIX/SWEEP interval.

---------------------------------------------------------------------------

MOVING THE OLDEST INTERESTING AND OLDEST ACTIVE TRANSACTIONS ALONG

First let us define a transaction, what its possible states are, the life cycle of a transaction, what exactly the OIT and OAT are, and then how they are set and moved along.

DEFINITION

A transaction is an atomic unit of work made up of one or more operations against the data in one or more databases. It can contain one or many operations that might INSERT, UPDATE, DELETE, or SELECT data. Or it might be work that changes the physical structure of the database itself. The scope of the transaction is defined by the user/programmer when they START a transaction and then end it with a COMMIT or ROLLBACK.

POSSIBLE STATES

A transaction can have one of four states: active, committed, rolled back, or limbo.

LIFE CYCLE

The life cycle of a transaction begins in the active state, set by the execution of an isc_start_transaction() or an isc_start_multiple() call. The transaction can then be either committed by an isc_commit_transaction() or an isc_commit_retaining() call, or rolled back by an isc_rollback_transaction() call.

If the commit is happening for a transaction across multiple databases, then the two-phase commit protocol is invoked. The first phase sets the transaction to limbo in each of the databases; the second phase then races around the network to switch the transaction bit to committed. If it fails anywhere in the two phases, the transaction is considered in limbo and the transaction bit is left set at the limbo state.

DEFINITION OF OIT AND OAT

The Oldest Interesting Transaction (OIT) is the first transaction in a state other than committed in the database's Transaction Inventory Pages (TIP). The TIP is a set of pages that log each transaction's information (transaction number and current state) in the database since the last time the database was created or last backed up and restored.

The Oldest Active Transaction (OAT) is the first transaction marked as active in the TIP pages.
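The two definitions amount to two scans over the transaction states in id order. The sketch below assumes the TIP can be modelled as a dictionary from transaction number to a state string; that representation and the function names are invented for illustration.

```python
from typing import Dict, Optional


def oldest_interesting(tip: Dict[int, str]) -> Optional[int]:
    """First transaction, in id order, whose state is anything but committed."""
    for tid in sorted(tip):
        if tip[tid] != "committed":
            return tid
    return None   # every recorded transaction is committed


def oldest_active(tip: Dict[int, str]) -> Optional[int]:
    """First transaction, in id order, still marked active."""
    for tid in sorted(tip):
        if tip[tid] == "active":
            return tid
    return None   # nothing is active


# Hypothetical TIP contents for illustration.
example_tip = {
    1: "committed",
    2: "rolled back",
    3: "committed",
    4: "active",
    5: "committed",
}
```

For example_tip the OIT is 2 (the rolled back transaction) and the OAT is 4, so the OIT can trail the OAT whenever a rolled back or limbo transaction sits behind the oldest active one.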

The way to find out the values of the OIT and OAT is to run gstat -h locally against the database in question.

MOVEMENT OF OIT AND OAT

We have to refine the life cycle a bit first. To create a transaction the start transaction call will first read the header page of the database, pull off the Next Transaction number, increment it, and write the header page back to the database. It also reads the OIT value from the header page and starts reading the TIP pages from that transaction number forward up to the OAT. If the OIT is now marked as committed, then the process continues checking the transactions until it comes to the first transaction in a state other than committed and records that in the process's own transaction header block. The process then starts from the OIT and reads forward until it finds the first active transaction and records that in its transaction header block also.

Only if the process starts another transaction will the information from the process's transaction header block update the information on the header page, when the page is read to get the next transaction number. Of course, if another process has already updated the header page with newer (i.e. larger) numbers, then the information will not be written.

There are only two non-committed and non-active transaction states: limbo and rolled back. The only way to change a limbo transaction to committed is for the user to run gfix on the database to resolve the limbo transaction by rolling back or committing it. The only way to change a rolled back transaction to committed is to sweep the database.
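A toy model shows how these two operations let the OIT move forward. The functions sweep and resolve_limbo below are invented names that only mimic the state changes described above on a dictionary of states; they are not the gfix tool or the engine's sweep, and the real sweep also has to visit and clean up the records themselves.

```python
from typing import Dict, Optional


def first_in_state(tip: Dict[int, str], wanted) -> Optional[int]:
    """Lowest transaction id whose state satisfies the predicate wanted."""
    for tid in sorted(tip):
        if wanted(tip[tid]):
            return tid
    return None


def sweep(tip: Dict[int, str]) -> None:
    """Mark rolled back transactions older than the oldest active one as committed."""
    oat = first_in_state(tip, lambda s: s == "active")
    limit = oat if oat is not None else max(tip) + 1
    for tid in tip:
        if tid < limit and tip[tid] == "rolled back":
            tip[tid] = "committed"   # its garbage is gone, so it is now boring


def resolve_limbo(tip: Dict[int, str], tid: int, commit: bool) -> None:
    """Mimic a manual decision on a limbo transaction: commit it or roll it back."""
    if tip[tid] != "limbo":
        raise ValueError("transaction is not in limbo")
    tip[tid] = "committed" if commit else "rolled back"


def oit(tip: Dict[int, str]) -> Optional[int]:
    return first_in_state(tip, lambda s: s != "committed")
```

With states 1 committed, 2 rolled back, 3 limbo, 4 rolled back and 5 active, the OIT starts at 2. A sweep moves it only as far as 3, because the limbo transaction is untouched; once transaction 3 is resolved by committing it, the OIT reaches 5, the oldest active transaction.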

 

This page was last updated on 2000-05-26 04:28:44