Garbage collection
Garbage collection is the feature that keeps each Interbase database as clean as possible. While this is not really a separate topic, it is of paramount importance for understanding the engine, so I detached it from the previous explanation of the multi-generational architecture. Since the user cannot walk the chain of versions of a record, the engine is in charge of all the details.
In OODBMS products there are two tendencies. In the first, the designer is in charge of all the management details: the programmer must create objects, delete them when they are no longer needed, and ensure that every object in the database can be reached in some way from the root of the database. In the second, the designer only plans the logic of the application and the management details are left to the engine, which uses visibility rules to decide which objects to keep. If an object becomes unreachable from the root of the database (there is no path left to recover it), it is marked as an orphan, and at intervals the engine runs a garbage collector that recovers the space used by these orphan objects.
The Interbase philosophy resembles the second approach: the engine takes care of the versions of each record. It was designed to handle several versions of each record as normal operations take place in the database on behalf of the clients. Each time a record changes, a new version is created. To save space, Interbase keeps only the differences between one version and the prior one (if the changes are extensive, it may decide to store a complete record instead), so the record's history can be rebuilt by walking the versions internally. A version of a record must be kept as long as at least one transaction that started when that version was the official (committed) version is still active. The oldest transaction that is still active is named the OAT, short for Oldest Active Transaction.
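To make the idea of rebuilding a record's history from differences more concrete, here is a minimal sketch in Python. It is not Interbase's on-disk format: the `RecordVersion` class, its fields and the `rebuild_history` function are hypothetical names invented for this illustration, and the sketch assumes that the newest version is stored complete while each older version stores only the fields that differ from the next newer one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    """One link in a record's version chain (hypothetical layout)."""
    txn_id: int                                # transaction that wrote this version
    data: dict                                 # full record, or only the differing fields
    is_delta: bool = False                     # True if `data` holds differences only
    older: Optional["RecordVersion"] = None    # next older version, if any


def rebuild_history(newest: RecordVersion) -> list:
    """Walk the chain from newest to oldest and return (txn_id, full_record) pairs.

    Assumes the newest version is stored complete.
    """
    history = []
    current = dict(newest.data)
    version = newest
    while version is not None:
        if version is not newest:
            if version.is_delta:
                # Apply the stored differences on top of the newer image.
                current = {**current, **version.data}
            else:
                # The changes were extensive: a complete record was stored.
                current = dict(version.data)
        history.append((version.txn_id, dict(current)))
        version = version.older
    return history


# Three versions of one record, written by transactions 10, 20 and 30.
v10 = RecordVersion(10, {"name": "Anna"}, is_delta=True)
v20 = RecordVersion(20, {"city": "Oslo"}, is_delta=True, older=v10)
v30 = RecordVersion(30, {"id": 1, "name": "Ann", "city": "Rome"}, older=v20)

for txn_id, record in rebuild_history(v30):
    print(txn_id, record)
```

Storing the differences "backwards" from the newest version, as assumed here, has the convenient property that the oldest versions can be dropped from the end of the chain without touching the newer ones.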
The OAT advances when that transaction is committed or rolled back: the oldest transaction that is still active then becomes the new OAT, and so on. The oldest transaction whose number is still tracked in the database is known as the OT (Oldest Transaction). It exists because when an OAT stops being active (it is committed or rolled back), it doesn't disappear from the database; it is kept in a sort of transaction history until a cleanup happens. Because the OT is unreachable from normal operations, the record versions it used no longer matter, so they can be compacted. Since the user has no direct control over a record's versions, there must be a mechanism to get rid of unused record versions that are older than the OAT. This process is known as garbage collection. It happens all the time: each thread that serves a request cooperates by collecting garbage from the records it touches. This will be enhanced in the upcoming version 6.
Usually the database gets a little refresh as each thread runs and collects garbage, but Interbase also compacts the database when the OT number falls a threshold value behind the OAT number. In Interbase jargon the procedure resembles cleaning with a broom, so it's called sweeping. You may think of sweeping as the PACK command you invoked in Clipper and dBase, but don't take the analogy too far, because the inner workings are different. The threshold is referred to as the sweep interval and by default is set to 20000. This means that if a user's operation causes the OAT to be the sweep interval ahead of the OT, some delay is to be expected, because the engine performs an automatic garbage collection. The user can also invoke the sweep explicitly when needed, for example if performance is poor due to the high rate of operations the database has undergone.
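The two rules above (which record versions are garbage, and when an automatic sweep starts) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, every version is assumed to have been written by a committed transaction, and the exact boundary of the comparison (`>=` here) is a guess from the wording above rather than a documented detail.

```python
DEFAULT_SWEEP_INTERVAL = 20000  # default threshold mentioned in the text


def collect_garbage(version_txn_ids, oat):
    """Split a record's versions into (kept, removed).

    `version_txn_ids` lists the transaction numbers that wrote each
    version, newest first. Everything newer than the OAT is kept, plus
    the newest version older than the OAT (the one the OAT can still
    see). Older versions are garbage.
    """
    kept = []
    for i, txn_id in enumerate(version_txn_ids):
        kept.append(txn_id)
        if txn_id < oat:
            return kept, version_txn_ids[i + 1:]
    return kept, []


def sweep_needed(ot, oat, sweep_interval=DEFAULT_SWEEP_INTERVAL):
    """True when the OAT is at least the sweep interval ahead of the OT."""
    return oat - ot >= sweep_interval


kept, removed = collect_garbage([50, 40, 30, 20, 10], oat=35)
print(kept, removed)                 # versions 20 and 10 are no longer needed
print(sweep_needed(1000, 21000))     # the gap has reached 20000
print(sweep_needed(1000, 20999))     # the gap is still below the threshold
```

With an OAT of 35, the version written by transaction 30 has to stay, because it is the committed version that the oldest active transaction still sees.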
Of course, the sweep interval is configurable per database, and automatic sweeping can be disabled completely for a database if required. However, disabling sweeping may cause the database to grow faster than expected, and performance may suffer. The same ill effect may appear if transactions are not ended (committed or rolled back) within a prudent period of time. When one thread reaches the sweep threshold, it triggers the sweep. Normally the user being served by that thread will stall until the sweep finishes, and the rest of the users will observe a delay in the engine's responses. This is normal; it's the price one pays for a non-blocking engine. Again, IB 6 will improve this feature with a low-priority thread devoted to garbage collection that will cooperate with the threads serving client requests. The likelihood of reaching the sweep interval will therefore be smaller, and the database will carry less overhead.
It's worth mentioning that there are two architectures in Interbase. The old one is named "Classic" and the new one "SuperServer". Classic relies on separate cooperating processes, one for each request. SuperServer uses a single process and spawns as many threads as needed, again one per request. Therefore, in the discussion above, for the old architecture you must read "cooperating process" wherever "thread" appears.
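The idea of a dedicated garbage-collection thread that cooperates with the threads serving requests can be pictured with the following Python sketch. It is a toy model, not the IB 6 design: `run_collector`, the `pending` queue and the `versions` dictionary are hypothetical names, Python threads have no priorities (so "low priority" is not modelled), and the shared dictionary is used without locking only because a single collector modifies it.

```python
import queue
import threading


def run_collector(pending, versions, oat):
    """Background collector: prune each record named on the queue.

    Stops when it receives the sentinel value None.
    """
    while True:
        record_id = pending.get()
        try:
            if record_id is None:
                return
            chain = versions[record_id]   # writer txn ids, newest first
            for i, txn_id in enumerate(chain):
                if txn_id < oat:
                    # Keep the newest version older than the OAT,
                    # drop everything behind it.
                    del chain[i + 1:]
                    break
        finally:
            pending.task_done()


versions = {"r1": [50, 40, 30, 20, 10], "r2": [45, 5]}
pending = queue.Queue()
collector = threading.Thread(
    target=run_collector, args=(pending, versions, 35), daemon=True
)
collector.start()

# Threads serving requests would only report the records they touched.
for record_id in ("r1", "r2"):
    pending.put(record_id)

pending.join()       # wait until the collector has handled both records
pending.put(None)    # ask the collector to stop
collector.join()
print(versions)
```

The point of the split is that the threads serving clients only pay the cost of a queue insertion, while the actual cleanup is done elsewhere.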
This page was last updated on 2000-06-28 20:01:48