
IBAnalyst: Tips and Tricks

Some questions that are not answered in the IBAnalyst Recommendations and/or Help:

1. How to rebuild indices on PRIMARY, FOREIGN or UNIQUE constraints?

A: Indeed, you can't use ALTER INDEX xxx INACTIVE/ACTIVE on constraint indices. If you see a deep or fragmented index on such a constraint, you can use a special trick (used by gbak on restore):
RDB$INDICES has the RDB$INDEX_INACTIVE column, which is NULL or 0 when the index is active (after CREATE INDEX or ALTER INDEX ACTIVE). 1 means the index is inactive (after ALTER INDEX INACTIVE). But there is also the value 3, used to mark inactive indices on constraints. So you can set RDB$INDEX_INACTIVE = 3 for that index, COMMIT, then return the value to 0 and commit again: the index will be rebuilt.
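
A minimal SQL sketch of this trick, assuming the constraint index is named RDB$PRIMARY1 (take the real index name from RDB$INDICES; note also that recent Firebird versions, 3.0 and later, forbid direct writes to system tables, so this trick applies to InterBase and older Firebird only):

  UPDATE RDB$INDICES SET RDB$INDEX_INACTIVE = 3
    WHERE RDB$INDEX_NAME = 'RDB$PRIMARY1';
  COMMIT;

  UPDATE RDB$INDICES SET RDB$INDEX_INACTIVE = 0
    WHERE RDB$INDEX_NAME = 'RDB$PRIMARY1';
  COMMIT; /* the index is rebuilt on this commit */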

2. I applied all IBAnalyst recommendations, but it didn't help to speed up my queries.

A: This is a separate issue, where IBAnalyst can't help. There can be two causes of the problem:
1.    The indices have outdated statistics. You can refresh index statistics with the SET STATISTICS INDEX xxx command (see the sketch after this list), in IBExpert (menu Database, Recompute selectivity of all indices), or using the gidx tool (www.ibase.ru/download/gtools.zip).
2.    The queries are hard in themselves, or the optimizer can't optimize the query.
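
A minimal sketch of refreshing the statistics of a single index (IDX_CLIENTS_NAME is a hypothetical name, substitute your own; repeat for every index used by the slow queries):

  SET STATISTICS INDEX IDX_CLIENTS_NAME;
  COMMIT;
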
3. In some cases you will see "fragmented tables" right after a restore.

Normally InterBase (without the -use_all_space parameter) reserves about 25% of the space on data pages for future inserts, updates and deletes (to place record versions). But with any database page size (1, 2, 4 or 8K) you will see ~50% fragmentation for tables that have a small record size (about 12-20 bytes; for example, a table with 2 integer fields has an average record size of 12 bytes, as in the sketch after this list).
This is OK; consider it a kind of magic server number (or behavior).
So, if you have such small-record tables, you can
a) ignore the "fragmented" warning for those tables, or
b) lower the "fragmented %" threshold, for example to 45%, in the IBAnalyst Options dialog.
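
A minimal sketch of a table that falls into this category, just to illustrate the record size involved (the names are hypothetical):

  CREATE TABLE LINK_PAIRS (
    SRC_ID INTEGER, /* 4 bytes */
    DST_ID INTEGER  /* 4 bytes */
  );

  /* statistics for this table will show an average record size of
     about 12 bytes, and ~50% data page fill right after restore */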

4. Record versions in a table that must never be updated

If you see record versions on a table that must not be updated (for example, a table with some event log), don't worry: these versions are generated by deletes.
So you know how many current records are in the table, and how many records were deleted.
This is true only if MaxVer = 1. If it is > 1, then the table is being updated by some application. If you are really sure that this table must never be updated, it's better to create a "before update" trigger that raises an exception, to find out which application makes the updates (see the sketch below).
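
A minimal sketch of such a guard trigger, assuming a log table named EVENTS_LOG (a hypothetical name):

  CREATE EXCEPTION E_NO_UPDATES 'Table EVENTS_LOG must never be updated!';

  SET TERM ^ ;
  CREATE TRIGGER EVENTS_LOG_BU FOR EVENTS_LOG
  BEFORE UPDATE
  AS
  BEGIN
    /* any application that tries to update a row gets this exception,
       so the offender can be found from its error log */
    EXCEPTION E_NO_UPDATES;
  END ^
  SET TERM ; ^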

5. Blobs can cause table fragmentation.

The engine stores blobs in 3 different ways:
1.    If the blob contents fit on the data page (there is enough free space), the blob is stored on that data page next to its record (or version).
2.    If the blob contents do not fit on the data page, the blob is stored on a separate page.
3.    If, in case 2, the blob does not fit on one page, a pointer page is created that points to the appropriate blob pages.
Whether case 1 happens depends on the stored blob size and the database page size. For example, if you have a page size of 4K and blobs with an average size of ~5K, they are stored not on data pages but on additional blob pages.
But if you back up your database and restore it with an 8K page size, the blobs will fit on the data pages and will be stored together with the records, causing high record fragmentation.
IBAnalyst marks such tables as Pale (in the Records column), and the hint shows the estimated record count for the table (based on the data page count) and the real average fill value (%).
If your query reads any fields except blobs from such a table, a natural scan, join or aggregate will run very slowly.
The only solution: create an additional table (linked 1:1 to the original one) and move into it all blob columns whose average size is less than the page size (see the sketch below).
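
A minimal sketch of this separation, with hypothetical names (DOCS is the original table with integer key ID, DOC_BLOBS holds the moved blob column):

  CREATE TABLE DOC_BLOBS (
    DOC_ID  INTEGER NOT NULL PRIMARY KEY, /* same key as DOCS, 1:1 link */
    CONTENT BLOB SUB_TYPE 0
  );

  /* copy the blob data out of the original table ... */
  INSERT INTO DOC_BLOBS (DOC_ID, CONTENT)
    SELECT ID, CONTENT FROM DOCS WHERE CONTENT IS NOT NULL;
  COMMIT;

  /* ... then drop the blob column from the original table */
  ALTER TABLE DOCS DROP CONTENT;

A subsequent backup/restore (with the same page size!) will physically rebuild the original table without the blob fragments.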

In that case, don't try to backup/restore with a greater page size! Blobs that couldn't fit on data pages at the current page size will be placed on data pages during a restore with a bigger page size, so your tables with blobs will become even more fragmented than before.
It is also not recommended to restore with a smaller page size, because that can decrease performance for indices and non-blob tables.
And don't try to change blob fields to varchar fields: varchar fields are always stored as part of the record, so a record that doesn't fit on one data page will consist of 2 or more fragments (be placed on 2 or more data pages).
P.S. IBAnalyst can report such tables "by mistake", for example when a table used to have blob fields with data, but they were dropped from the table structure. Unfortunately there is no configurable option for this warning, because it is calculated directly from the data reported by the server (statistics).

6. VerLen and RecLength relation

a) VerLen >= 90% of RecLength: the versions you see in the Version column are mostly record deletes. The more records are deleted, the smaller RecLength becomes (down to 0 bytes). VerLen can also be greater than RecLength if you update the table with longer string data than was stored in the original records.
b) VerLen <= 80% of RecLength: the versions are mostly record updates.
We can't distinguish these cases more precisely, because the statistics show the average record and version size for the whole table, while the number of versions visible to concurrent transactions may vary.

7. Why does IBAnalyst name some indices "bad"?

Indices having a selectivity value lower than 0.01 are marked as "bad" in IBAnalyst (see the Index view help). There are several reasons to call a particular index bad:
1.    The selectivity of the index is lower than 0.01. Theoretically the optimizer must not use such an index, but it does if no other suitable indices exist (for the where, order by or join clause, at least).
2.    Such an index causes very slow garbage collection. This problem does not exist in InterBase 7.1/7.5, and will be fixed in Firebird 2.0.
3.    Such an index makes the restore process very slow, and it is created very slowly (create index / alter index active). This is because the chain of record numbers for a single index key is long.
4.    If this index is used in a where clause, memory usage will depend on the value being searched (the bitmask size). Since the record chain can be long (lots of key duplicates), memory consumption will also be high.
5.    If the index is used in "order by" and the duplicates sit mostly in the lower key values (depending on the index sort order), there will be lots of index page reads, which will slow the query down.
That's why IBAnalyst can't ignore the existence of such indices.
The worst case for an index is when its Uniques column = 1, i.e. all values in the indexed column are the same. Such indices are listed under "Useless indices" on the Summary page.
Of course, for your application such an index may be "good": for example, if records have an "archive" flag in some column, and your application searches by the index on that column only for current, not archived, data. So it's up to you to decide whether we are right to call such an index "bad" or not.
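
As a rough server-side check (this is not the same calculation IBAnalyst performs from the full statistics), you can look at the selectivity stored in RDB$INDICES.RDB$STATISTICS, where the value is 1 / (number of distinct keys), so values close to 1 mean lots of duplicates:

  SELECT RDB$INDEX_NAME, RDB$RELATION_NAME, RDB$STATISTICS
    FROM RDB$INDICES
    WHERE RDB$STATISTICS > 0.01 /* fewer than 100 distinct key values */
    ORDER BY RDB$STATISTICS DESC;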

8. What if a "bad" index is created by a Foreign Key constraint?

Well, the previous paragraph shows that it's better to drop "bad" indices (unless you use them to search for keys that have fewer duplicates than the other keys). But if such an index was created by a foreign key, you can drop it only by dropping the foreign key, and dropping the foreign key disables the relation check constraint, which can be unacceptable.
You can replace the FK with triggers, but with some restrictions. An FK checks record relations using the index, and the index "sees" all keys of all records independently of transaction state, while triggers work only in the client's transaction context. So, when replacing an FK with triggers, you must be sure that:
  • Records are never deleted from the master table, or are deleted only in "snapshot table reserving" mode.
  • The column used by the PK in the master table is never modified. You can enforce this with a before update trigger.
If you can maintain these conditions, you can drop that particular Foreign Key (see the sketch below). Of course, do not create an index on that column manually.
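
A minimal sketch of such replacement triggers, with hypothetical names (MASTER with primary key ID, DETAIL with column MASTER_ID that used to be the FK):

  CREATE EXCEPTION E_PK_IMMUTABLE 'Master PK must not be modified';
  CREATE EXCEPTION E_NO_MASTER 'No master record for this detail';

  SET TERM ^ ;

  /* enforce the second condition: the master PK is never changed */
  CREATE TRIGGER MASTER_BU FOR MASTER
  BEFORE UPDATE
  AS
  BEGIN
    IF (OLD.ID <> NEW.ID) THEN
      EXCEPTION E_PK_IMMUTABLE;
  END ^

  /* check that the master record exists, instead of the dropped FK */
  CREATE TRIGGER DETAIL_BI FOR DETAIL
  BEFORE INSERT
  AS
  BEGIN
    IF (NEW.MASTER_ID IS NOT NULL AND
        NOT EXISTS (SELECT 1 FROM MASTER WHERE ID = NEW.MASTER_ID)) THEN
      EXCEPTION E_NO_MASTER;
  END ^
  /* a similar BEFORE UPDATE trigger is needed on DETAIL */

  SET TERM ; ^

Unlike a real FK, these checks see only data visible to the current transaction, which is why the two conditions above must hold.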

9. Why does the Data version percent row show only 12 megabytes of data, when I have a 140-megabyte database?

1.    IBAnalyst here shows the "pure" data volume, not counting other database structures (indices, metadata...) and page fragmentation.
2.    After a restore, InterBase and Firebird leave some free space (15-25%) on data pages to make future updates/deletes faster.
3.    There is a specific server behavior where it leaves data pages ~50% fragmented if the table record size is small, about 11-22 bytes.

10. How to improve optimizer performance in case of frequent updates

Index statistics are stored in the RDB$INDICES.RDB$STATISTICS column and are updated in 3 ways:
1.    SET STATISTICS INDEX
2.    ALTER INDEX ACTIVE, or CREATE INDEX ...
3.    the restore process (all indices are rebuilt, the same as with "ALTER INDEX ACTIVE")
The optimizer uses this statistics information to prepare queries. Using the statistics values, the optimizer can decide whether an index is "good enough" or "not useful" for retrieving records.
If the statistics have not been updated for a long time, the optimizer may produce a bad plan, because the existing statistics values no longer correspond to the actual state of affairs: the table data can have changed significantly (for example, the number of records grew 5-10 times, or, vice versa, all records were deleted).
You can replace a bad automatic query plan with an explicit PLAN for a particular query, but this is not a good approach, because the data can change significantly again after the plan was written.
The alternative (and right) way is to refresh the statistics periodically by applying the SET STATISTICS statement to all indices. You can schedule an SQL script to refresh the statistics using ISQL (see the sketch below), or use the ready-made gidx tool (Windows only).
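
A minimal sketch of generating such a script, assuming you want all indices on user tables (spool the output into a file, then run it as a script and commit):

  SELECT 'SET STATISTICS INDEX ' || RDB$INDEX_NAME || ';'
    FROM RDB$INDICES
    WHERE RDB$RELATION_NAME NOT STARTING WITH 'RDB$';
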
However, if you have tables whose records are periodically reloaded, this approach will not help. Let's consider an example:

  • Table A is loaded with data 4-5 times per day.
  • After the loaded data is processed, all records in table A are deleted.
In this case there are 2 correct statistics values for the indices on table A: one for when it is loaded with data, and one for when it is empty. So statistics recomputed on the loaded table will be useless when the table is empty, and vice versa.
To avoid this, recompute the statistics for the indices on table A only when the table is filled with data, ideally right before the queries on that table are run (see the sketch below).
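
A minimal sketch of such a targeted refresh, assuming the table is named A (generate and run these statements from the loading application, right after the data is loaded):

  SELECT 'SET STATISTICS INDEX ' || RDB$INDEX_NAME || ';'
    FROM RDB$INDICES
    WHERE RDB$RELATION_NAME = 'A';
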
Since version 1.91, IBAnalyst shows the index statistics difference and allows you to recompute it at any moment. First look at the table's record information: is the record count the usual average or not? If yes, you can safely recompute the index selectivity. If not, it may be better not to touch the index statistics, because doing so can cause the optimizer to produce even worse query plans.