Programmer's Guide (v9 SP2 (9.5) revision 1)

Estimating File Size

You can estimate the number of pages, and therefore the number of bytes, required to store a file. However, the formulas only approximate the file size, because the transactional interface manipulates pages dynamically.


Note
The following discussion and the formulas for determining file size do not apply to files that use data compression, because the record length for those files depends on the number of repeating characters in each record.

While the formulas are based on the maximum storage required, they assume that only one task is updating or inserting records into the file at a time. File size increases if more than one task updates or inserts records into the file during simultaneous concurrent transactions.

The formulas also assume that no records have been deleted yet from the file. You can delete any number of records in the file, and the file remains the same size. The transactional interface does not deallocate the pages that were occupied by the deleted records. Rather, the transactional interface re-uses them as new records are inserted into the file (before allocating new pages).

If the final outcome of your calculations contains a fractional value, round the number to the next highest whole number.

To estimate the size of your file, perform the following steps:

  1. Calculate the number of data pages, using the following formula:

    Number of data pages =
    #r / ( (PS - FO) / PRL )

    where:

    • #r is the number of records
    • PS is the page size
    • FO is the file overhead
    • PRL is the physical record length

    To find the physical record length, refer to Table 5-8. To find the file overhead, refer to Table 5-6 and Table 5-7.

  2. Calculate the number of index pages for each defined key using one of the following formulas.

    For each key that does not allow duplicates or that allows repeating-duplicatable keys:

    Number of index pages =
    ( #r / ( (PS - KO) / (KL + 8) ) ) * 2

    where:

    • #r is the number of records
    • PS is the page size
    • KO is the key overhead
    • KL is the key length

    For each key that allows linked-duplicatable keys:

    Number of index pages =
    ( #UKV / ( (PS - KPO) / (KL + 12) ) ) * 2

    where:

    • #UKV is the number of unique key values
    • PS is the page size
    • KPO is the key page overhead
    • KL is the key length

    The B-tree index structure guarantees at least 50 percent usage of the index pages. Therefore, the index page calculations multiply the minimum number of index pages required by 2 to account for the maximum size.

  3. If your file contains variable-length records, calculate the number of variable pages in the file and add that number to the sum from the preceding steps. To do so, use the following formula:

    Number of variable pages =
    (AVL * #r) / (1 - (FST + (VPO/PS) ) )

    where:

    • AVL is the average length of the variable portion of a typical record
    • #r is the number of records
    • FST is the free space threshold
    • VPO is the variable page overhead
    • PS is the page size

    Note
    You can obtain only a very rough estimate of the number of variable pages, because it is difficult to estimate the average number of records whose variable-length portions fit on the same page.
  4. To the sum obtained in the preceding steps, add the following:

    • 1 page for each alternate collating sequence page used (if any)
    • 1 page for a referential integrity (RI) page if the file has RI constraints

    This new sum represents the estimated total number of logical pages that the file will contain.

  5. Calculate the number of Page Allocation Table (PAT) pages, and add that number to the estimated number of logical pages from the preceding step. (For more information about PAT pages, refer to Page Preallocation.)

    Every file has a minimum of two PAT pages. To calculate the number of PAT pages in a file, use one of the following formulas:

    For pre-8.x file formats:

    Number of PAT pages =
    ( (Sum of data, index, and variable pages from Steps 1 through 3) * 4 ) / ( (Page size - 8 bytes for overhead) * 2 )

    For 8.x or later file formats:

    Number of PAT pages =
    ( (Sum of data, index, and variable pages from Steps 1 through 3) * 6 ) / ( (Page size - 20 bytes for overhead) * 2 )
  6. To the sum obtained in the preceding step, add 2 pages for the FCR pages. If you are using the 8.x or later file format, also add 2 pages for the DIR (directory) pages.

    Depending on the Pervasive PSQL file format version, the page sizes for FCR, DIR, and PAT pages differ from the normal page sizes for data, key, and variable pages, which affects the file size. See the following table for the sizes of the special pages that correspond to the normal page size you are using:

    Table 5-13 Page Sizes by File Format

    (Entries/Page is the number of entries in a page.)

    Normal     File Format v6.x-7.x        File Format v8.x            File Format v9.x            File Format v9.5
    Page Size  Page Size     Entries/Page  Page Size     Entries/Page  Page Size     Entries/Page  Page Size     Entries/Page
    512        2048          320           2048          320           2048          320           N/A           N/A
    1024       2048          320           2048          320           2048          320           4096          480
    1536       3072          480           3072          480           3072          480           N/A           N/A
    2048       4096          640           4096          640           4096          640           4096          480
    2560       5120          800           5120          800           5120          800           N/A           N/A
    3072       6144          960           6144          960           6144          960           N/A           N/A
    3584       7168          1120          7168          1120          7168          1120          N/A           N/A
    4096       8192          1280          8192          1280          8192          1280          8192          1280
    8192       N/A           N/A           N/A           N/A           N/A           N/A           16384         16000
    16384      N/A           N/A           N/A           N/A           N/A           N/A           16384         16000

  7. Finally, add the estimated size of the pool of unused pages in the file. The transactional database engine uses the pool for shadow paging. To calculate the size of the pool, use the following formula:

    Size of the pool of unused pages =
    (Number of keys + 1)

    This formula applies if tasks execute Insert, Update, and Delete operations only outside transactions. If tasks execute these operations inside transactions, multiply the non-transactional figure given by the formula by the average number of Insert, Update, and Delete operations expected in the transactions. Similarly, you must further increase the estimated size of the pool of unused pages if tasks execute simultaneous concurrent transactions.

  8. Having calculated the number of pages that the file needs, use the following formula to calculate the maximum number of bytes required to store the file:

    File size in bytes =
    Total file pages * Page size
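
The procedure above can be sketched in Python as a rough illustration. This is a minimal sketch under several assumptions, not a definitive implementation. The names `KeySpec`, `data_pages`, `index_pages`, `pat_pages`, and `estimate_file_pages` are hypothetical and not part of any Pervasive PSQL API. The overhead values (FO, KO, KPO) and the physical record length must be looked up in the tables referenced in the steps, and the numbers in the example are placeholders only. The sketch rounds each page count up to a whole page, uses the normal page size for every page (it ignores the special page sizes in Table 5-13), takes the variable page count as an input rather than computing it, and uses the non-transactional pool size.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KeySpec:
    """One key definition (illustrative names, not PSQL API names)."""
    key_length: int                  # KL, in bytes
    overhead: int                    # KO or KPO, looked up by the caller
    linked_duplicates: bool = False  # True for linked-duplicatable keys
    unique_key_values: int = 0       # #UKV, used only for linked duplicates


def data_pages(num_records, page_size, file_overhead, physical_record_length):
    """Step 1: #r / ((PS - FO) / PRL), rounded up."""
    records_per_page = (page_size - file_overhead) / physical_record_length
    return math.ceil(num_records / records_per_page)


def index_pages(key, num_records, page_size):
    """Step 2: entries / entries-per-page, doubled for 50 percent page usage."""
    if key.linked_duplicates:
        entries, entry_size = key.unique_key_values, key.key_length + 12
    else:
        entries, entry_size = num_records, key.key_length + 8
    entries_per_page = (page_size - key.overhead) / entry_size
    return math.ceil(entries / entries_per_page * 2)


def pat_pages(pages_steps_1_to_3, page_size, format_8_or_later=True):
    """Step 5: PAT pages, never fewer than the two every file has."""
    if format_8_or_later:
        raw = (pages_steps_1_to_3 * 6) / ((page_size - 20) * 2)
    else:
        raw = (pages_steps_1_to_3 * 4) / ((page_size - 8) * 2)
    return max(2, math.ceil(raw))


def estimate_file_pages(num_records, page_size, file_overhead,
                        physical_record_length, keys: Sequence[KeySpec],
                        variable_pages=0, acs_pages=0, has_ri=False,
                        format_8_or_later=True):
    """Steps 1 through 7: estimated total number of pages in the file."""
    core = data_pages(num_records, page_size, file_overhead,
                      physical_record_length)
    core += sum(index_pages(k, num_records, page_size) for k in keys)
    core += variable_pages                    # Step 3, supplied by the caller
    total = core + acs_pages + (1 if has_ri else 0)             # Step 4
    total += pat_pages(core, page_size, format_8_or_later)      # Step 5
    total += 2                                # Step 6: FCR pages
    if format_8_or_later:
        total += 2                            # Step 6: DIR pages
    total += len(keys) + 1                    # Step 7: non-transactional pool
    return total


# Example with placeholder overheads (NOT real PSQL values).
example_keys = [KeySpec(key_length=4, overhead=12)]
pages = estimate_file_pages(num_records=10_000, page_size=4096,
                            file_overhead=8, physical_record_length=100,
                            keys=example_keys)
print(pages, "pages,", pages * 4096, "bytes")  # 312 pages, 1277952 bytes
```

Rounding each intermediate page count up, rather than only the final figure, slightly overstates the result. That fits the intent of the formulas, which estimate the maximum storage required.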
    
