4690 POS Keyed Files Characteristics and Limitations
Table of Contents
POS Keyed Files
Objective
Implementation
Which Keyed File Hashing Algorithm Should Be Used?
History of Hashing Algorithms on 4680/90
The Answer
How to CREATE a Keyed File with Algorithm 1 or 2
Other Keyed File Considerations
Packing Factor
Randomizing Divisor
W798 - Message from adding to a keyed file
New Hashing Algorithm for the TOF Item File
Support 'Read with Lock' and 'Write with Unlock' to allow shared read/write
access to the same file (database).
Implementation
The following are some of the assumptions behind the 4690 Keyed File implementation.
They follow largely from the objectives and from the type of data that is
processed in the POS store environment.
The file size is fixed at creation (build) time. The file is created for a selected
number of records that the user expects will meet the requirements for that file for 'x'
number of years. For example, a store has 100,000 unique items today but expects to
grow to as many as 500,000 unique item codes in five years. The file
will be created for 625,000 possible records. (That is room for 500,000 records at a
'packing factor' of 80%: 500,000 records fit in 80% of the available space. The extra space
lets records be placed efficiently, keeping the number of accesses per record to a minimum.)
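The sizing arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function name and the use of a whole-number percentage are assumptions made here and are not part of any 4690 utility.

```python
def records_to_allocate(expected_records: int, packing_factor_percent: int) -> int:
    """Return the number of record slots to create so that expected_records
    occupy no more than packing_factor_percent of the file.

    Hypothetical helper for illustration; integer math avoids rounding surprises.
    """
    if expected_records < 0:
        raise ValueError("expected_records must not be negative")
    if not 0 < packing_factor_percent <= 100:
        raise ValueError("packing_factor_percent must be in the range 1..100")
    # Ceiling division: round up so the file is never under-sized.
    return (expected_records * 100 + packing_factor_percent - 1) // packing_factor_percent


# The example from the text: 500,000 records at an 80% packing factor.
print(records_to_allocate(500_000, 80))  # 625000
```

Rounding up rather than down is a deliberate choice in this sketch, so that the planned record count always fits within the chosen packing factor.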
This allows the file to grow without reorganization of the file; i.e. no store management
of the database is required. If a file eventually becomes 'full', the file will need to be
rebuilt for a larger size. There are utilities to 'create a direct file from a keyed file' and to
'create a keyed file from a direct file' which can aid in this process.
The limits of fixed-length records and a single level of key may be disadvantages for some
generic requirements. You cannot have multiple records with the same key. For example,
you cannot find all the 'Smiths' in the database using the keyed file services.
Support of 'Read with Lock' and 'Write with Unlock' allows shared read/write access to
the same file. Lock occurs only at the 'block size' level. 'Block' size is a sector, 512 bytes
in 4690. It is the size of the 'record' that the keyed file system manages. A hashed key
would point to a block. The system reads the 'block' and the record is likely to be one of
several records in that block. 4690 OS locks at the block level in order to reduce the
possibility of locking out other concurrent updates to other records in the same database.
History of Hashing Algorithms on 4680/90
The IBM Folding Algorithm (0) was the first algorithm offered, in 4680 OS V1R1, and it
remains the default algorithm. It was found that algorithm 0 did not achieve a good
distribution of records in the keyed file when the file exceeded 3 megabytes in size.
The XOR Rotation Hashing Algorithm (1) was eventually introduced because it was
found that it could achieve a more even distribution of records in keyed files greater than
3 megabytes in size.
Eventually it was discovered that neither of these two hashing algorithms would provide
good enough distribution of records through the file if the keys were made up of ASCII
characters. Hashing algorithm 2, the polynomial hashing algorithm, was introduced to
remedy the problem of ASCII keys.
The Answer
Faced with these three choices, users can understandably become confused over the
question of which algorithm will give the best results in any given case. Fortunately,
there is a simple answer.
In all but one case, hashing algorithm 2, known as the polynomial hashing algorithm,
should give results as good as or better than algorithms 0 or 1. Algorithm 2 is the way to go.
The one exception to this rule is with the Supermarket Application Item Movement File,
EAMIMOVE.DAT. With EAMIMOVE.DAT, you must use algorithm 0. The reason is
that the Supermarket Application does its own hashing into this file by implementing
algorithm 0 within the application.
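The rule above is simple enough to write down as code. The following Python sketch is only an illustration of that rule; the function name is hypothetical, and matching on a bare file name (without drive or path) is an assumption made here.

```python
# Hashing algorithm numbers as described in the text.
FOLDING = 0        # IBM Folding Algorithm (the default)
XOR_ROTATION = 1   # XOR Rotation Hashing Algorithm
POLYNOMIAL = 2     # Polynomial hashing algorithm


def recommended_algorithm(file_name: str) -> int:
    """Return the hashing algorithm suggested by the rule in the text.

    Hypothetical helper: expects a bare file name such as 'EAMIMOVE.DAT'.
    """
    # The Supermarket Application hashes into this file itself using
    # algorithm 0, so the file must be created with algorithm 0.
    if file_name.strip().upper() == "EAMIMOVE.DAT":
        return FOLDING
    # In every other case the text recommends the polynomial algorithm.
    return POLYNOMIAL


print(recommended_algorithm("EAMIMOVE.DAT"))  # 0
print(recommended_algorithm("EALITEMR.DAT"))  # 2
```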
How to CREATE a Keyed File with Algorithm 1 or 2
The 4680 BASIC "CREATE" statement does not allow an algorithm to be selected. It
defaults to the Folding algorithm (0). In order to create a keyed file from an application
with either the XOR Rotation Algorithm or the Polynomial Hashing Algorithm, a User
Logical Name must have been previously defined. The new User Logical Name is the file
logical name appended with an "H". The value assigned should be "0", "1", or "2"
depending upon the algorithm to be used.
For example, to have the Item File, whose logical name is EALITEMR, use the
polynomial algorithm when the application CREATEs the file, set the User Logical
Name EALITEMRH to the value 2.
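The naming rule can be expressed as a small helper. The Python sketch below only computes the User Logical Name and the value to assign; it does not define anything on a 4690 system, and the function names are hypothetical.

```python
VALID_ALGORITHMS = {
    "0": "Folding",
    "1": "XOR Rotation",
    "2": "Polynomial",
}


def hashing_logical_name(file_logical_name: str) -> str:
    """Return the User Logical Name that selects the hashing algorithm:
    the file logical name with an 'H' appended."""
    return file_logical_name + "H"


def hashing_definition(file_logical_name: str, algorithm: int) -> tuple[str, str]:
    """Return the (name, value) pair to define before the file is CREATEd.

    Hypothetical helper for illustration only.
    """
    value = str(algorithm)
    if value not in VALID_ALGORITHMS:
        raise ValueError('algorithm must be 0, 1, or 2')
    return hashing_logical_name(file_logical_name), value


# The Item File example from the text.
print(hashing_definition("EALITEMR", 2))  # ('EALITEMRH', '2')
```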
For more information, see the "4680 Store Systems Programming Guide" - 'Using the
Alternate Hashing Algorithms'.
LRECL (bytes)     Recommended Packing Factor
>254              50%
170-254           55%
128-169           65%
1-127             75%
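The recommended packing factors listed above amount to a lookup on the record length (LRECL). A minimal Python sketch of that lookup follows; the function name is an assumption made for illustration.

```python
def recommended_packing_factor(record_length: int) -> int:
    """Return the recommended packing factor (percent) for a record
    length in bytes, following the ranges given in the text.

    Hypothetical helper for illustration only.
    """
    if record_length < 1:
        raise ValueError("record_length must be at least 1 byte")
    if record_length > 254:
        return 50
    if record_length >= 170:
        return 55
    if record_length >= 128:
        return 65
    return 75  # 1-127 bytes


for lrecl in (100, 128, 200, 300):
    print(lrecl, recommended_packing_factor(lrecl))
```

Longer records get a lower packing factor, so a file with long records should be created with proportionally more spare room.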
Randomizing Divisor
The Randomizing Divisor selected by the system is typically very good. It is not
necessary to try to select a different one.
W798 - Message from adding to a keyed file
If the keyed file is more than 80% full, then the file is becoming a candidate for being
rebuilt as a larger keyed file. Eventually the file could become so
full that the system has to search for several seconds for free space. This has the potential
to freeze access to the file system for the duration of the search.
If the keyed file is less than 80% full, then check the hashing algorithm. If algorithm "0"
is used, then changing the algorithm to "1" or "2" will probably better suit this file. See
the notes above on "which algorithm" for guidance.
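The two checks described above can be summarized as a short decision function. This Python sketch is an illustration only: the function name and return strings are hypothetical, and treating a file that is exactly 80% full like one below 80% is an assumption, since the text does not cover that case.

```python
def w798_advice(percent_full: float, algorithm: int) -> str:
    """Suggest a next step after a W798 message, following the text.

    Hypothetical helper; percent_full is how full the keyed file is (0-100),
    algorithm is the hashing algorithm the file was created with (0, 1, or 2).
    """
    if algorithm not in (0, 1, 2):
        raise ValueError("algorithm must be 0, 1, or 2")
    if percent_full > 80:
        # The file is running out of room: rebuild it as a larger keyed file.
        return "rebuild as a larger file"
    if algorithm == 0:
        # Plenty of room but poor distribution: try algorithm 1 or 2.
        return "change hashing algorithm to 1 or 2"
    # The text gives no specific step for this case.
    return "review the algorithm guidance"


print(w798_advice(85, 2))
print(w798_advice(60, 0))
```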
the file may be less than 3 MB and does not have ASCII keys, it will provide a better
distribution of records for a tightly packed file.
In order to change the algorithm for the GSA or SA TOF Item File, the logical name of
the internal "work file" must be changed in order to get the EALIMAGE or EAMIMAGE
file set for a new algorithm.
The EAxIMAGE file is actually created as a local WORK FILE. This work file then has
its distribution mode changed and the file is renamed to EAxIMAGE. The WORK FILE
is the file that must have a logical name in order to be created with hashing algorithm 1 or
2. This file is named "WRKIMAGE".
The logical name that should be set to the hashing value is:
WRKIMAGEH