When an RPG lookup occurs against a large array, the performance
implications can be significant if the same lookup is used
repetitively. A binary search can improve the performance of the
lookup. You should consider such an approach if your array size is
approximately 100 elements or greater and you perform a large number
of lookup operations.
If you have a 100 element array, the average number of comparisons to
find an element with the LOKUP operation is 50 (It takes 100
comparisons to determine a 'not found' condition). A binary search
can cut this to 7 comparisons. The performance difference increases
dramatically as the size of the array increases. For example, if a
500 element array exists, the average comparisons for LOKUP is 250.
A binary search will take only 9 comparisons. While there is more
overhead to perform a binary search, the major performance
consideration is the number of comparisons made.
The code provided includes a standard approach that can be copied
into any program with minor modifications.
A binary search requires that the data be in sequence and begins by
comparing against the midpoint. Each successive comparison cuts down
the total number of elements that must be considered by one-half.
When there is only one item left, the comparison can be made
directly.
The sample programs (TAARPGAC and TAARPGAR) have no merit other than
for demonstration and documentation purposes. The main intent is to
provide the subroutine (shown in TAARPGAR) and how it is linked to
and modified for your use in other programs.
Modification requirements
-------------------------
** The array must be defined to be in ascending sequence and your
data must be in sequence. In the sample program, the array
name is ARA.
** The actual number of elements that exist must be specified in
the Extension specification for the array. In the sample
program, the number is 30.
** Define the LKP array with 14 elements and 5 digits (0
decimals). This is used for a series of midpoints within the
array. Defining 14 elements allows for the sufficient
midpoints to cover the largest size RPG array (9999 elements).
** Move the search argument to LKPSRC and define LKPSRC (in the
sample program it is defined with a *DEFN operation).
** Include the LKPVAL subroutine in your program.
** Change the statement shown in the subroutine to describe the
actual number of elements in the array (approximately
statement 49). In the sample program, the number is 30.
** Change the statement shown in the subroutine for your array
name (approximately statement 68). In the sample program the
name is ARA.
** Delete the array data in the sample and insert your own array
data (approximately statement 102).
** Process the return code of Y or N in the LKPRTN field.
Note
----
If an array is specified to be in ascending or descending sequence,
but the LOKUP operation only requests an equal lookup, RPG will
search the entire array if the value does not exist. If a lookup
high or low is specified, the search will stop without searching the
entire array.
Test program
------------
The sample program has data in the range '01' - '20' and '31' - '40'.
Value '22' would be invalid. To test the program enter:
CALL TAARPGAC PARM('12')
This should produce a message with a Y (yes) response:
CALL TAARPGAC PARM('22')
This should produce a message with a N (no) response:
The parameter value can be modified to cause both valid and invalid
results.
The sample CL program calls the RPG program TAARPGAR. The RPG code
describes the binary search technique and can be used as a model to
copy into your programs.
What if the information to be searched must be changed
------------------------------------------------------
If you have an environment where the data to be searched changes
multiple times per day, the technique is not very adequate. You are
better off with a data base file and randomly accessing the
information. You could consider using the binary search technique
for the most frequently used items and handle the 'not found'
condition by checking the data base for seldom used or new items.
If the information changes on a periodic basis, there are two ways to
handle the changes:
** Keep the array data as source. Every addition or deletion
requires a change thru SEU and changing the two specifications
which describe the actual number of elements.
It is possible to have dummy entries (such as all 9's) to fill
up the array. This would allow infrequent modification of the
Extension and Calculation specs with little loss of
performance. However, you must test before starting the
search that a dummy value is not being used as the search
argument.
** Keep the array data in a file (or another source member).
Maintain the file with a separate function (e.g. SEU) and
then have a standard program which reads your normal RPG
specifications including the binary search subroutine, adds
the array data, modifies the two specifications and re-creates
the program.
The following describes a typical CL program to perform the
function. It assumes the skeleton source is in member
SKELETON and the array data to be added is in member
ARRAYDATA. The RPG program you are to create is named
BINSRCPGM:
CPYSRCF FROMFILE(QRPGSRC) TOFILE(QRPGSRC) +
FROMMBR(SKELETON) TOMBR(BINSRCPGM)
OVRDBF SOURCE TOFILE(QRPGSRC) MBR(BINSRCPGM)
OVRDBF ARRAY TOFILE(QRPGSRC) MBR(ARRAYDATA)
CALL TAARPGAR2
DLTPGM BINSRCPGM
MONMSG MSGID(CPF2105) /* Ignore not found */
CRTRPGPGM PGM(BINSRCPGM)
The TAARPGAR2 program source also exists in the QATTRPG file. The
source is not created into a program by CRTTAATOOL. The program is
intended as a model which you could use to copy into your program to
provide a specific function. The program finds the last source
statement in BINSRCPGM and starts adding the array data to the source
member. A count is kept of the array elements. When end of file is
reached for the array data, the source is updated for the Extension
spec (the number of elements in the array) and the Calculation spec.
Prerequisites
-------------
None.
Restrictions
------------
None.
Implementation
--------------
The normal solution is to include the subroutine from the TAARPGA
source by using SEU and then modify the statements as described.
Objects used by the tool
------------------------
Object Type Attribute Src member Src file
------ ----- --------- ---------- -----------
TAARPGAC *PGM CLP TAARPGAC QATTCL
TAARPGAR *PGM RPG TAARPGAR QATTRPG
TAARPGAR2 *PGM RPG TAARPGAR2 QATTRPG
|