Varying-Dimension Arrays for RPG
The new IBM i 7.4 announcement brought with it varying-dimension arrays, a new data definition keyword and more.
By Jon Paris
06/01/2019
With the IBM i 7.4 announcement, Santa certainly did not disappoint: we have a terrific new feature that we have wanted for a long, long time, namely dynamic arrays or, as IBM prefers to call them, "varying-dimension arrays." In addition, we also have a new data definition keyword, SAMEPOS, which makes it easier to redefine data items without the restrictions of OVERLAY. Last but not least, there are also two new fields available in the PSDS. We'll talk more about SAMEPOS and the PSDS enhancements later, but for now let's focus on the enhancements to arrays.
If you’re like us, whenever you set up an array you often find yourself conflicted as to whether to make the array as big as it could ever conceivably need to be, or to set a more practical limit. The first risks wasting memory and the second risks program failure. In addition, of course, we inevitably do this in the knowledge that no matter what choice we make, one day it is just not going to be big enough!
Those days are now behind us—and we can now set the maximum size of the array to a suitably massive value, secure in the knowledge that RPG will only use enough memory to hold the active content. Even better, because RPG is keeping track of the highest element in use in the array, we can use operations such as SORTA and/or %LOOKUP without having to consider using %SUBARR to restrict the operation to active elements. That alone is good reason for us to use this support.
Types of Dynamic Arrays
The actual syntax for defining these arrays is:
ArrayName DIM ( type : MaximumElements )
The MaximumElements value is the largest size that the array can ever be. Think of this as a "safety valve" to prevent a logic (or data) error from causing a program to use ridiculously high index values. So an array defined as:
Dcl-S myAutoArray Char(10) Dim ( *Auto: 1000 );
Starts life with zero active elements and would grow to 900 if the following code were executed.
myAutoArray(900) = 'Highest';
Any attempt however to access an element higher than the limit of 1,000 would cause an error, just as it would with a conventional RPG array defined as Dim(1000).
Contrast this with an array defined as:
Dcl-S myVarArray Char(10) Dim ( *Var: 1000 );
In this case, without other action on the programmer's part, even an attempt to use an index value of 1 would result in an error. This is because this type of array also starts with zero active elements, but will only grow when told to do so. So how do we tell it to grow? Through use of an old friend in a new role - the %Elem() built-in. We can now use %Elem() on the left hand side of an expression to set the current number of active elements. So if we wanted to place a value in element 900, we would first have to code:
%Elem( myVarArray ) = 900;
And then we could do:
myVarArray(900) = 'Highest';
The value returned by %Elem() for both of these new types of arrays is also slightly different from conventional arrays. For these types of arrays it will return the current active size, as opposed to the maximum size. If at any time you do need to know the absolute maximum size for the array, you can obtain this by specifying the keyword *Max as the second parameter. For example, if you wanted to ensure that an array index for an *Auto array would not cause an error you could code:
If index <= %Elem( myAutoArray : *Max );
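To pull the *VAR rules together, here is a minimal sketch of sizing such an array, growing it explicitly, and guarding an index against the maximum. The names readings and idx and the values used are purely hypothetical, and the code assumes a 7.4 (or later) compiler.

```rpgle
**free
// Hypothetical example: names and values are for illustration only.
Dcl-S readings Packed(7:2) Dim( *Var : 1000 );
Dcl-S idx Int(10);

// A *VAR array starts with zero active elements, so size it first.
%Elem( readings ) = 10;

For idx = 1 to %Elem( readings );
  readings(idx) = idx * 1.5;
EndFor;

// Growing the array again needs another explicit assignment to %Elem().
idx = 25;
If idx <= %Elem( readings : *Max );   // Never exceed the declared maximum
  If idx > %Elem( readings );         // Grow only when actually needed
    %Elem( readings ) = idx;
  EndIf;
  readings(idx) = 99.99;
EndIf;

Dsply ('Active elements = ' + %Char(%Elem(readings)) );
Dsply ('Maximum capacity = ' + %Char(%Elem(readings : *Max)) );

*InLR = *On;
Return;
```

The nested test is a design choice on our part: it avoids accidentally shrinking the array when the index is already within the active range.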
*AUTO arrays also offer another nice feature. You don't actually have to keep track of the highest index used. Simply specify the index value as *Next and RPG will work it out for you. So we can code something like this:
Dcl-S myAutoArray Int(5) Dim( *Auto : 999 );
Dcl-S i Int(5); // Loop counter

Dsply ('Start: Active elements = ' + %Char(%Elem(myAutoArray)) );
For i = 1 to 50;
  myAutoArray( *Next ) = i;
EndFor;
Dsply ('End: Active elements = ' + %Char(%Elem(myAutoArray)) );
Dsply ('Maximum capacity of array is: ' + %Char(%Elem(myAutoArray : *Max)) );
This is more in line with the way most other modern languages treat arrays and should make life a little easier for all of us, but particularly for newcomers to RPG.
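Earlier we mentioned that SORTA and %LOOKUP no longer need %SUBARR to restrict them to the active elements. A small sketch of what that might look like follows. The array name, its contents, and the variable pos are our own invention for the purpose of illustration.

```rpgle
**free
// Hypothetical example: names and data are for illustration only.
Dcl-S names Char(10) Dim( *Auto : 500 );
Dcl-S pos Int(10);

names( *Next ) = 'Delta';
names( *Next ) = 'Alpha';
names( *Next ) = 'Charlie';
names( *Next ) = 'Bravo';

// Only the four active elements take part in the sort,
// so no %SubArr is needed to exclude unused entries.
SortA names;

// Likewise, the search covers only the active elements.
pos = %Lookup( 'Charlie' : names );

Dsply ('Charlie found at index ' + %Char(pos) );

*InLR = *On;
Return;
```

With a conventional Dim(500) array, the same sort would have pushed the four names behind 496 blank elements unless %SUBARR was used.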
(Not Quite So) Dynamic Arrays
Limitations and Other Considerations
An additional limitation is that varying dimension arrays cannot be used in old-style calc specs. This isn’t much of a limitation really since anyone who hasn't changed their coding style to free-form in the last 18 years is not likely to use anything as modern as varying dimension arrays in the first place.
This is more of a warning than a limitation really, but you need to be aware that any time you increase the current size of the array the system may need to move it in memory. As a result, if you use %Addr to obtain its address (for example to pass the array to a C-style API) you must refresh that address prior to use or risk passing the wrong data.
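The following sketch shows the kind of sequence we have in mind. The names buffer and pBuffer are hypothetical, and the API call itself is left as a comment since the details depend entirely on the API in question.

```rpgle
**free
// Hypothetical example: names are for illustration only.
Dcl-S buffer Char(100) Dim( *Auto : 5000 );
Dcl-S pBuffer Pointer;

buffer(1) = 'First entry';
pBuffer = %Addr( buffer );   // Good only until the array next grows

buffer(2000) = 'Later entry'; // Growth: the storage may have moved

pBuffer = %Addr( buffer );   // Refresh immediately before use
// ... pass pBuffer to the C-style API here ...

*InLR = *On;
Return;
```

The simplest way to stay safe is to take the address as late as possible, right before the call, rather than saving it early and reusing it.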
Similarly, since the debugger is blissfully unaware that the array size may change, you must avoid trying to access elements beyond the current size while debugging or you'll get confused. RPG provides the special variable _QRNU_VARDIM_ELEMS_arrayname, which contains the current number of active elements and can be displayed in debug to confirm the current element count. Note, however, that this is a copy of the true variable and so changing its value during a debug session will have zero effect.
Enhancements You Can Use Now
The SAMEPOS keyword doesn't provide a new capability, but it does make certain types of definitions easier to implement and more obvious to those who come after. For example, back in 2003 we published an article called D-Spec Discoveries.
In that article we discussed techniques for redefining contiguous fields as arrays. Those techniques still work, but the new SAMEPOS keyword makes life much simpler. Assume that we have a file that contains 12 contiguous fields, one for each month of the year (JanSales, FebSales, etc.) In order to be able to treat those monthly sales figures as an array all we need to do is to code an externally described DS based on the file like so:
Dcl-Ds SalesData Ext;
  MonthlySales Like(JanSales) Dim(12) SamePos(JanSales);
End-Ds;
Simple and clean. SAMEPOS identifies the starting position for the item but, unlike the POS keyword, or even the old from/to notation, it is soft-coded. As a result, if the layout of the data changes it will automagically adjust with the next compile—a much safer way to do it.
There are many other scenarios where this feature comes in handy. One that we have encountered recently concerns finding an effective way to handle multi-format records, such as System/36-style header/detail files. These are not as uncommon as you might think and can be found in mainframe conversions and in some types of EDI records. Here's an example of an S/36-style file and how it can be handled without resorting to I-specs, pointers, moving data around, or any of the other techniques we used to use.
Dcl-F S36File Disk(40); // Program described input file

Dcl-Ds RecordLayout Qualified;
  // Common fields
  CustNo char(5);
  RecordId char(1);
  OrderNumber zoned(5);
  // Layout for header
  Dcl-Ds Header;
    OrderDate date(*USA);
    ItemCount int(5);
    OrderTotal packed(7:2);
  End-Ds;
  // Layout for Detail
  Dcl-Ds Detail SamePos(Header); // <<===
    ItemCode char(7);
    ItemQty packed(5);
    ItemPrice packed(7:3);
  End-Ds;
End-Ds;
Read S36File RecordLayout; // Load record into DS
There are many other uses for this new keyword. For example, you could group together separate day, month and year fields into a single date field. We're sure you'll find your own uses.
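As a rough sketch of the date idea, the following defines three separate numeric fields and uses SAMEPOS to view them as a single eight-digit value. All of the names here are hypothetical, and we have assumed zoned fields in year, month, day order so that the combined value can be converted with %Date() in *ISO format.

```rpgle
**free
// Hypothetical example: names and layout are for illustration only.
Dcl-Ds orderDate Qualified;
  ordYear  Zoned(4);
  ordMonth Zoned(2);
  ordDay   Zoned(2);
  // Redefines all three fields as one yyyymmdd number
  yyyymmdd Zoned(8) SamePos(ordYear);
End-Ds;

Dcl-S realDate Date;

orderDate.ordYear  = 2019;
orderDate.ordMonth = 6;
orderDate.ordDay   = 1;

// Convert the combined numeric value to a true date field
realDate = %Date( orderDate.yyyymmdd : *ISO );

Dsply ( %Char(realDate) );

*InLR = *On;
Return;
```

Because the position is soft-coded, adding a field ahead of ordYear later on would not require any change to the yyyymmdd definition.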
That completes our initial review of the new RPG features. We'll be back with a follow-on article where we'll discuss some of the more esoteric aspects of the Dynamic Array support, and in particular how to use these new-style arrays as parameters.
Until then, if you have any comments or questions please let us know via the comments section.
Jon Paris is an IBM Systems magazine, Power Systems edition technical editor.