
Playing the Numbers Part 2 – Converting Character to Numeric


As we promised in part 1 of "Playing the Numbers", this month we're covering the conversion of character strings to numeric values. Since we started last time with a basic data structure (DS) approach, we'll begin there again.

Using a Data Structure

Data Structures can handle rudimentary conversion situations. Take a look at the code below.

     d charNum         ds
     d  integer                       7s 0
     d  decimal                       7s 2 Overlay(integer)

     d numValue        s              7a   Inz('1234567')
     d shortNum        s              7a   Inz('1234')

        charNum = numValue;  
        // integer = 1234567, decimal = 12345.67

This type of conversion works in simple cases where the character string is the same length as the DS. We can even accommodate situations where the value must be treated as having decimal places, by overlaying a second definition as shown for the field decimal. Note that it is not necessary to divide the integer value by 100 to achieve the decimal placement, despite the number of examples we've seen where it was done that way! The value extracted from decimal will be 12345.67 as desired because the decimal point is only implied by the field definition; it is never stored in the data.

There are, of course, problems with the DS approach. Perhaps the most obvious one is that if you need to convert a character field that is shorter than the charNum DS, you can't simply assign the field to the DS. Instead you must use EVALR so that the string is right-adjusted within the DS.

In a zoned field (type S) such as integer, leading spaces are treated as zeros. So, as long as we right-align the string, this works. You would probably also want to add %TRIMR to the mix to ensure that there are no trailing spaces because, unlike leading spaces, a trailing space is not treated as zero and will cause a decimal data error.

Using the data definitions above, the conversion could be achieved by code like this:

        EvalR charNum = %TrimR(shortNum);
        // integer = 0001234, decimal = 00012.34

Another, more common, problem is that character representations of numeric data normally include a decimal point. Handling that with a DS would mean scanning for the decimal point, substringing the pieces, and so on. Luckily there is a better way!

Enter %DEC

Just as the %Char and %EditC Built-In-Functions (BIFs) assist in converting numerics to their character representations, so %Dec is one of a family of BIFs that can perform character to numeric conversions. We'll focus on %Dec as it’s the most commonly used of the group. We will just briefly note that %Dec can also be used to convert a date field to its numeric equivalent, but we will not be detailing that aspect here.

The basic syntax for string to numeric conversions is:
        %Dec( character expression : length : decimals )

length specifies the total number of digits in the converted number and decimals the number of decimal places. These are the same values you would use if you were defining the field in the D-specs or data declarations.

So how does %Dec help with the decimal point situation? By recognizing the decimal point in the source string and aligning the result accordingly. Similarly, %Dec will also recognize a leading or trailing sign ( + or - ) and set the value appropriately, as shown in the example below.

     d data            s              7A   Inz('-123.45')
     d number          s              7s 2

        // length and decimals are named constants matching number (7 and 2)
        number = %Dec(data: length: decimals);
        // number = -00123.45

As useful as %Dec is, there are still common features in numeric representations where it needs a little "help". For example, it is not uncommon for such values to include leading currency symbols, and embedded thousands separators. They may also include leading "check protection" characters such as asterisks. For example the string you are attempting to convert might look like this:

$12,345.67 or even this ****12345.67-

While %Dec does not handle these situations directly, it does incorporate a feature that allows them to be simply resolved: it ignores all space characters, wherever they appear in the string! In other words, if you attempted to convert the strings "1 2 3 4 5" and "12345" using %Dec, the results would be identical.
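A quick way to convince yourself of this is to convert both forms and compare the results. The snippet below is only a sketch in the same style as the other examples; the field names (spaced, compact, fromSpaced and fromCompact) are our own inventions for illustration.

```rpgle
     d spaced          s              9a   Inz('1 2 3 4 5')
     d compact         s              9a   Inz('12345')
     d fromSpaced      s              5s 0
     d fromCompact     s              5s 0

        // Embedded and trailing blanks are skipped during conversion
        fromSpaced  = %Dec(spaced: 5: 0);
        fromCompact = %Dec(compact: 5: 0);

        If fromSpaced = fromCompact;
          Dsply ( 'Both strings convert to ' + %Char(fromSpaced) );
        EndIf;
```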

This means that we can use a simple %Xlate BIF to replace all unwanted characters with spaces, like this:

        data = %Xlate( ',$*': '   ': data );
        // If data was "$*1.23 ", it is now "  1.23 "

        number = %Dec(data: length: decimals);
        // number = 00001.23

The first parameter to %Xlate is the list of characters to be translated. In the example, we chose to translate comma, dollar sign and asterisk. The second parameter identifies the matching replacement characters, which are spaces in our example. This will cause commas, dollar signs, and asterisks to be replaced by blanks.

It is important to have as many blanks in the replacement string as you have characters to replace. %Xlate does not extend the replacement string with blanks; it simply ignores any characters in the first parameter that do not have a matching replacement character, so those characters are left untouched in the data.

The rules followed by %Dec are:

  • A leading or trailing sign can be present in the character string and can be either '+' or '-'. If you expect symbols such as DB or CR then they should be translated to the appropriate sign before converting.
  • The decimal point is optional. Beware though - both period and comma are accepted as decimal points. That is why our example above removed the comma. Where the comma is the decimal separator, as in much of Europe, you would instead remove the period and leave the comma in place.
  • Blanks can be present anywhere in the field.
  • If there are more decimal digits in the source string than %Dec specifies, then excess digits are simply dropped. If there are too few then zeros fill out the length as you would expect.
  • If the converted value exceeds the specified length then an error is signalled with status code 103. Note that this has nothing to do with the length of the receiving field - it is entirely governed by the length you specify to %Dec.
  • If, in spite of your %Xlate edits, invalid numeric data is found, an error occurs with status code 105.
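To make the non-error rules concrete, here is a small sketch using literal strings. The field name result is our own, and the value shown in each comment is simply what the rules above predict.

```rpgle
     d result          s              7s 2

        // Trailing sign is recognized
        result = %Dec('123.45-': 7: 2);      // -123.45

        // Comma accepted as the decimal point, blanks ignored,
        // missing decimal digit filled with a zero
        result = %Dec(' + 12,5 ': 7: 2);     // 12.50

        // Excess decimal digits are dropped, not rounded
        result = %Dec('1.239': 7: 2);        // 1.23
```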

As you can see there are two potential errors that can occur and as good programmers we should always anticipate them. In our opinion absolutely the best way to do so is to use the MONITOR op-code to trap such errors and report them and/or apply substitution values where appropriate. So the complete sequence might look something like this:

        Dsply ( 'Original Input was ' + data );

        data = %Xlate( ',$*': '   ': data );

        Dsply ( 'Scrubbed input is ' + data );

        Monitor;
          number = %Dec( data: length: decimals );
          Dsply ( '%Dec value is ' + %Char( number ) );
        On-error;
          Dsply ( 'Not valid as ' + %Char(length) + ','
                + %Char(decimals)
                + ' - Status: ' + %Char( %Status ) );
        EndMon;

In this logic we have not differentiated between the two types of error; we simply report them. Note that any other error occurring within the MONITOR block would be trapped and reported in the same way. Your own logic requirements will govern how you choose to handle the situation. If you are not familiar with the use of the MONITOR op-code you can read more about it in the article "RPG MONITOR is a Flexible Facility".
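If you do need to treat the two errors differently, one possibility is to code a separate ON-ERROR group for each status code. The following is just a sketch of that idea, reusing the data, number, length and decimals definitions from our example; substituting zero for a bad value is purely an assumption made for illustration, and your application may well need something different.

```rpgle
        Monitor;
          number = %Dec( data: length: decimals );
        On-error 103;
          // Value has too many digits for length/decimals
          Dsply ( 'Too large: ' + data );
          number = 0;
        On-error 105;
          // Something other than digits, sign, decimal point or blank
          Dsply ( 'Not numeric: ' + data );
          number = 0;
        On-error;
          // Any other error raised inside the MONITOR block
          Dsply ( 'Unexpected status: ' + %Char( %Status ) );
        EndMon;
```

The ON-ERROR groups are checked in the order in which they are coded, so the catch-all group needs to come last.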

One more point we should make about %Dec. In our example the values length and decimals are constants defined to match the correct sizes for the target variable number. After all, there is not much point in doing the conversion if the result won't fit where it is needed!

These are the definitions we used:

     d length          c                   %len(number)
     d decimals        c                   %decpos(number)

By doing it this way, we automatically accommodate any changes to the definition of number with zero effort.

Other Conversion BIFs

The first one we should mention is %DEC's sibling %DECH. The only difference here is that, rather than simply truncating unwanted decimal places, %DECH uses them to round (half-adjust) the result where appropriate. The only other conversion BIFs that most people are likely to use are %INT and its cousin %UNS, for signed and unsigned integers respectively. Both of these also have half-adjust siblings, %INTH and %UNSH.

All of these BIFs follow the same basic rules as we outlined earlier for %DEC.
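The difference between the plain and half-adjust versions is easiest to see side by side. This is a minimal sketch only; all of the field names are hypothetical and the comments show the values we would expect from the rules described above.

```rpgle
     d source          s              5a   Inz('12.56')
     d truncated       s              5s 1
     d rounded         s              5s 1
     d wholeTrunc      s             10i 0
     d wholeRound      s             10i 0

        truncated  = %Dec(source: 5: 1);     // 12.5 - second decimal dropped
        rounded    = %DecH(source: 5: 1);    // 12.6 - second decimal rounds up
        wholeTrunc = %Int(source);           // 12   - decimals dropped
        wholeRound = %IntH(source);          // 13   - decimals round up
```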

There is one other member of the "family", namely %FLOAT. This is used when converting floating point values: not something that most RPGers do on a regular basis. In addition to handling the decimal point in the same way as the other members of the family do, %Float is unique in that it is the only one that can handle character strings in exponent notation e.g. "1.2E6". Again not something you are likely to encounter very often but when you do ...
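Should you ever run into exponent notation, a sketch along these lines shows the idea. The field names are hypothetical, and moving the result into a zoned field with %DecH is just one way you might choose to make the value easier to display.

```rpgle
     d sciText         s              5a   Inz('1.2E6')
     d floatVal        s              8f
     d asDecimal       s              9s 1

        // Only %Float understands the exponent in the string
        floatVal  = %Float(sciText);

        // Convert the float to decimal for display purposes
        asDecimal = %DecH(floatVal: 9: 1);   // 1200000.0
```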

Try it for yourself

As you can see, using the BIFs makes for more straightforward and more flexible logic. If you haven't worked with them much before, download our example code for both parts of this series on "Playing the Numbers" and try it for yourself. The sample programs can be found here: www.partner400.com/examples/NumCharConversion.zip

Jon Paris is a technical editor with IBM Systems Magazine and co-owner of Partner400.

Susan Gantner is a technical editor with IBM Systems Magazine and co-owner of Partner400.


