
An RPGer's First Steps with JSON: Consuming JSON data with YAJL

How to use the YAJL library to consume a JSON document and extract the data for use by an RPG program. Part 2 of 2.

In our previous article we focused on the use of the YAJL library (and specifically Scott Klement's RPG-oriented implementation of it) to create a JSON document. This time we are going to focus on the use of YAJL to consume a JSON document and extract the data for use by an RPG program. The JSON we are going to be processing is a slightly simplified version of the Customer details document that we created in the previous article. The file looks like this:

(1)  { "Customers": [
(2)       { 
             "ID": 12345,
             "Name": "Paris",
             "Street": "Main Street",
             "City": "Jasontown",
             "State": "CA",
             "Zip": "12345"
          },
(3)      { 
             "ID": 23456,
             "Name": "Rich",
           < ... data for multiple customer entries omitted ... >
 
            "Zip": "98765"
         }
(4) ]
}

The document starts (1) with an object whose only member is an array named Customers. This is followed at (2) by the data for the first customer, which begins with an opening "{". The individual elements (ID, Name, etc.) follow, separated by commas, and the final item for the first customer is followed by a closing "}". The second customer begins in the same way at (3), and the array is closed by the "]" at (4).

This document highlights an additional difference between XML and JSON that we didn't mention in the first part. In XML we would be forced to name the repeating element (i.e., the customer data) in order to group the related fields. JSON does not require names for everything; simply grouping items into an object, as we have done here, is sufficient.

That’s what the document looks like, but how do we process it?

Processing JSON with YAJL

YAJL provides two methods for processing JSON. The first is an event-driven parser that processes individual pieces of data as it encounters them. In this respect it’s somewhat similar to RPG's XML-SAX op-code.

The second method, and the one we use here, is the tree parser. This works on the document as a whole and allows you to basically reference data elements by name. You’ll see that when we look at the code.

The basic methodology is to first load the "tree" with the entire JSON document, and then to obtain a node locator (an identifier or "key" if you want to think of it that way) for each level of the document you want to process. You then use that locator to drill down to the next level. There really is no RPG equivalent, but the effect is similar to doing a serial READ of a file and then using data from that record to CHAIN out to other data.

We'll begin by asking YAJL to supply us with the locator for the Customers element, and then use that to locate the nodes for the individual customer array elements. These will then be used to access the individual fields that make up the customer data. It sounds a lot more complicated than it actually is, as you'll see when we look at the code.

As usual, we begin by copying in the prototypes for the YAJL routines (A). Next come the definitions for our node locators (B). In order to make the code a little simpler to understand, we have defined a separate node locator for each of the elements we will be processing. Each is defined as being like a yajl_val. These are actually pointers, but Scott wisely "masked" this to protect the pointer-phobic. At (C) we define the data structure array that will hold the extracted data.

(A)   /copy yajl/qrpglesrc,yajl_h

      // Node locators
(B)    dcl-s  root           Like(yajl_val);
       dcl-s  addressNode    Like(yajl_val);
       dcl-s  customersNode  Like(yajl_val);
       dcl-s  customerNode   Like(yajl_val);
       ...

(C)    dcl-ds  customer  Dim(99)  Inz  Qualified;
         id       Zoned(5);
         name     Char(40);
         street   Char(40);
         city     Char(30);
         state    Char(2);
         zip      Char(5);
       end-ds;

Time to start the processing. Since this particular JSON document is stored in the IFS, we are using the API yajl_stmf_load_tree (D) to load the document identified by the first parameter into YAJL's document tree. If the load is successful, the API returns the node locator for the root element. If an error is encountered, a message is placed in the second parameter (errMsg). Possible errors include the file being missing or otherwise inaccessible, and syntax errors encountered when parsing the JSON data.

At (E) we test the error message to see if any problems were encountered and take appropriate action should that be the case. Next (F) we use YAJL_OBJECT_FIND to locate the Customers element. Note that we supplied a node locator as the first parameter so the API knows where to conduct the search, i.e., the "branch" of the tree in which it should be looking. Needless to say, at the start of the process this is always going to be the root node returned to us by yajl_stmf_load_tree. You might be wondering what would happen if we failed to locate the Customers element. Good question, but we'll defer the answer until later, when we look at the handling of optional elements.

(G) demonstrates a useful YAJL function: YAJL_ARRAY_SIZE. This returns the number of elements found in the array identified by the node locator passed to it. In this case we are passing the locator to the Customers array. The code at (H) is included just to demonstrate a way to defend against any problems caused by RPG's fixed array size. In this program we allowed for 99 customers in the array, so this code stops us from attempting to process more than that number.

(D)    root
          = yajl_stmf_load_tree ( '/Partner400/Customers.json'
                                : errMsg );

       // If an error was found then report and exit
(E)    if errMsg <> '';
         Dsply 'Ooppppssss - file load problem!';
         // Add appropriate error handling ...
         Return;
       EndIf;

(F)    customersNode = YAJL_OBJECT_FIND( root: 'Customers' );

(G)    elements = YAJL_ARRAY_SIZE( customersNode );

(H)    If elements > %Elem(customer);  // Too many to handle?
          Dsply ('Can only process ' + %Char( %Elem(customer) ) +
                 ' customers - File contains ' + %Char(elements) );
          *InLr = *On;
          Return;
       endif;

Now that we have a locator that points to the Customers array (customersNode), we can use it to process each of the individual customer elements in turn. We do this by using YAJL_ARRAY_LOOP (I), which returns an indicator that will be *On if an element is found and *Off when there are no more elements to process. That allows us to use a DOW loop to easily iterate through all of the elements in the array. The first parameter (customersNode) is the node we wish to traverse, the second (c) is a counter that identifies the array element number to start from (more on this in a moment), and the third (customerNode) will be used to return the node locator for each array element (i.e., customer).

Two points about the counter (c): The first is that it’s automatically incremented each time the function is called. We can therefore use it as an array index when loading the extracted data into the array DS as you will see in a moment.

The second is that the function uses the counter to determine which element to retrieve. Since we want to start from the first element, we had to clear the counter before entering the loop (the c = 0 statement, which carries a second (H) label in the code below).

Using the retrieved customerNode locator we can now extract the component fields. We begin (J) by using YAJL_OBJECT_FIND to locate the ID field and then extract its value (K) using YAJL_GET_NUMBER. The result is loaded directly into the corresponding element of the customer DS array using the counter as the index. This process is then repeated for each of the customer fields, starting at (L) with the name. Note that for character fields we used YAJL_GET_STRING to perform the corresponding data extraction.

That's all there is to it really.

(H)    c = 0;
      
(I)    Dow YAJL_ARRAY_LOOP( customersNode: c: customerNode );

(J)       idNode = YAJL_OBJECT_FIND( customerNode: 'ID' );
(K)       customer(c).id = YAJL_GET_NUMBER( idNode );

(L)       nameNode = YAJL_OBJECT_FIND( customerNode: 'Name' );
          customer(c).name = YAJL_GET_STRING( nameNode );

          // ... Similar operations performed for all fields ...

       EndDo;
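
Pulling fragments (A) through (L) together, a complete program might look something like the sketch below. Several details are assumptions rather than part of the original listings: the ctl-opt line (it presumes a binding directory named YAJL), the definitions of errMsg, elements and c (the sizes and types shown are guesses that should be checked against your copy of yajl_h), the simple *null test on customersNode, and the closing call to yajl_tree_free.

```rpgle
       // Assumption: the YAJL service program is reachable through a
       // binding directory named YAJL - adjust for your installation
       ctl-opt dftactgrp(*no) bnddir('YAJL');

      /copy yajl/qrpglesrc,yajl_h

       // Node locators
       dcl-s  root           Like(yajl_val);
       dcl-s  customersNode  Like(yajl_val);
       dcl-s  customerNode   Like(yajl_val);
       dcl-s  idNode         Like(yajl_val);
       dcl-s  nameNode       Like(yajl_val);

       // Work fields - types and sizes are assumptions
       dcl-s  errMsg         Varchar(500) Inz('');
       dcl-s  elements       Int(10);
       dcl-s  c              Int(10);

       dcl-ds  customer  Dim(99)  Inz  Qualified;
         id       Zoned(5);
         name     Char(40);
         street   Char(40);
         city     Char(30);
         state    Char(2);
         zip      Char(5);
       end-ds;

       // Load the whole document into the tree
       root = yajl_stmf_load_tree( '/Partner400/Customers.json'
                                 : errMsg );
       If errMsg <> '';
          Dsply 'File load problem';
          *InLr = *On;
          Return;
       EndIf;

       // Locate the Customers array and make sure it will fit
       customersNode = YAJL_OBJECT_FIND( root: 'Customers' );
       If customersNode <> *null;
          elements = YAJL_ARRAY_SIZE( customersNode );
       EndIf;

       If customersNode = *null or elements > %Elem(customer);
          Dsply 'Customers array missing or too large';
          yajl_tree_free( root );
          *InLr = *On;
          Return;
       EndIf;

       // Walk the array - c is incremented by YAJL_ARRAY_LOOP
       c = 0;
       Dow YAJL_ARRAY_LOOP( customersNode: c: customerNode );
          idNode = YAJL_OBJECT_FIND( customerNode: 'ID' );
          customer(c).id = YAJL_GET_NUMBER( idNode );

          nameNode = YAJL_OBJECT_FIND( customerNode: 'Name' );
          customer(c).name = YAJL_GET_STRING( nameNode );

          // ... Similar operations performed for all fields ...
       EndDo;

       // Release the memory held by the document tree
       yajl_tree_free( root );
       *InLr = *On;
       Return;
```

Freeing the tree once the values have been copied into the customer DS array releases the memory YAJL allocated for the document. Because the node locators are pointers into that tree, none of them should be used after the call.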

An Alternative Approach

Requesting the value for each field in turn is fine if they are all present. But what if one were missing, either by accident or design (e.g., an optional field)? How could we tell?

There are two basic approaches we could take. The first is simply to test the node returned by YAJL_OBJECT_FIND to see if it is null. Remember that underneath it all the node values are pointers so the actual test could look like this:

          idNode = YAJL_OBJECT_FIND( customerNode: 'ID' );
          If idNode <> *null;
             customer(c).id = YAJL_GET_NUMBER( idNode );
          Else;
             // Take appropriate error action
          EndIf;

If ID were a compulsory field, which given its name is likely, then this is probably an appropriate way to handle it. But what if there are optional fields involved in the document? Is there an easier way than coding a test for each and every field?

The answer is yes. By using the YAJL_OBJECT_LOOP function we can simply loop through all of the objects within a given node, to obtain the field name for each. We can then use that name in a SELECT operation and extract the values as appropriate.

In the example below we have replaced the code (J, K, etc.) that extracted the fields within the YAJL_ARRAY_LOOP (I) with the following logic:

(M)       i = 0;

(N)       Dow YAJL_OBJECT_LOOP( customerNode: i: key: node );
             Select;
(O)          When key = 'ID';
                customer(c).id = YAJL_GET_NUMBER( node );
             When key = 'Name';
                customer(c).name = YAJL_GET_STRING( node );
             When key = 'Street';
                customer(c).street = YAJL_GET_STRING( node );
             When key = 'City';
                customer(c).city = YAJL_GET_STRING( node );
             When key = 'State';
                customer(c).state = YAJL_GET_STRING( node );
             When key = 'Zip';
                customer(c).zip = YAJL_GET_STRING( node );
             EndSl;
          EndDo;

We begin by setting the item counter (i) to zero. It will subsequently be incremented by the YAJL_OBJECT_LOOP function at (N). This function requires four parameters. The first is the node locator from which we are to extract the objects (customerNode in our case). The second is the aforementioned item counter. The third is a character field into which the function will place the key value (i.e., the object's name), and the fourth receives the actual node locator. Like YAJL_ARRAY_LOOP, this function returns an indicator that is used to control the DOW loop.

We can then code a SELECT and test for the individual field names in the WHEN clauses that begin at (O). In practice, when using this approach, we would almost certainly need to add logic to ensure that compulsory items were in fact present. We would also need to ensure that appropriate default values were set for the optional fields, for example by making sure that the optional character fields in the array DS are blank.
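
One way to add those checks is sketched below. It is a fragment rather than a complete program: it is meant to sit inside the YAJL_ARRAY_LOOP in place of (M) through (O), and it relies on the variables already used there (customerNode, i, key, node, c and customer). The two indicators gotId and gotName are hypothetical additions, and treating Name as compulsory alongside ID is purely an assumption for the sake of the example.

```rpgle
       // Hypothetical flags - define these with the other
       // standalone fields at the top of the program
       dcl-s  gotId    Ind;
       dcl-s  gotName  Ind;

          // Inside the YAJL_ARRAY_LOOP, in place of (M) - (O)

          // Reset this element so that any optional field missing
          // from the document is left blank (or zero)
          Clear customer(c);
          gotId   = *Off;
          gotName = *Off;

          i = 0;
          Dow YAJL_OBJECT_LOOP( customerNode: i: key: node );
             Select;
             When key = 'ID';
                customer(c).id = YAJL_GET_NUMBER( node );
                gotId = *On;
             When key = 'Name';
                customer(c).name = YAJL_GET_STRING( node );
                gotName = *On;
             When key = 'Street';
                customer(c).street = YAJL_GET_STRING( node );
             When key = 'City';
                customer(c).city = YAJL_GET_STRING( node );
             When key = 'State';
                customer(c).state = YAJL_GET_STRING( node );
             When key = 'Zip';
                customer(c).zip = YAJL_GET_STRING( node );
             EndSl;
          EndDo;

          // Every field has now been visited - were the
          // compulsory ones among them?
          If not gotId or not gotName;
             // Take appropriate error action
          EndIf;
```

Clearing the element before the inner loop means the defaults are in place no matter which fields turn up, and the flags are only tested once the loop has seen everything the customer object contains.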

Wrapping Up

There are a lot more features and functions available within YAJL than we have had time to cover in this brief two-part series, but we hope we have whetted your appetite and given you a good starting point for further explorations.

We hope to return to YAJL in future articles. In the meantime, if there are any specific problems or issues that you would like us to address, please let us know via the comments section.
