
Barbara Morris Talks About RPG Enhancements and DATA-INTO


Paul Tuohy: Hi everybody and welcome to another iTalk with Tuohy. Delighted to have with me today the chief head honcho architect for RPG, Barbara Morris. Hello, Barbara.

Barbara Morris: Hi Paul.

Paul: So exciting times for RPG programmers at the moment, Barbara. We've just had the latest Technology Refresh out there―that came out. I think the biggest thing in the Technology Refresh is an enhancement to RPG. So without further ado, would you care to tell us about it?

Barbara: All right. There's―it's a new op code. It's called DATA-INTO and as always, new op codes in RPG have got a dash, so it's actually DATA dash INTO. If you're familiar with XML-INTO, it's very, very similar in a lot of ways. Instead of the %XML built-in function, it's %DATA, so you basically―it's exactly―it starts out exactly the same as XML-INTO. You code the name of the variable or %HANDLER if you want, and then you use %DATA to identify where your document is. I should just back up a bit, for those who are not familiar with XML-INTO, and say that DATA-INTO is used to import a document of some kind―say JSON or CSV or a properties file or anything that has hierarchical data―into an RPG data structure. That's what XML-INTO does. XML-INTO takes the data from an XML document and puts it into a data structure. So that's how XML-INTO works and that's how DATA-INTO works. The difference between them, the fundamental difference, is while XML-INTO only works with XML data, DATA-INTO can work with any data that exists now and on into the future. Most people are most interested in JSON, but it will work with whatever is the next big thing after JSON, and it will work with things that are current now. So the other fundamental difference between XML-INTO and DATA-INTO is that you have to provide the parser, so like I said, it supports any kind of data. The reason it supports any kind of data is because it doesn't know anything about the data you are working with, so there is a %PARSER built-in function where you identify the name of the parser. It can be the name of a program or the name of a procedure or even %PADDR if it is a procedure that you are bound to. Then you can actually pass a parameter to the parser if you like. It's kind of cool the way you can pass a parameter.
You can either pass a character string, in which case the parser will get a null terminated string, or you can pass a data structure so that you're sort of free to do that whichever way you want. Of course, the parser would have to actually expect that. You have to sort of know what the parser likes. So then once you've got a parser that parses say JSON or CSV or whatever, you just code your DATA-INTO operation, you code the document that you want to parse, you tell DATA-INTO what parser to use, and then DATA-INTO goes off and it basically calls your parser, passing it the document, and then the parser does what is called callbacks, where the parser calls back into the RPG runtime to say "oh, I'm starting now. Oh, here's the name. Oh, here's the structure starting. Oh, here's a value." So basically the parser, as it finds each little piece of information, it calls back into the RPG runtime, and the RPG runtime uses that information to basically just fill up your data structure. When the parser is over, it says "I'm done now," and DATA-INTO runtime returns to your program and there's the document there, the JSON document or whatever, all nicely in your data structure. So that's basically how it works and what's super fun about this is the parsers aren't actually that―I mean, parsing may be hard in general―but the interaction with RPG is actually quite easy and it's fun. I've got to say I've had to write a few sample parsers, and it's fun. So something to look forward to for everyone.
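The callback flow Barbara describes can be modeled in a few lines. What follows is a minimal conceptual sketch in Python rather than RPG. Every name in it (`Runtime`, `start_struct`, `report_name`, `report_value`, `properties_parser`, `data_into`) is hypothetical and chosen for illustration; none of it is the actual RPG runtime interface, which is described in the IBM documentation.

```python
# Conceptual model only: all names here are hypothetical,
# not the real RPG runtime interface.

class Runtime:
    """Stands in for the RPG runtime: it fills a structure from events."""

    def __init__(self):
        self.result = None
        self.stack = []       # structures currently being filled
        self.name = None      # most recently reported name
        self.finished = False

    def start(self):
        self.result, self.stack, self.finished = None, [], False

    def start_struct(self):
        new = {}
        if self.stack:
            self.stack[-1][self.name] = new   # nested under the last name
        else:
            self.result = new                 # outermost structure
        self.stack.append(new)

    def report_name(self, name):
        self.name = name

    def report_value(self, value):
        self.stack[-1][self.name] = value

    def end_struct(self):
        self.stack.pop()

    def finish(self):
        self.finished = True


def properties_parser(document, runtime):
    """Toy parser for 'name=value' lines; it reports each piece it finds."""
    runtime.start()
    runtime.start_struct()
    for line in document.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                          # skip blanks and comments
        name, _, value = line.partition("=")
        runtime.report_name(name.strip())
        runtime.report_value(value.strip())
    runtime.end_struct()
    runtime.finish()


def data_into(document, parser):
    """Stands in for the operation: call the parser, return the filled structure."""
    runtime = Runtime()
    parser(document, runtime)
    return runtime.result
```

Calling `data_into("host=example\nport=8080", properties_parser)` hands the text to the parser, and the parser's callbacks fill in a structure with the two names. The parser only reports what it finds and never builds the result itself.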

Paul: Cool. So―and actually I'm glad you just mentioned that at the end there. So you said you had to write a few sample parsers? So I assume there will be samples of, well you know, here's a JSON parser, here's a whatever parser.

Barbara: Yeah. There's actually―we had a student who actually wrote a JSON parser. He was brand new to RPG, brand new to the IBM i, and in just a few months he has managed to learn enough to write a pretty nice JSON parser. So we'll be shipping that as just sample source. There won't be an official JSON parser. You'll have to compile it and you know, we hope you'll copy it and just put it in your own code and not sort of use it where we have it, but there will be a sample for that and maybe a couple of other samples, but the other samples will be more for education to show "oh, here's how you report an error. Here's how you actually call the callback functions," sort of more focused examples.

Paul: Yeah so it's―I will say I had a slight glimpse of the DATA-INTO and you're right. It is very exciting and it does look like―does look like great fun, just from a programming point of view and to me, anything with callbacks is great fun. [Laughs]

Barbara: Once you get your mind wrapped around the whole concept of callbacks―

Paul: Yeah. Well and I think―so and again correct me if I'm wrong here, Barb. If it's something like, for example, that I've written a JSON parser, well that's it. The parser can be applied anywhere. It's not specific to a certain―you know, like a certain format of JSON or that. I don't have to do like for example a parser per document.

Barbara: Oh, no, no. I mean you could, but that would be kind of silly. Your parser will just say―just like it will find the name and then report the name, then it will find the curly brace. Let's say you're talking about JSON. You'll find the curly brace and say "well that indicates a structure so okay, here's a structure." Then it will find another name and then say "oh, here's a name." Then it will find the colon and then the value and say "oh, here's the value." So you don't have to know anything about those names. Presumably, the RPG programmer wrote this data structure to match the document.
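To make the point about a document-independent parser concrete, here is a rough Python sketch of a parser for a small JSON subset (nested objects, unescaped strings, integers). It is an illustration under stated assumptions, not the sample parser IBM ships: the `report` callback and the event names are hypothetical, and there is no error handling for malformed input.

```python
def parse_json_subset(text, report):
    """Toy parser for a JSON subset: nested objects, unescaped strings, integers.

    `report` is a hypothetical callback taking (event, payload).
    """
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def read_string():
        nonlocal pos
        end = text.index('"', pos + 1)    # no escape handling in this sketch
        value = text[pos + 1:end]
        pos = end + 1
        return value

    def read_value():
        nonlocal pos
        skip_ws()
        if text[pos] == "{":
            read_object()
        elif text[pos] == '"':
            report("value", read_string())
        else:
            start = pos
            while pos < len(text) and (text[pos].isdigit() or text[pos] == "-"):
                pos += 1
            report("value", text[start:pos])

    def read_object():
        nonlocal pos
        report("start_struct", None)       # a curly brace indicates a structure
        pos += 1
        skip_ws()
        while text[pos] != "}":
            report("name", read_string())  # the name is reported, never interpreted
            skip_ws()
            pos += 1                       # step over the colon
            read_value()
            skip_ws()
            if text[pos] == ",":
                pos += 1
                skip_ws()
        pos += 1
        report("end_struct", None)

    read_value()


def events_for(text):
    """Collect the events the parser reports for one document."""
    events = []
    parse_json_subset(text, lambda event, payload: events.append((event, payload)))
    return events
```

The same function handles `{"id": 42, "ship": {"city": "Oslo"}}` and `{"sku": "A1"}` without change, because it never looks at what the names mean. Matching names to fields is left to whoever defined the target structure.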

Paul: Cool.

Barbara: So that's the goal of the RPG programmer, just like XML-INTO. The XML parser we have doesn't know anything about the documents. It just sort of blindly parses and the RPG programmer is responsible for actually getting the data structure right. That's the same XML―with DATA-INTO.

Paul: Yeah. So and the other thing just to touch on too because when you were talking about XML, the XML-INTO a moment ago, Barbara, you mentioned taking XML from a file, but of course it doesn't have to be in a file, it can actually be in a variable. So this is something that can be used, for example, when you're doing web services―

Barbara: True.

Paul: If you just have JSON being passed directly into you from a web request or that.

Barbara: That's correct, yup and the parser actually just gets that data.

Paul: Yeah.

Barbara: DATA-INTO―actually if you say it is in a file, you know the options for DATA-INTO are identical to the XML options―

Paul: Right.

Barbara: So you would say doc=file. The DATA-INTO will actually read the file, put it into a buffer and then pass it to the parser―

Paul: Cool.

Barbara: So basically, the parser always gets the document in a string, but you can do it either way. The RPG programmer can either have it in a string or in a file.
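The file-or-string behavior can be pictured with a short Python sketch. `load_document` is a hypothetical helper written for this illustration, and the UTF-8 file encoding is an assumption. The only detail taken from the conversation is that with `doc=file` the file is read into a buffer first, so the parser receives a string either way.

```python
def load_document(source, options=""):
    """Return the document text the parser will receive.

    Hypothetical helper: with 'doc=file' among the options, `source` is a
    path and the file is read into a buffer first; otherwise `source`
    already holds the document.
    """
    if "doc=file" in options.lower().split():
        # Assumption for this sketch: the file is UTF-8 text.
        with open(source, "r", encoding="utf-8") as handle:
            return handle.read()
    return source
```

A parser written against the string form therefore needs no file-handling code of its own.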

Paul: Cool. It―yeah. It is very cool. So Barbara, because bearing in mind that there are going to be examples of how to do this and all of that, where is the best place going to be for people to go out and get information on DATA-INTO, like the [IBM] Knowledge Center, developerWorks, mixture of the two?

Barbara: The Knowledge Center. Hopefully everything will be in the Knowledge Center that you need to know, and unlike normal for our new RPG op codes―normally you just look in the reference and you find the information about the op code. There will be information on DATA-INTO, but there will also be something in the Open Access documentation, where currently you just have information on basically how to write an Open Access file handler. There will be a section on how to write a DATA-INTO parser.

Paul: Okay.

Barbara: So the information on writing a parser will be separate, so the DATA-INTO documentation in the reference will be for RPG programmers to use DATA-INTO. The documentation in the Open Access section of the [IBM Information Center], the Knowledge Center, will be how to write a parser.

Paul: Cool.

Barbara: It will talk about―describe all the callbacks and stuff.

Paul: Cool. So yeah. I'm sorry. I wouldn't have thought of Open Access but of course, that's exactly what this is. [Laughs]

Barbara: Yeah. We decided―it's sort of―it's sort of a combination of RPG and it's very similar to Open Access in a way, right?

Paul: Yeah.

Barbara: There's a thing you write that RPG interfaces with.

Paul: Yeah. So question I have to ask you, Barbara, and it's―I think this probably has more to do with the whole design of this―but why didn't you just do a JSON-INTO?

Barbara: You know back in, I guess it was V5R4 when we did XML-INTO, I had actually resisted that for at least two years, saying "you know XML is the bee's knees now, but there is going to be something else along someday and we're just going to really regret having this." But I finally had to bow to pressure. We did XML-INTO, and so now the pressure has come up again to support JSON. That's where we're getting lots and lots of pressure to support JSON. But this time especially―I think possibly because we had done Open Access―so the idea of that was in our minds, not to just keep on putting JSON-INTO and then later on to do a PQR-INTO. But that basically puts us―if we did it that way, we would always be behind. You know now if the PQR markup language comes along or somebody has their own proprietary markup language, they don't have to wait for RPG to get around to providing something for it, it's done. So once that DATA-INTO is out, basically we don't have to worry about any new thing that comes up in the future.

Paul: Yeah. So would this kind of imply then that maybe XML-INTO is going to become a thing of the past, that you know there will be an XML―a sort of parser that somebody will write and they'll just use DATA-INTO for their XML as well?

Barbara: They might, I guess. I would say that XML-INTO is very mature now. There are a lot of sort of wrinkles. XML is quite complex, so I would say it's probably best to just stick to XML-INTO. I mean it's poss―I'm actually trying to write a little converter from like an XML parser that would use XML-SAX―

Paul: Yeah.

Barbara: So I could take advantage of the test cases we have for XML-INTO to test lots of aspects of DATA-INTO and not have to―and it's pretty hard to get that working. I've had to sort of abandon some test cases because there's things that we may eventually have to make DATA-INTO handle every weird thing that's in XML, but right now XML-INTO handles any weird thing that's in XML. [Laughs]

Paul: That's an achievement―truly, truly, truly is.

Barbara: And I say any weird thing. We've―I don't know when the last time was that we made an enhancement for XML-INTO.

Paul: Yeah.

Barbara: Hopefully we handle every weird thing that can be in XML.

Paul: Yeah. I talked to you when you were doing the conversion. I know I've been telling you this before, but one of the funniest things―because my friend and colleague Jon Paris I know was working on the beta of DATA-INTO and every now and again, I'd hear him grumbling about not being able to get the examples to work. You were just telling me beforehand why that was.

Barbara: Yeah. There was an issue. I was really glad that he discovered that, because you know I wasn't sure. Would I have a test case for that? We would have gone out with a really bad design decision, but it was basically CCSID-related. You know I can say that DATA-INTO relies heavily on XML-INTO functioning, so even though it's a separate op code, we actually use a lot of the back―the back end part of XML-INTO.

Paul: Yeah.

Barbara: So XML-INTO has this CCSID parameter called CCSID=best, and CCSID=best means XML-INTO will decide what's the best CCSID to use. Then it basically calls the parser with that and it knows what it is. Well, Jon's file had CCSID 1252, which was not the same as the CCSID of his job. So in our wisdom we had―we were still using CCSID=best for DATA-INTO, and so we called the parser with UCS2 data. The parser wasn't written to handle UCS2 data―and so then we also realized, well, obviously we can't support CCSID=best because the parser should be able to rely on―you know, if they only support the job CCSID. Basically a parser can either support job CCSID or UCS2, and if the parser wants to only support job CCSID, then they can tell their customers to code CCSID=job. Otherwise, they can tell their customers to just leave it alone and it will be default, which is UCS2 so―

Paul: Cool.

Barbara: One of the most common problems we have with XML-INTO―we get reported to us is that people say it's not getting my data right. That's because they haven't coded CCSID=UCS2 and they have some characters that can't translate to the job CCSID. So we had to scrap CCSID=best which was what was causing Jon's problem.
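The character-loss problem Barbara mentions can be reproduced with Python's standard codecs. This is an analogy under stated assumptions, not a model of the RPG runtime: `cp1252` stands in for a file with CCSID 1252, `cp037` stands in for an EBCDIC job CCSID of 37, and `utf-16-be` stands in for UCS2. The sample text and the `round_trip` helper are made up for the illustration.

```python
def round_trip(text, encoding):
    """Convert text to the given encoding and back, substituting
    a replacement for any character the encoding cannot represent."""
    return text.encode(encoding, errors="replace").decode(encoding)


# Bytes as they might sit in a file tagged as Windows-1252;
# 0x80 is the euro sign in that code page.
raw = b"price: 5\x80"
text = raw.decode("cp1252")

via_job = round_trip(text, "cp037")       # assumed job CCSID without a euro sign
via_ucs2 = round_trip(text, "utf-16-be")  # stand-in for UCS2

lost_in_job_ccsid = via_job != text       # the euro sign did not survive
kept_in_ucs2 = via_ucs2 == text           # every character survived
```

In this sketch the euro sign is lost on the way through the single-byte EBCDIC code page and preserved through the 16-bit encoding. That is the trade-off a parser author makes when choosing which form of the data to support.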

Paul: Cool.

Barbara: So as soon as we changed―added CCSID=job to his DATA-INTO, it just worked jim-dandy.

Paul: Yeah, that explains why all the grumbling stopped. [Laughs]

Barbara: Yeah and CCSID―you know I was just thinking, I have this thing whenever I switch offices I always redraw this. The very first thing I do in my new office is I draw a big red circle and the word CCSID inside it, and then a big red line through it. If they ever invent time travel, the first thing I'm going to do is go back and slap people for getting that wrong. [Laughs] I don't know, maybe 50 years ago when everyone decided that they were going to have their own code point for some characters.

Paul: So remind me, Barbara, next time I see you, to introduce you to my friend Dr. Who, and he may be able to help you with that.

Barbara: Oh, yes, of course.

Paul: So Barbara, now that sort of you're starting to get the DATA-INTO behind you, is there new exciting stuff that you're working on?

Barbara: Well there's always new exciting stuff and unfortunately, I can't even whisper a word about it to you. One thing you can do is you can go―and I can't even guarantee that this is true―but you can go and look at the RFEs for RPG and have a peek at some of the ones that we've accepted, and it's possible that some of those are going to be the ones we're working on. We sometimes do things that aren't specifically RFEs but―and I'm not trying to hint at anything.

Paul: Sure.

Barbara: I'm just saying that it's always worth―if you're interested in what might be coming down the pike, just you know, see what's there―

Paul: Yeah.

Barbara: And sometimes it's not even the number of votes. We had a rather new programmer on our team, and so he got to do an RFE that only had like three votes, but we did it anyway because it was easy.

Paul: Yeah.

Barbara: So anyway, yeah. I can't tell you what there is, but there's always something and even though my head is full of DATA-INTO right now, as soon as I unload that, I'll be mega excited about whatever the next thing is. [Laughs]

Paul: So Barbara, I think just to finish up with, I mean I know that you for the last few months have just been buried with your nose to the grindstone, so any chance you're going to get out and about in the near future and get back to meeting real people again?

Barbara: Yeah. One of the―what I'm looking forward to is RPG and Db2 Summit which is also coming up in about a month, right around the time, I believe, that this is sort of scheduled to ship.

Paul: Yup.

Barbara: I love seeing people there. I love talking to the students. I love talking to my, sort of, fellow presenters. Then COMMON coming up in May, which I'll be attending and again, I love COMMON. It's―I don't love it as much as Db2 and RPG Summit just because there are so many people there, so thousands of people there, but still I love talking to people at those.

Paul: Well I got to tell you: if you're trying to get on my good side, Barbara, you're succeeding. [Laughs]

Barbara: Well, you know, I would tell anyone that. Maybe even to like COMMON that you know, I get overwhelmed with the number of people.

Paul: Oh yeah. No, no. I know. I mean it's always―I mean they're two very different conferences so they are but it's―and two very different experiences, but the great thing is, as you say, is getting to meet everybody at them.

Barbara: Yup. Talk about RPG multiple.

Paul: So Barbara, I have taken up enough of your time. So listen, thanks a million for taking the time to talk to us. I'm going to look forward to seeing you in just a few weeks' time.

Barbara: Great. I look forward to it, too.

Paul: Okay. Okay everybody. That's it for this iTalk. Tune in again for the next one. Bye for now.

Paul Tuohy has specialized in application development and training on IBM midrange systems for more than 20 years.


