
’Tis The Season, For Tuning


Yes, it’s that time of year again: turkeys and hams, cranberries and yams; family and friends, fights and amends. But performance and tuning? A hallmark of the holidays? The answer is a resounding yes if you support an online system for a company that specializes in gifts and specialty items. That’s because orders—and thus transaction volumes—can quadruple, quintuple or even more in November and especially December, particularly if most orders are made by phone, mail or via a website. Customer service is especially important for phone orders, because people don’t like waiting on hold or enduring a long order process; they want to make their order and move on to the many other activities that characterize the dizzying rush of the holiday season.

I work with one such company, and we’ve been struggling with a performance problem with no good solution. Around 80 percent of yearly business is done during the holiday season, so the intractable performance issue we have right now is not an enviable dilemma. Frankly, it’s about as stressful as it can get, with upper management looking over our shoulders and users needing quick response. It’s further complicated by the fact that we deal with a mainframe hardware and software service provider who is sympathetic to our plight and eager to help, but who operates under a different set of priorities, which have their own arguable merit.

The Environment

Orders, customer account and order status queries, shipment, credit and related processing, whether they arrive through the company’s website or by telephone, all occur within a single production CICS system running under z/OS on a zSeries processor (there is a test and a training version of CICS as well). No database management product, such as Database Control (DBCTL) or DB2, is used; files are almost exclusively Virtual Storage Access Method (VSAM). At this time of the year, more than 1,000 customer agents are logged on concurrently via a Telnet network, and a new order usually involves between 10 and 20 transactions, depending on the complexity of the order. A variety of third-party programs are installed as well, including a performance monitor, debugging, dump reading, direct access storage device (DASD) management, tape management, scheduling and other systems packages.

The CICS applications are primarily COBOL command level with a sprinkling of assembler programs, and are standard legacy, 3270-based transactions. The vast majority of transactions are pseudo-conversational, and care is taken to avoid record lockouts and other programming techniques that can cause serialization. MQSeries runs within CICS and feeds data back to the company website. The system is generally well tuned, and while there are no doubt a few tweaks that could improve performance slightly, there are no glaring tuning deficiencies, as evidenced by the fact that our performance problem has nothing to do with CICS.

The Problem

The CICS problem isn’t happening all the time, and it’s not happening at peak periods, per se. Performance tanks at the top of every hour because of a variety of system functions, especially performance monitors, that record data to disk, issue automated commands and perform other housekeeping tasks. The performance monitors run at a higher priority (or service class) than CICS, as does job initiation. Consequently, CICS cannot get enough resource, response time goes up and transactions back up. Assuming a system stress and related short-on-storage condition doesn’t occur before the housekeeping jobs and other top-of-the-hour processes complete, CICS will catch back up and response times will return to acceptable levels. Until that point, hundreds of transactions (and users) experience response times ranging from tens of seconds to minutes.

This is an unacceptable situation, further exacerbated by the fact that the obvious hardware remedy carries a cost of its own. Moving to a multiengine configuration would mitigate degradation to some degree by allocating housekeeping work to one processor and CICS to the other. But a single CICS legacy function (command-level file control, basic mapping support, temporary storage, etc.) can, for the most part, only exploit a single engine. Thus, CICS would effectively be taking a 50 percent cut in available processor resource under a multiengine configuration. VSAM subtasking could offload a substantial amount of file processing, but the overall increase in overhead makes that an even more undesirable option. A potential solution to one performance problem just creates another, even worse one.

Additionally, at higher volumes, more performance data is collected and more housekeeping is performed. Observations show the performance monitor consumes 10 percent as much CPU as CICS does. That’s not much of a problem when CICS is running at 30 percent; the performance monitor takes 3 percent. But when CICS is taking 95 percent of the CPU and the performance monitor is taking 9.5 percent, the combined 104.5 percent doesn’t fit. Since the service provider insists on running the performance monitor above CICS, plus these ancillary housekeeping jobs, it’s CICS performance that suffers.
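The arithmetic behind that squeeze can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it treats the 10 percent ratio as fixed at every load level, models a single engine as 100 percent, and ignores the other housekeeping jobs entirely. The function and constant names are made up for this sketch, and the 65 percent figure is just the midpoint of the 60 to 70 percent range mentioned later.

```python
# Ratio reported in the article: the monitor uses 10 percent as much CPU as CICS.
MONITOR_TO_CICS_RATIO = 0.10
ENGINE_CAPACITY_PCT = 100.0  # one engine, expressed in percent


def combined_demand(cics_pct, ratio=MONITOR_TO_CICS_RATIO):
    """Total CPU demand (percent) of CICS plus a monitor that scales with it."""
    return cics_pct * (1.0 + ratio)


def break_even_cics_pct(ratio=MONITOR_TO_CICS_RATIO, capacity=ENGINE_CAPACITY_PCT):
    """CICS utilization at which CICS plus the monitor exactly fill the engine."""
    return capacity / (1.0 + ratio)


if __name__ == "__main__":
    # 30 and 95 come from the text; 65 is an assumed midpoint of "60 to 70 percent".
    for cics in (30.0, 65.0, 95.0):
        total = combined_demand(cics)
        verdict = "fits" if total <= ENGINE_CAPACITY_PCT else "does not fit"
        print(f"CICS {cics:.1f}% -> combined {total:.1f}% ({verdict})")
    print(f"Break-even CICS utilization: {break_even_cics_pct():.1f}%")
```

Under these assumptions the two workloads stop fitting once CICS passes roughly 91 percent, and because the monitor is dispatched ahead of CICS, the shortfall lands on CICS.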

The Solution

Fortunately, to this point, CICS isn’t hitting 95 percent, but it is hitting 60 to 70 percent and things will get worse. After a lot of wrangling and conference calls, everyone has had to give a bit. We had to move many top-of-the-hour jobs by a few minutes to spread out job-initiation overhead, and the vendor shut off one of their performance monitors. Transaction response-time monitoring was also turned off, because notification was performed by email; this greatly cut the top-of-the-hour Simple Mail Transfer Protocol (SMTP) overhead. We’re also taking a hard look at de-prioritizing some of the data-handling components of the performance monitor, because although the vendor insists it can’t be done, we’ve found other organizations that have done it. Slowly, we’re making progress, and this deep into the holiday season it appears we’ll make it.
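The idea of moving top-of-the-hour jobs a few minutes apart can be illustrated with a small Python sketch. It is not how the jobs were actually rescheduled (that was done in the site's scheduling package); the job names, the two-minute spacing and the date are all hypothetical, chosen only to show how a fixed offset spreads job initiation across the hour.

```python
from datetime import datetime, timedelta


def stagger_jobs(job_names, top_of_hour, spacing_minutes=2):
    """Return {job: start time}, spacing jobs spacing_minutes apart from top_of_hour."""
    # Refuse a spacing that would push the last job into the next hour.
    if spacing_minutes * (len(job_names) - 1) >= 60:
        raise ValueError("offsets would spill into the next hour")
    return {
        name: top_of_hour + timedelta(minutes=index * spacing_minutes)
        for index, name in enumerate(job_names)
    }


if __name__ == "__main__":
    # Hypothetical job names and date; the article does not list the real ones.
    jobs = ["HOUSEKP1", "HOUSEKP2", "HOUSEKP3", "HOUSEKP4"]
    schedule = stagger_jobs(jobs, datetime(2019, 12, 2, 14, 0))
    for name, start in schedule.items():
        print(f"{name} starts at {start:%H:%M}")
```

A fixed spacing is the simplest choice; in practice the offsets would more likely be picked job by job, based on how long each one holds the processor.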

Tuning Redefined

Sometimes tuning isn’t about changing parameters, analyzing data, measuring volumes, adjusting controls or re-configuring hardware. It’s about negotiating, making a case, giving a little, getting a little and settling for a less-than-ideal solution. Ultimately, what matters is getting results and making it through a tough time with as little damage as possible. More than anything, it’s the system users who really matter, because they’re the ones who make a business a success. IT is there to provide them a service that allows them to do their job better, and when we keep that in sight, we become true professionals.

 

Jim Schesvold can be reached at jschesvold@mainframehelp.com.

