
Disk I/O and the Network

Increase performance with more tips for AIX 5.3, 6.1 and 7


Editor’s Note: This is the concluding article in a two-part series on AIX tuning. Part one covered paging, memory and I/O delays; this article focuses on disk I/O and the network.

Many technology levels (TLs) have been released leading up to AIX 7, and some earlier recommendations may have changed. In this article, I’ll share additional AIX tuning information, looking at disk I/O and network tunables in AIX 5.3, 6.1 and 7.

One key reminder: A fresh AIX 6 or 7 install will automatically install the new defaults for memory. If the system is migrated from AIX 5.3, then any tunables set in AIX 5.3 will be migrated across. Prior to performing a migration, it’s suggested you make a note of all of the tunables that have been changed (take a copy of /etc/tunables/nextboot) and then reset the tunables to the defaults. After migration, check nextboot and make sure there’s nothing in it. Now, go ahead and set the tunables that need to be changed for AIX 6 or 7.
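The pre-migration steps above might look like the following sketch. The backup file name is illustrative, and the reset commands shown cover the common tunable sets; adapt them to whatever you’ve actually changed in your environment:

```shell
# Before migrating from AIX 5.3: keep a copy of the changed tunables
cp /etc/tunables/nextboot /etc/tunables/nextboot.aix53.bak

# Reset tunables to the defaults for the next boot (-r defers to
# reboot, -D restores defaults); repeat for each command you've tuned
vmo -r -D
ioo -r -D
no -r -D

# After the migration, confirm nextboot carries nothing forward
cat /etc/tunables/nextboot
```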

Disk I/O

Many of the most common performance problems are related to I/O issues. In particular, data layout can affect performance more than any I/O tunable the administrator can set. Since changing these later is extremely painful, it’s important to plan in advance to avoid these problems.

The trend in the industry right now is to provide fewer, larger hdisks to the server. For example, the server may be given one 500 GB hdisk that’s spread across several disks in the disk subsystem, rather than ten 50 GB or five 100 GB hdisks. However, I/O performance depends on bandwidth, not size. While that data may be spread across multiple disks in the back end, this doesn’t help with queuing in the front end. At the server, the hdisk driver has an in-process and a wait queue. Once an I/O is built in the JFS2 buffer, it gets queued to the LUN (hdisk). queue_depth for an hdisk (LUN) represents the number of in-flight I/Os that can be outstanding for an hdisk at any given time.

The in-process queue for the hdisk can contain up to queue_depth I/Os, and the hdisk driver submits those I/Os to the adapter driver. Why is this important? If your data is striped by LVM across five hdisks, then five times as many I/Os can be in process at the same time. With one big hdisk, you’ll be queuing. Multipath I/O drivers such as the subsystem device driver (SDD) won’t submit more than queue_depth I/Os to an hdisk, which can affect performance. You either need to increase queue_depth or disable that limit; in SDD, use the "datapath qdepth disable" command.

Some vendors do a nice job of setting the queue_depth, but if you’re using large logical unit numbers (LUNs) built from multiple disks in the back end, then you’ll want to grow this. You can use the iostat -D or sar -d commands to figure this out. Interactive nmon also has a -D option, which lets you monitor sqfull as well. If you’re using sddpcm, then you can use "pcmpath query devstats" to monitor sqfull and "pcmpath query adaptstats" to monitor adapter queuing.

In particular, look at the avgsqsz, avgwqsz and sqfull fields to determine if you need to increase queue_depth. Don’t increase queue_depth beyond the disk manufacturer’s recommendations. lsattr -El hdisk? shows the current queue_depth setting. queue_depth is a disruptive change and requires a reboot.
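Checking and staging a queue_depth change might look like the following sketch. hdisk4 and the value 32 are example choices, not recommendations; confirm acceptable limits with your disk vendor first:

```shell
# Current queue_depth setting for one hdisk
lsattr -El hdisk4 -a queue_depth

# Watch the queues; avgsqsz, avgwqsz and sqfull are the fields of
# interest (5-second interval, 3 samples)
iostat -D hdisk4 5 3

# Stage a new value to take effect at the next reboot (-P defers the
# change, since queue_depth is disruptive)
chdev -l hdisk4 -a queue_depth=32 -P
```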

For Fibre Channel, the adapter also has an in-process queue, which can hold up to num_cmd_elems I/Os. The adapter submits the I/Os to the disk subsystem and uses direct memory access (DMA) to perform the I/O. You may need to consider changing two settings on the adapter. By default, num_cmd_elems is set to 200 and max_xfer_size is set to 0x100000; the latter equates to a DMA area of 16 MB. For a heavy I/O load, I set max_xfer_size to 0x200000, which increases the DMA area to 128 MB, and I’ve set num_cmd_elems as high as 2,048, although I normally start at 1,024. This has to be done before the hdisks, etc., are assigned or you’ll have to rmdev them all to set these values. lsattr -El fcs? shows the current settings. Before changing these, check with your disk vendor. The fcstat command can be used to monitor these. Look for entries like:

FC SCSI Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 2567
No Command Resource Count: 34114051

In the above output, the nonzero No Adapter Elements and No Command Resource counts show that num_cmd_elems isn’t high enough; a nonzero No DMA Resource Count would indicate that the DMA area (max_xfer_size) needs increasing as well. These are disruptive changes that require a reboot.
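That interpretation can be automated with a small helper. This is a sketch: check_fcstat is a hypothetical name, and it simply treats any nonzero count as a signal to act:

```shell
# check_fcstat: hypothetical helper that reads fcstat output on stdin
# and flags adapter queuing based on the two resource counters
check_fcstat() {
  awk -F: '
    /No DMA Resource Count/     { dma = $2 + 0 }
    /No Command Resource Count/ { cmd = $2 + 0 }
    END {
      if (cmd > 0) print "increase num_cmd_elems (No Command Resource Count = " cmd ")"
      if (dma > 0) print "increase max_xfer_size (No DMA Resource Count = " dma ")"
      if (cmd == 0 && dma == 0) print "no adapter queuing seen"
    }'
}

# Usage on a live system: fcstat fcs0 | check_fcstat
```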

When using VIO servers, max_xfer_size and num_cmd_elems should be set on the VIO servers and, if you’re using N_Port ID Virtualization (NPIV), they’ll also need to be set on the NPIV client LPARs. Don’t set the values on the NPIV client LPAR higher than on the VIO servers; when I tried this, my LPAR wouldn’t boot, which was probably fortunate, as I’m sure there would have been overruns.
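The adapter changes discussed above might be staged like this. fcs0, 1,024 and 0x200000 are example values; check with your disk vendor first, and remember the change is disruptive:

```shell
# Current adapter settings
lsattr -El fcs0 -a num_cmd_elems -a max_xfer_size

# Check for queuing at the adapter before and after
fcstat fcs0

# Stage new values for the next reboot (-P); with NPIV, set the VIO
# servers first and never set the client LPAR higher than the VIOS
chdev -l fcs0 -a num_cmd_elems=1024 -a max_xfer_size=0x200000 -P
```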


Jaqui Lynch is an independent consultant, focusing on enterprise architecture, performance and delivery on Power Systems with AIX and Linux.
