Understanding and Tuning CPU Throughput

CPU Throughput

Few topics related to AIX performance analysis and tuning are more misunderstood than CPU throughput: how databases, applications, middleware and utilities use CPUs in an IBM Power Systems environment. What follows is a basic primer on determining CPU throughput and usage in your AIX systems. By combining this knowledge with performance statistics, you can tune your systems for the best performance.

Raw Versus Scaled Throughput

Power/AIX systems have two modes of CPU throughput: raw and scaled. Which mode your systems use at any given time depends on two factors: the default behavior of CPUs in the AIX environment, and how your executable is programmed for CPU usage.

Understanding the default behavior of CPUs in AIX requires a basic understanding of CPU architecture. Power Systems physical CPUs are partitioned into two or more hardware threads that are mapped to the same number of logical processors (LPs) on a virtual CPU. The number of threads depends on the processor generation. Each physical CPU in a POWER5 or POWER6 system has two hardware threads mapped to two logical processors on a virtual CPU, while a POWER7 CPU has four hardware threads mapped to four LPs, and a POWER8 CPU has eight hardware threads mapped to eight LPs.

Each of the hardware threads on any given CPU is named in an order of precedence. For example, consider a POWER7 CPU. The first of the four hardware threads in a physical POWER7 CPU is called the primary thread; the other hardware threads are collectively called sibling threads and are individually distinguished as the secondary, tertiary and quaternary threads. Alternatively, you can apply numbers to these hardware threads: 0-3 for the first POWER7 CPU in an AIX system, 4-7 for the second CPU, 8-11 for the third, and so on. Again, each of these hardware threads is mapped to a logical processor on a virtual CPU; it's the virtual CPU to which working threads are bound. The Power Hypervisor then dispatches the virtual CPUs to run on physical CPUs, where the threads actually do their work.
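To make the numbering scheme concrete, here is a minimal Python sketch that turns a logical processor number into a CPU index and a thread name. The function name `locate` and the lookup tables are hypothetical, introduced only for illustration, and the sketch assumes every CPU exposes its full set of hardware threads with contiguous, zero-based numbering as described above.

```python
# Hardware threads per CPU by processor generation, as given in the text.
SMT_THREADS = {"POWER5": 2, "POWER6": 2, "POWER7": 4, "POWER8": 8}

# Names the article uses for the first four threads; any later thread
# (POWER8 only) falls back to a plain number in this sketch.
ROLE_NAMES = ["primary", "secondary", "tertiary", "quaternary"]


def locate(lp, generation="POWER7"):
    """Return (cpu_index, thread_role) for a zero-based logical processor."""
    threads = SMT_THREADS[generation]
    cpu_index, slot = divmod(lp, threads)
    role = ROLE_NAMES[slot] if slot < len(ROLE_NAMES) else f"thread {slot}"
    return cpu_index, role


if __name__ == "__main__":
    for lp in (0, 3, 4, 9):
        print(lp, locate(lp))
```

On a POWER7 system, logical processor 9 lands on CPU index 2 (the third CPU, since the index is zero-based) as its secondary thread, which agrees with the 8-11 range given for the third CPU.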

Okay, with me so far? The default behavior in AIX systems is for threads to utilize the primary hardware thread on any given CPU. When this hardware thread is saturated with work, the workload falls over to the primary thread on the next CPU. The workload does not take advantage of the sibling threads in a CPU. This is called raw throughput mode. Most databases (Oracle, Sybase, Cache, etc.) use this scheme; they opt for the greatest throughput on any CPU over better utilization of all the hardware threads on that CPU. As a result, most I/O operations (storage or network) complete faster in raw throughput mode.

Contrast this with the other mode of operation: the scaled throughput mode. With scaled throughput, both the primary and sibling threads of any given CPU are activated to do work. Only when all four hardware threads are saturated (referencing our previous sample POWER7 CPU) will the workload fall over to the next CPU. Many application vendors choose scaled throughput to handle multiple concurrent calculations.
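The difference between the two modes comes down to the order in which logical processors receive work. The following Python sketch is a toy model of that ordering, not the AIX dispatcher itself; the function name `placement_order` is hypothetical. In raw mode it lists only the primary threads, because the description above does not say what happens once every primary thread is busy.

```python
def placement_order(num_cpus, threads_per_cpu, mode):
    """List logical processor numbers in the order new work reaches them."""
    if mode == "raw":
        # Only the primary thread (slot 0) of each CPU is used, one CPU
        # after another. Sibling threads are left out on purpose.
        return [cpu * threads_per_cpu for cpu in range(num_cpus)]
    if mode == "scaled":
        # Every hardware thread of a CPU is filled before moving on.
        return list(range(num_cpus * threads_per_cpu))
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Three POWER7-style CPUs with four hardware threads each.
    print("raw:   ", placement_order(3, 4, "raw"))
    print("scaled:", placement_order(3, 4, "scaled"))
```

For three four-thread CPUs, the raw order visits logical processors 0, 4 and 8, while the scaled order walks through 0 to 11 in sequence.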

So how do you determine whether your workload is using raw or scaled throughput? Use “mpstat.” This stands for “multi-processor statistics,” and not enough administrators know about it. It's yet another of those extremely useful tools that gets overlooked even though it ships with every AIX distribution. While utilities like vmstat and iostat display aggregate CPU usage statistics, mpstat lets you evaluate the load on every logical processor in your system. With mpstat output, you can not only tell at a glance if the workload in your system is doing raw or scaled CPU throughput, you can also see the load each LP is under.

Let’s look at samples of mpstat output from each type of workload. First, here's an LPAR with a workload using raw throughput (I’ve omitted some lines for brevity):

lpar(/)#mpstat -w 3
cpu  min  maj  mpc  int   cs  ics   rq  mig    lpa  sysc    us    sy    wa    id    pc   %ec  lcs
  0    5    0    0  206  129    1    1    0  100.0   185  60.0  40.0   0.0   0.0  0.00   0.0  160
  1    0    0    0   11    0    0    0    0      -     0   0.0   6.0   0.0  94.0  0.00   0.0   11
  2    0    0    0   11    0    0    0    0      -     0   0.0   4.6   0.0  95.4  0.00   0.0   11
  3    0    0    0   11    0    0    0    0      -     0   0.0   4.3   0.0  95.7  0.00   0.0   11
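In this sample, logical processor 0 (the primary thread) carries all of the load, with 60 percent user plus 40 percent system time, while its three siblings sit roughly 95 percent idle. That is the signature of raw throughput. The same reading can be automated with a short Python sketch. The function names `busy_by_lp` and `classify`, the 50 percent busy threshold and the four-threads-per-CPU default are all assumptions made for illustration; the sketch also assumes each mpstat row has already been joined onto a single line.

```python
SAMPLE = """\
cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc %ec lcs
0 5 0 0 206 129 1 1 0 100.0 185 60.0 40.0 0.0 0.0 0.00 0.0 160
1 0 0 0 11 0 0 0 0 - 0 0.0 6.0 0.0 94.0 0.00 0.0 11
2 0 0 0 11 0 0 0 0 - 0 0.0 4.6 0.0 95.4 0.00 0.0 11
3 0 0 0 11 0 0 0 0 - 0 0.0 4.3 0.0 95.7 0.00 0.0 11
"""


def busy_by_lp(text):
    """Map each logical processor number to its us + sy percentage."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    cpu_col, us_col, sy_col = (header.index(n) for n in ("cpu", "us", "sy"))
    busy = {}
    for row in rows:
        # Skip malformed rows and rows whose cpu field is not a number.
        if len(row) != len(header) or not row[cpu_col].isdigit():
            continue
        busy[int(row[cpu_col])] = float(row[us_col]) + float(row[sy_col])
    return busy


def classify(busy, threads_per_cpu=4, threshold=50.0):
    """Label each CPU 'raw', 'scaled' or 'idle' from per-LP busy figures."""
    hot = {}
    for lp, pct in busy.items():
        cpu, slot = divmod(lp, threads_per_cpu)
        hot.setdefault(cpu, set())
        if pct >= threshold:
            hot[cpu].add(slot)
    result = {}
    for cpu, slots in hot.items():
        if not slots:
            result[cpu] = "idle"
        elif slots == {0}:
            result[cpu] = "raw"      # only the primary thread is busy
        else:
            result[cpu] = "scaled"   # at least one sibling thread is busy
    return result


if __name__ == "__main__":
    print(classify(busy_by_lp(SAMPLE)))
```

A single interval is a thin basis for a verdict, so in practice you would want to apply a check like this across several intervals before drawing conclusions.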

Mark J. Ray has been working with AIX for 23 years, 18 of which have been spent in performance. His mission is to make the diagnosis and remediation of the most difficult and complex performance issues easy to understand and implement. Mark can be reached at
