Engineered Together—IBM Java and zEC12 Boost Workload Performance

Transactional Memory

The zEC12 is the first general-purpose IBM server to incorporate transactional memory technology, first used commercially to help make the IBM Blue Gene/Q-based Sequoia system at Lawrence Livermore National Lab the fastest supercomputer in the world. In zEC12, IBM adapted this technology to enable software to better support concurrent operations that use a shared set of data, such as financial institutions processing transactions against the same set of accounts.

The zEC12’s Transactional Execution (TX) facility is an architectural framework that allows for lockless interlocked execution of a block of code called a transaction. A transaction is a segment of code that appears to execute atomically to other CPUs: other processors in the system see either all or none of the storage updates it makes.

Transactions are delimited by TBEGIN and TEND instructions. The hardware detects a storage conflict if another CPU updates storage used by the transaction, as shown in Figure 1. The conflict causes the transaction to abort, rolling back hardware state (general-purpose registers and storage) to what it was at the TBEGIN instruction. Program execution is also rolled back to the instruction immediately following the TBEGIN, with TX now disabled. A transaction-failure condition code is set so that program flow can be diverted to a transaction-failure handler. This handler can choose to retry the transaction or fall back to traditional coarse locking to guarantee forward progress of the program.
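The retry-or-lock decision made by a failure handler can be sketched in plain Java. This is only an illustration of the control flow: Java code cannot issue TBEGIN or TEND directly, so the transactional attempt is modeled as a callback that reports whether it committed. The class name TxFallbackSketch, the retry budget of three, and the run() helper are all assumptions made for this example.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a transaction-failure handler. Nothing here uses the
// TX facility; a "transaction" is modeled as a callback returning true on
// commit and false on abort.
class TxFallbackSketch {
    static final int MAX_RETRIES = 3; // assumed retry budget
    static final ReentrantLock fallbackLock = new ReentrantLock();

    /** Returns "tx" if a transactional attempt committed, "lock" if the locked path ran. */
    static String run(BooleanSupplier tryTransaction, Runnable lockedPath) {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            if (tryTransaction.getAsBoolean()) {
                return "tx"; // committed: no lock was ever taken
            }
            // aborted: state is assumed rolled back, so retrying is safe
        }
        // Retry budget exhausted: serialize to guarantee forward progress.
        fallbackLock.lock();
        try {
            lockedPath.run();
            return "lock";
        } finally {
            fallbackLock.unlock();
        }
    }

    public static void main(String[] args) {
        int[] counter = {0};
        int[] abortsLeft = {2}; // simulate two aborts, then a commit
        String path = run(() -> {
            if (abortsLeft[0]-- > 0) {
                return false;
            }
            counter[0]++;
            return true;
        }, () -> counter[0]++);
        System.out.println(path + " " + counter[0]); // prints "tx 1"
    }
}
```

Capping the number of retries is what lets the locked path guarantee forward progress when conflicts keep recurring.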

TX can be used to avoid coarse-grained locking by providing a mechanism for safe, lock-free, concurrent execution of critical sections. Figure 2 provides an example of how lock elision can be implemented using transactional execution. The right column of Figure 2 depicts how coarse locking imposes serialization on threads, even though the threads are only reading the hash. Serialization is required to safely handle updates to the hash by a writer, and results in a total execution time “T.” Lock elision exploits transactional memory to execute the reads in parallel while still safely falling back to serialized execution in situations where a writer is actually updating the hash.
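A rough software analogy to this pattern is the optimistic read mode of java.util.concurrent.locks.StampedLock (Java 8 and later). It is not the TX facility and involves no hardware rollback: readers simply proceed without locking, then validate that no writer intervened and fall back to a real read lock if one did. The ElidedReadAccounts class and its two account fields are hypothetical names chosen for this sketch.

```java
import java.util.concurrent.locks.StampedLock;

// Software analogy only: optimistic reads run in parallel and fall back to
// a real lock when a writer was active. This does not use hardware TX.
class ElidedReadAccounts {
    private final StampedLock lock = new StampedLock();
    private long checking;
    private long savings;

    ElidedReadAccounts(long checking, long savings) {
        this.checking = checking;
        this.savings = savings;
    }

    // Writer: always serialized under the write lock.
    void transfer(long amount) {
        long stamp = lock.writeLock();
        try {
            checking -= amount;
            savings += amount;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    // Reader: tries without locking first, like an elided lock.
    long total() {
        long stamp = lock.tryOptimisticRead();
        long c = checking;
        long s = savings;
        if (!lock.validate(stamp)) {
            // A writer intervened: discard the values and reread under a lock.
            stamp = lock.readLock();
            try {
                c = checking;
                s = savings;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return c + s;
    }

    public static void main(String[] args) throws InterruptedException {
        ElidedReadAccounts accounts = new ElidedReadAccounts(100, 50);
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                accounts.transfer(1);
            }
        });
        writer.start();
        boolean consistent = true;
        for (int i = 0; i < 10_000; i++) {
            consistent &= accounts.total() == 150; // a transfer never changes the sum
        }
        writer.join();
        System.out.println("consistent=" + consistent); // prints "consistent=true"
    }
}
```

As with lock elision, read-only callers pay for serialization only when a writer is actually active.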

The zEC12 TX facility includes a rich set of functions to enable a wide range of optimizations, such as lock elision, multi-word compare-and-swap (CAS) primitives, and speculative optimizations.

IBM Java 7 SR3 on System z exploits the TX facility to improve the performance of the ConcurrentLinkedQueue (CLQ) class, which is part of the standard Java library (java.util.concurrent). CLQ is a thread-safe, first-in-first-out, unbounded linked queue data structure. Elements are offered at the tail and polled from the head, as seen in Figure 3.
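For readers unfamiliar with the class, a minimal usage sketch shows the offer-at-tail, poll-from-head behavior described above. The ClqFifoDemo class and its drain() helper are hypothetical names for this example; offer() and poll() are the standard library methods.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

class ClqFifoDemo {
    // Polls until the queue is empty and concatenates the elements in poll order.
    static String drain(ConcurrentLinkedQueue<String> queue) {
        StringBuilder polled = new StringBuilder();
        String element;
        while ((element = queue.poll()) != null) { // poll() returns null when empty
            polled.append(element);
        }
        return polled.toString();
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
        queue.offer("a"); // offered at the tail
        queue.offer("b");
        queue.offer("c");
        System.out.println(drain(queue)); // prints "abc": polled from the head
        System.out.println(queue.poll()); // prints "null": queue is now empty
    }
}
```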

The standard implementations of the CLQ.offer() and CLQ.poll() methods use non-blocking algorithms. They’re based on loops of atomic CAS operations, the building blocks of many concurrent algorithms. The IBM JVM compiles these algorithms into hundreds of instructions.
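To illustrate what a loop of atomic CASs looks like, here is a minimal lock-free stack built on AtomicReference.compareAndSet. It is deliberately simpler than the library's queue and is not the CLQ algorithm; the CasStackSketch class is a hypothetical example meant only to show the read, compute, CAS, retry shape.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal lock-free stack: every update is a CAS retry loop on the head.
// Illustrative only; this is not how ConcurrentLinkedQueue is implemented.
class CasStackSketch<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        while (true) {
            Node<T> current = head.get();                 // read
            Node<T> updated = new Node<>(value, current); // compute
            if (head.compareAndSet(current, updated)) {   // publish if unchanged
                return;
            }
            // Another thread changed head first: retry with the new value.
        }
    }

    T pop() {
        while (true) {
            Node<T> current = head.get();
            if (current == null) {
                return null; // empty stack
            }
            if (head.compareAndSet(current, current.next)) {
                return current.value;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CasStackSketch<Integer> stack = new CasStackSketch<>();
        Thread[] pushers = new Thread[4];
        for (int t = 0; t < pushers.length; t++) {
            pushers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    stack.push(i);
                }
            });
            pushers[t].start();
        }
        for (Thread pusher : pushers) {
            pusher.join();
        }
        int popped = 0;
        while (stack.pop() != null) {
            popped++;
        }
        System.out.println(popped); // prints 4000: no push was lost
    }
}
```

Each failed compareAndSet forces another pass through the loop, so contention lengthens the instruction path taken per operation.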

With TX support in IBM Java 7 SR3, the JVM can replace CLQ.offer() and CLQ.poll() with leaner instruction sequences that employ transactional execution at run time. This greatly improves the efficiency of the algorithm by significantly reducing the path length and data access. The micro-benchmark illustrated in Figure 4 shows up to a twofold improvement in scalability.

The TX-based algorithm for CLQ can be enabled on IBM Java 7 SR3 by specifying the -Xaggressive flag on the command line. No modification of Java source code is required to take advantage of this feature. IBM plans to further exploit the TX facility in future releases of Java.

Clark Goodrich is a senior software engineer in the IBM Systems, IBM Z performance organization working on the Java compiler. He has been with IBM since 1978 and holds six IBM patents. Prior to joining IBM, he did real-time programming at NASA's Goddard Space Flight Center for the Space Shuttle.

Jerry Zheng is a staff software developer in Java Just-in-Time Compiler development who is leading the effort of supporting transactional memory in IBM J9 JVM. He joined IBM in 2008 and has been working on exploiting new System z hardware features in a dynamic compiler.

Marcel Mitran is a Distinguished Engineer and Chief Technology Officer for IBM LinuxONE. He is based out of the IBM Toronto Lab. Marcel has spent more than 15 years developing hardware and software technology to solve problems of the modern enterprise.
