Your Quick Start Guide for HA/DR Solutions on IBM Power Systems

Offering Manager Steve Finnes breaks down active/active, active/passive and active/inactive solutions.

We’ve all heard the phrase, “time is money.” For a business, infrastructure or systems downtime can cost anywhere from hundreds of thousands to millions of dollars per hour. We often think of outages as a hardware reliability problem, but according to the Business Continuity Institute’s 2018 Horizon Scan Report, the top reasons for unplanned outages are human error, security issues and software bugs, while hardware is further down the list. And let’s not forget the major contributor to downtime: planned outage management, the time allocated for hardware, software and facility maintenance, and daily backup operations.

Solutions to Minimize Downtime

So, how can we minimize downtime during these disruptions? There are three solution approaches in the market today. A high availability and disaster recovery (HA/DR) solution may be constructed purely out of one of them or via a combination of them. 

Active/Active Solutions

An active/active solution such as Db2 Mirror for i or pureScale provides a Recovery Time Objective (RTO) of 0 (that is, there is no failover time) and a Recovery Point Objective (RPO) of 0 (that is, no data loss). It is also the more expensive option because it requires redundant licensing for all the processor cores in the HA configuration. Because an active/active solution requires updates to be synchronous with the application state at all times, the servers in this type of configuration need to be in close proximity. Therefore, this approach addresses HA only. Adding disaster recovery to the base configuration requires a solution type from one of the other two categories.

There are other types of active/active where the data on the target system can be accessed for read operations. These are the IP-based replication packages sold by third-party vendors. They are known to be complex to manage and expensive to maintain, and they consume significant processing resources on both source and target systems.

Active/Passive Solutions

Active/passive configurations such as PowerHA involve a failover operation in which the application is restarted on a secondary system. These failover operations are fast enough for most customer environments and only require a single license on the target system.

The license entitlements are transferred from the primary node to the secondary node. ISV licensing for these solutions may vary but most modern software packages recognize that in the case of active/passive, the production application needs to run only on one node in the cluster at a time. Both active/active and active/passive cover all data center outage types and in particular software maintenance.

Active/passive (PowerHA) also encompasses disaster recovery since the cluster nodes can be dispersed geographically. The RTO for active/passive is essentially the application restart time, and the RPO is zero in deployments based on shared storage and/or synchronous data replication (e.g., Metro Mirror for IBM storage solutions). For geographic dispersion, the RPO will be close to but greater than zero.
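To make the RTO and RPO terms concrete, here is a minimal Python sketch that measures the achieved recovery time and recovery point for a single failover event. The `OutageEvent` class, its field names and all timestamps are hypothetical, chosen only for illustration; they are not taken from PowerHA or any other product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class OutageEvent:
    """Timestamps for one failover event (all values are hypothetical)."""
    last_replicated_write: datetime  # newest update safely on the target
    failure: datetime                # moment production went down
    service_restored: datetime       # moment the application was usable again


def achieved_rpo(event: OutageEvent) -> timedelta:
    """Window of updates lost: failure time minus last replicated write."""
    return max(event.failure - event.last_replicated_write, timedelta(0))


def achieved_rto(event: OutageEvent) -> timedelta:
    """Time without service: restoration time minus failure time."""
    return event.service_restored - event.failure


failure = datetime(2024, 1, 1, 12, 0, 0)
restart = failure + timedelta(minutes=5)  # assumed application restart time

# Synchronous replication: the target is current at the moment of failure.
sync_event = OutageEvent(failure, failure, restart)

# Replication that trails production (assumed 8-second lag over distance).
lagging_event = OutageEvent(failure - timedelta(seconds=8), failure, restart)

print(achieved_rpo(sync_event))     # 0:00:00
print(achieved_rpo(lagging_event))  # 0:00:08
print(achieved_rto(sync_event))     # 0:05:00
```

With synchronous replication the lost-update window is zero, while any replication lag shows up directly as a nonzero recovery point. In both cases the recovery time is simply how long the restart takes.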

Active/Inactive Solutions

Active/inactive configurations, such as VM Recovery Manager, have a standby server that is unaware of the production environment. For example, the most basic type of active/inactive is restore from tape. With storage replication and virtual machine technology, we can instantiate production onto a target server via a boot-up or IPL process. Since the middleware and operating systems will only exist on one system at a time, there is no requirement for IBM entitlements to be on the target system. We can’t speak for the ISVs but, armed with the table below, you should be able to make the case that licensing should not be required on the target system.

Here’s a quick summary of the HA/DR solutions on IBM Power Systems:
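As a rough sketch, the properties described in the three sections above can be restated as data. The `Approach` class and its field names are assumptions made for illustration, and each value is limited to what the text above states. Where the text gives no figure, the entry says so.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    """One HA/DR category as described in the article (field names are illustrative)."""
    name: str
    example: str
    rto: str
    rpo: str
    target_licensing: str


APPROACHES = [
    Approach(
        name="active/active",
        example="Db2 Mirror for i, pureScale",
        rto="0 (no failover time)",
        rpo="0 (no data loss)",
        target_licensing="redundant licensing for all processor cores",
    ),
    Approach(
        name="active/passive",
        example="PowerHA",
        rto="application restart time",
        rpo="0 with shared storage or synchronous replication; "
            "close to but greater than 0 when geographically dispersed",
        target_licensing="single license; entitlements transfer from the primary node",
    ),
    Approach(
        name="active/inactive",
        example="VM Recovery Manager",
        rto="not stated (recovery is a boot-up or IPL on the target)",
        rpo="not stated",
        target_licensing="no IBM entitlements required on the target",
    ),
]


def by_name(name: str) -> Approach:
    """Look up a category by its name, e.g. 'active/passive'."""
    for approach in APPROACHES:
        if approach.name == name:
            return approach
    raise KeyError(name)


for a in APPROACHES:
    print(f"{a.name} ({a.example})")
    print(f"  RTO: {a.rto}")
    print(f"  RPO: {a.rpo}")
    print(f"  Target licensing: {a.target_licensing}")
```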

Leading Causes of Downtime

Let’s go back to the leading causes of downtime: human error and software bugs. This brings automation to the forefront. Ideally, you’ll want a solution that simplifies HA/DR operations and needs minimal human intervention. The solution should let you test your HA/DR options easily and regularly so that you’re always ready for a real outage event. Other considerations are the ability to keep planned outage management time near zero and, overall, a minimal performance impact from the HA/DR solution. IBM Power Systems deliver comprehensive, state-of-the-art solutions with these capabilities to meet your high availability and disaster recovery requirements.
