
New Low-Cost Storage Solutions for VM Administrators


An Old Nemesis Challenges VM and Analytic Administrators

Virtual machines (VMs) have changed the landscape of IT around the world. Growth in compute power under “Moore’s Law” has exceeded the needs of many applications and individual users. Meanwhile, the complexity of systems has increased, making the deployment of physical hardware more difficult. VMs have changed the entire landscape for IT administrators, especially those deploying systems locally and for cloud-based systems. VM deployments represent a $5 billion business annually. See Figure 1.

Source: IDC

VMs provide admins with the ability to quickly deploy, redeploy and expand operational capabilities of systems. For many years, the focus has been on enabling these three use cases. A single VM image can be used to redeploy a standard image for hundreds or even thousands of users. Because no single user can consume the entire computational capability, there are fewer physical servers, a lower probability of failure and a much faster rebuild time in the event of a failure. But this is no longer a challenge. Protecting machine states is the challenge.

As with traditional systems, backup, recovery and long-term retention have become the issues that keep VM administrators up at night. The average VM image is 100 GB in size. Performing image backups of thousands of systems is beginning to generate petabytes of data. Most modern VM backup applications use unstructured data storage as an easy repository for the large images. This has led to a proliferation of NAS systems that can cost up to $500 per TB, supporting a solution that was designed to reduce the cost of the total architecture. The old nemesis of the systems administrator rears its ugly head: IT budgets are being consumed by rarely touched images.

New Storage Solutions

New solutions in low-cost storage offer administrators a way to reduce spending on rarely touched, or cold, data. IBM Spectrum Archive provides a solution that in many ways is more capable than a NAS solution. Together, Spectrum Scale and Spectrum Archive provide a NAS interface to an unstructured repository that migrates all data to physical tape. Tape can stream at 705 MBps per connection and up to 135 GBps in a single tape solution. Since VM images are very large and must be complete before being redeployed, a tape-based system can have an entire image ready for deployment in just over two minutes.
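The two-minute figure follows from simple arithmetic on the numbers quoted here. The sketch below is illustrative only: the function name is hypothetical, it assumes decimal units (1 GB = 1,000 MB) and a sustained 705 MBps stream for a 100 GB image, and it ignores tape mount and positioning time, which the article does not quantify.

```python
def stream_seconds(image_gb: float, rate_mb_per_s: float) -> float:
    """Seconds needed to stream an image at a sustained rate.

    Assumes decimal units (1 GB = 1000 MB) and no mount or seek overhead.
    """
    return image_gb * 1000 / rate_mb_per_s


# Average VM image size and per-connection tape rate quoted in the article.
seconds = stream_seconds(image_gb=100, rate_mb_per_s=705)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```

Under these assumptions the transfer takes roughly 142 seconds, which is consistent with "just over two minutes."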

As large as the VM image backups are, there’s an even bigger challenge for the analytic engines of the world. The key to a cognitive computing world is the ability to retain data. Many analytic engines “crunch” a relatively small amount of data in any single sequence. Once the sequence is complete, the data has to be offloaded to free up space for the next sequence. A great example is Netezza: the average Netezza archive image is 2 TB of data.

Netezza archives the 2 TB of data as an unstructured data image, traditionally pushing the image to a NAS storage system. In all cases the data is used as a glob and is pulled back into the Netezza system completely before being sequenced. The streaming performance outlined above is critical to these systems because it allows more sequencing in shorter periods of time. Since the images are very large, they can quickly add up to a petabyte of stored data.

Performance of a solution can be purchased, and flash can be used to make a redeploy instantaneous. However, in this use case, storage infrastructure is growing at an average of 35 percent year over year, well beyond the average 5 percent growth in IT budgets each year. Reducing the long-term storage cost can fund high-performance storage. For an IT administrator, improving performance within a contained budget is a challenge. Spectrum Archive provides the solution with a cost of less than 0.3 cents per GB per month.

IDC, Worldwide Virtual Machine and Cloud System Software Market Shares, 2014: Open Source Disruption in Cloud

That is lower than any other storage medium for large-scale data; not even public cloud can compete. An analysis of how images are retained has demonstrated that select images are being kept for long periods of time. In many cases, for instance, enterprises require monthly image captures to be retained for two years. An enterprise with 1,500 VM images will retain 4 PB of images on a rolling two-year basis, with 1.5 PB in weekly rotation. Comparing low-cost solutions reveals that if only 5 percent of the data is ever recalled, Spectrum Archive is as low as one-tenth the cost of public cloud providers. (See Table 1.)
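The retention figure can be checked with back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration, not the analysis behind Table 1: it assumes every image is the 100 GB average quoted earlier, that each monthly capture is a full image, that units are decimal, and that the 0.3 cents per GB per month figure is treated as a ceiling. The function names are hypothetical.

```python
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 1,000,000 GB


def retained_pb(vm_count: int, image_gb: float, captures_kept: int) -> float:
    """Capacity in PB needed to keep `captures_kept` full captures of every VM."""
    return vm_count * image_gb * captures_kept / GB_PER_PB


def monthly_cost_usd(capacity_pb: float, cents_per_gb_month: float) -> float:
    """Monthly storage cost in dollars at a flat per-GB rate."""
    return capacity_pb * GB_PER_PB * cents_per_gb_month / 100


# 1,500 VMs, 100 GB average image, 24 monthly captures kept for two years.
two_year_pb = retained_pb(vm_count=1500, image_gb=100, captures_kept=24)
ceiling = monthly_cost_usd(two_year_pb, cents_per_gb_month=0.3)

print(f"Two years of monthly captures: {two_year_pb:.1f} PB")
print(f"Monthly cost ceiling at 0.3 cents/GB: ${ceiling:,.0f}")
```

With these inputs the monthly captures alone come to about 3.6 PB, in line with the roughly 4 PB cited, and the cost ceiling works out to roughly $10,800 per month.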

Those are compelling numbers, but the analytic space must keep data forever to enable trends to be more reliably analyzed. According to an IDC Worldwide Storage Big Data forecast, big data will account for over 73 exabytes of data per year by 2019. See Figure 2. That is a compound annual growth rate of over 29 percent. Given that IT budgets are growing at an average of only 6 percent yearly, according to Jason Buffington, an ESG principal analyst, the need to reduce storage costs is a high priority for nearly every IT admin.
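To see why the gap between the two growth rates matters, it helps to compound them side by side. The following is a minimal sketch under simplifying assumptions: the 29 percent and 6 percent rates hold steady every year, the five-year horizon is chosen only for illustration, and the cost per unit of storage is treated as constant. The function names are hypothetical.

```python
def compound(rate: float, years: int) -> float:
    """Growth multiple after compounding `rate` annually for `years` years."""
    return (1 + rate) ** years


def growth_gap(data_rate: float, budget_rate: float, years: int) -> float:
    """How many times faster data has grown than the budget after `years`."""
    return compound(data_rate, years) / compound(budget_rate, years)


# Data growing 29 percent a year against a budget growing 6 percent a year.
for year in range(1, 6):
    print(f"year {year}: data outgrows budget by {growth_gap(0.29, 0.06, year):.2f}x")
```

Under these assumptions, data outgrows the budget by more than two and a half times within five years, which is the pressure that pushes cold data toward a cheaper tier.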

Transitioning Seamlessly

In relative terms, cognitive applications are crunching a very small subset of the massive data repositories required to do the trend analysis. Hadoop file systems rely solely on disk-based storage to retain data. It is no secret that keeping many petabytes of data on shared-nothing clusters gets very expensive, given the need for three copies of data. Spectrum Scale and Spectrum Archive enable a seamless transition between high-performance analytic requirements and massive long-term storage requirements. Data moves between the tiers of storage transparently, and it is always available to the analytics engine. The automated tiering drives down the cost of infrastructure and management, with the most significant savings on multiple-copy and long-term retention data.

Reducing Costs

The bigger story for cognitive computing is the return on invested capital (ROIC). In a 2 PB, five-year retention model, Spectrum Scale combined with Spectrum Archive was able to reduce the monthly cost of storing data by almost $40,000 compared with a traditional disk-based storage environment, according to an IBM internal study. Given that 80 percent of the data is a second or third copy, or data that is untouched for long periods of time, this only makes sense. What is not obvious is that the ROIC enables faster compute capability by funding an all-flash cognitive engine.

Configuring Storage in a New Way

The strength of software-defined storage, combined with industry-leading hardware, is providing customers with new ways to configure storage. Data storage is no longer a secondary thought. Data is the new natural resource, and like any natural resource it must be preserved for the next generation. Whether the data being preserved comes from new use cases of virtual machines or from the next generation of cognitive computing, the challenges of meeting budgets and end-user expectations will always be present. Leveraging the most optimized and agile cost-to-performance model with the IBM solutions provides the highest return on investment.
