Enterprise-Class Cloud Storage Is Just a Tape Away
Delivering cost-effective backup and archive for the traditional data center has been the subject of much debate. In a 24x7 environment, the debate has centered on how fast organizations can restore files or whole virtual machines after a problem. Disk-based backup systems offer near-instant recovery, which has driven some IT shops to use disk-based solutions for regular backups. But while daily backups were once a major focus, they’re now just a small part of an enterprise’s overall data-retention strategy. Today the discussion is more about how long data remains active and what access is required for objects that remain archived for the long term. Archives might exist for decades. These new requirements overshadow daily backups in sheer size and complexity.
IBM is delivering unique and leading-edge technologies into the growing archive and big data storage market. Announcements earlier in the year for IBM Spectrum Storage are shaping the definition of the ecosystem around the retention of things. In September 2015, IBM announced the most significant release of IBM Spectrum Protect (formerly Tivoli Storage Manager) in several years, further defining the management of retained things. This month, IBM rounded out the retention announcements by delivering LTO 7 tape technology.
LTO 7 Highlights
While LTO 7’s 87 percent performance improvement will be great for backup, it’s not the most interesting part of the announcement. The 140 percent capacity improvement will make a real difference to data-center economics and will probably drive tape to a dominant or disruptive storage position in the archive and big data space.
The improvements in storage density will lead to a 65 percent reduction in floor space, making tape a very desirable resource for those with smaller machine rooms as well as for green and cost-conscious data centers. Figure 1 shows the LTO improvements by generation.
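The headline percentages can be checked with a little arithmetic. The sketch below assumes the native (uncompressed) figures from the published LTO generation specifications, 2.5 TB and 160 MB/s for LTO 6 and 6 TB and 300 MB/s for LTO 7; those numbers are not stated in this article, so treat them as assumptions.

```python
# Native (uncompressed) specifications per generation.
# Assumption: these values come from the published LTO generation
# specifications, not from this article.
LTO_SPECS = {
    "LTO 6": {"capacity_tb": 2.5, "speed_mb_s": 160},
    "LTO 7": {"capacity_tb": 6.0, "speed_mb_s": 300},
}


def percent_gain(old, new):
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100


capacity_gain = percent_gain(
    LTO_SPECS["LTO 6"]["capacity_tb"], LTO_SPECS["LTO 7"]["capacity_tb"]
)
speed_gain = percent_gain(
    LTO_SPECS["LTO 6"]["speed_mb_s"], LTO_SPECS["LTO 7"]["speed_mb_s"]
)

print(f"Capacity gain: {capacity_gain:.1f}%")
print(f"Speed gain: {speed_gain:.1f}%")
```

Under these assumptions the capacity gain works out to 140 percent and the speed gain to 87.5 percent, in line with the figures quoted above.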
There are also improvements to the TS4500 tape library support. Along with support for LTO 7 drives, IBM has increased the expandability to one base frame and up to 17 expansions. This drives up the library totals:
- LTO can now have 23,170 cartridges, which provides up to 139 PB per library (347.5 PB with 2.5:1 compression).
- 3592 can now have 17,750 cartridges, which provides up to 175.5 PB per library (526.5 PB with 3:1 compression).
- The number of drives supported has increased to 128.
This provides an 11x improvement for cloud data management.
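The LTO library total follows directly from the cartridge count. A minimal sketch, assuming 6 TB native capacity per LTO 7 cartridge (from the LTO 7 specification, not stated in this article) and decimal units (1 PB = 1,000 TB); the function name is illustrative only.

```python
LTO7_NATIVE_TB = 6  # assumption: native capacity per LTO 7 cartridge


def library_capacity_pb(cartridges, native_tb_per_cartridge, compression_ratio=1.0):
    """Return library capacity in PB, using 1 PB = 1,000 TB."""
    return cartridges * native_tb_per_cartridge * compression_ratio / 1000


native_pb = library_capacity_pb(23_170, LTO7_NATIVE_TB)
compressed_pb = library_capacity_pb(23_170, LTO7_NATIVE_TB, compression_ratio=2.5)

print(f"Native: {native_pb:.2f} PB")
print(f"Compressed (2.5:1): {compressed_pb:.2f} PB")
```

The results, about 139 PB native and about 347.5 PB at 2.5:1, agree with the LTO figures in the list once rounding is taken into account.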
LTO 7 also improves the small tape library footprints:
- IBM’s entry tape subsystem, the TS2900, now has a very capable 54 TB native capacity.
- The midrange tape subsystems starting with the TS3100/3200 now support up to 288 TB, with the larger TS3310 achieving a massive 2.5 PB for a midrange footprint.
Yes, IBM hears the sighs: “Tape is old tech; it’s dying.” But tape is about to be a disruptive force in long-term data storage. Tape has been around for a long time, and it’s designed to last a long time. The wheel is old technology in an era where we can now fly, hover and float, but most travel still uses the wheel. Like the wheel, tape is cheap and efficient.
Tape has two current uses—one traditional and one evolving. The first and original use of tape is actually a combination of two data-retention requirements—backup and archive data.
Data backups are used to recover data that may have been corrupted by user error, application error or hardware failure. The recovery might be as simple as a single file or as extensive as a full system restore. These backups are normal day-to-day tape operations and are used by many clients throughout the world. The historical problems for backup operations are the possible need for application downtime and the speed of full system recovery.
More recently, the increasing number of virtualized environments has exacerbated the problem and driven the move to disk-based snapshots/backups. IBM Spectrum Protect can perform high-speed snapshots of the virtual infrastructure and perform near-instantaneous virtual machine restores to meet the new service levels required by 24x7 operations. This is a growing area and a place where some tape is being replaced by HDDs.
Using hard-drive-based solutions rather than tape presents a major economic conundrum for IT architects. The drive-based infrastructure costs significantly more than tape in most respects (hardware, space, power and cooling). These costs are higher because disks are continuously in motion, whereas tape is predominantly at rest.
Where’s the Proof?
A recent Enterprise Strategy Group (ESG) study offered insight into representative client environments. “A Comparative TCO Study: VTLs and Physical Tape - With a Focus on Deduplication and LTO-5 Technology,” by Mark Peters, shows that in each case, tape is more economical. One exception might be the need for very fast restore. If that’s the most important criterion, then it will outweigh the higher cost.
Figure 2 shows four scenarios from the ESG study. Within each scenario, a highlighted multiplier (between 2.46x and 4.16x) indicates the number of times more expensive the VTL offering is compared to the physical tape offering.
One note on this study: One of its dimensions is deduplication, a very hot topic at the moment because of the potential cost savings. But a word of caution: Deduplication only pays off if there are many saves of similar environments, or the same environment is saved many times. Additionally, not every object type responds well to deduplication. For example, videos and images don’t produce good deduplication results. So make sure you get the deduplication ratio right when you’re evaluating the benefits of deduplication to your environment.
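To see why the ratio depends so heavily on the data, here is a minimal sketch that estimates a deduplication ratio by hashing fixed-size chunks. It is not how any particular product works: real deduplication engines often use variable-size chunking, and the random byte streams below are only a stand-in for already-compressed content such as video. All names are hypothetical.

```python
import hashlib
import random

CHUNK_SIZE = 4096  # fixed-size chunks; a simplifying assumption


def dedup_ratio(streams, chunk_size=CHUNK_SIZE):
    """Return logical chunks divided by unique chunks across all streams."""
    total = 0
    unique = set()
    for data in streams:
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            unique.add(hashlib.sha256(chunk).digest())
            total += 1
    return total / len(unique) if unique else 1.0


rng = random.Random(42)  # seeded so the example is repeatable


def random_bytes(n):
    """Return n pseudo-random bytes from the seeded generator."""
    return rng.getrandbits(8 * n).to_bytes(n, "little")


# Ten nightly saves of the same 64-chunk image, one chunk changed per night.
base = random_bytes(64 * CHUNK_SIZE)
backups = []
for night in range(10):
    image = bytearray(base)
    start = night * CHUNK_SIZE
    image[start:start + CHUNK_SIZE] = random_bytes(CHUNK_SIZE)
    backups.append(bytes(image))

# Ten unrelated streams standing in for distinct video or image files.
media = [random_bytes(64 * CHUNK_SIZE) for _ in range(10)]

backup_ratio = dedup_ratio(backups)
media_ratio = dedup_ratio(media)
print(f"Repeated backups: {backup_ratio:.2f}:1")
print(f"Unrelated media:  {media_ratio:.2f}:1")
```

The repeated saves deduplicate well because almost every chunk has been seen before, while the unrelated streams share nothing and stay at 1:1. Measuring a sample of your own data in this spirit is a more reliable basis for a TCO estimate than a vendor's headline ratio.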
The second use of tape in a traditional context is for archive data. In the past, this type of data was defined as data that’s inactive, and we would save it with the hope we would never have to access it again. The use of this inactive archive has changed, and business applications may now still want some access after the data has been archived. The trigger could be a regulatory compliance requirement, an eDiscovery request or a new application request to retrieve archived data. This brings in a new category of data retention called active archive.
This area of compliance is growing rapidly in enterprise businesses. The capability to access or recover the original data is vital. The cost of non-compliance can be catastrophic. The archive data might also be required as an active source of historical information that the business needs to make decisions.
The archive will be another area where both disk- and tape-based solutions will be proposed. This growth is driving the new debate on whether to continue using tape archive or switch to a disk-based archive. The challenge for businesses and cloud service providers is how to contain multi-petabyte disk archive farms that will quickly become exabyte size. Some organizations are already dealing with this problem, and with the problem of how to fund it within flat IT budgets.
The Right Mix
With leading-edge tools like Spectrum Control and Spectrum Protect, organizations have the capability to analyze the usage within an archive. This can then help the user determine the best media for active archive. For example, it would be possible to implement a business policy that active data (live or active archive) remains on disk for a month, after which it is moved to tape. This may be tape storage for archive under Spectrum Protect control or tape storage that is a tier under Spectrum Scale’s control.
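The one-month policy described above can be sketched in a few lines. This is an illustration of the idea only, not Spectrum Protect or Spectrum Scale policy syntax; the class, function and file names are hypothetical, and the sketch assumes that "active" means accessed within the last 30 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

DISK_RETENTION = timedelta(days=30)  # "a month", per the example policy


@dataclass
class ArchiveObject:
    name: str
    last_access: datetime
    tier: str = "disk"


def apply_policy(objects, now, retention=DISK_RETENTION):
    """Move disk objects not accessed within the retention window to tape.

    Returns the names of the objects that were moved.
    """
    moved = []
    for obj in objects:
        if obj.tier == "disk" and now - obj.last_access > retention:
            obj.tier = "tape"
            moved.append(obj.name)
    return moved


now = datetime(2015, 11, 1)
objects = [
    ArchiveObject("q3-report.pdf", now - timedelta(days=3)),
    ArchiveObject("2014-sensor-logs.tar", now - timedelta(days=200)),
]
moved = apply_policy(objects, now)
print(moved)
```

Keying the rule on last access rather than creation date is a design choice: it keeps frequently read older data on disk, which matches the active-archive idea of letting usage drive placement.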
The Clipper Group has just repeated its annual study, “Continuing the Search for the Right Mix of Long-Term Storage Infrastructure - A TCO Analysis of Disk and Tape for Archive,” and the results are stunning. The July 2015 study centered on LTO 6. Figure 3 illustrates that LTO 6 tape has a 6.18x cost benefit over a disk-based archive solution. The majority of the saving is derived from the low cost of tape cartridges compared to disk drives. The library, disk controller, energy and floor space costs, while measurable, become insignificant as the storage capacity grows.
The Clipper Group produced an update in September for LTO 7 using estimates for LTO 7 drive and cartridge cost. This update presents LTO 7 as still having a greater than 6x cost benefit over a disk-based solution.
New Factors for Tape
The final area where tape will become a disruptive technology is in the new object types being driven by emerging technologies related to social, analytic and machine data. LTO 7 and IBM Spectrum Archive (featuring Linear Tape File System) start to challenge the status quo when organizations see that the capacity of IBM Spectrum Archive grows to a massive 1.053 exabytes of cloud-based near-line storage. This new “cold storage” for emerging data types will be very important to managing the cost of cloud data.
Figure 4 is an example of the simplicity of the Spectrum Archive data flow as data is ingested. Files are stored in a global namespace within the Spectrum Scale system. Based on business or IT policy, files are migrated to the tape library while remaining visible to users and applications.
An example of this already exists in one academic institution that has around 12 PB of disk to house research data and is predicting growth to nearly 50 PB within two years. The concern is that much of the data is rarely accessed, if at all. Maintaining this data in a disk pool of this size isn’t a sustainable or defensible model. The institution plans to use analytics to determine which data should remain on disk and which data should be moved to active archive on tape.
New Media of Choice
Tape is far from dead and is becoming the new media of choice for cold data storage. It’s time to brush the dust off those tape management manuals and determine how this disruptive force in long-term storage can benefit your organization.
To find out more about IBM System Storage and software-defined storage that includes tape, visit IBM Spectrum Storage.