The Evolution of Db2 Compression

November 21, 2017

You don't hear much about data compression these days, but recently I encountered a customer who was curious about it. He said his company had never used compression because of a long-held belief that the CPU overhead was too high, but he wondered whether the feature had improved and, if so, how you can determine which tables will benefit from compression.

This conversation made me wonder if other enterprises are operating under the misconception that compression drives up CPU costs. While this may have been true when data compression was first introduced (in Db2 Version 3, for the record), compression technology has improved greatly with each release. Today, Db2 data compression is performed in hardware, and the underlying instructions get faster with each new processor generation. So, it's fast now, and it keeps getting faster.

I thought it would be fun to check some old IBM Redbooks and Db2 manuals and track the evolution of data and index compression. So for the next few posts, I'll provide an overview of data compression capabilities for each of the past few Db2 releases.

This week we'll start with Db2 Version 8. Note that prior to V8, DBAs would typically consider compression only for very large table spaces. This was because Db2 manages compression for each table space or partition through a dictionary, and those dictionaries were stored in virtual storage below the 2-GB bar. With V8, compression dictionaries were allocated above the bar, eliminating that storage constraint and effectively opening up compression for use on many more table spaces.

The compression dictionary is loaded into virtual storage as each compressed table space or partition is opened, and it occupies that storage for as long as the data set remains open. A dictionary can take up to 64 KB per data set (that's 16 4-KB pages). For customers with a large number of compressed table spaces, the dictionaries could collectively consume as much as 500 MB. Therefore, moving them above the 2-GB bar provided significant virtual storage relief.

V8 also introduced support for up to 4,096 partitions for a single table. If all of those partitions were open and compressed, you'd have 4,096 compression dictionaries in memory for one table alone (see the quick math below). This was another driver for moving compression dictionaries above the 2-GB bar.
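
To put quick numbers on the dictionary storage discussed above, here's a back-of-the-envelope sketch in Python. The 64 KB-per-data-set and 4,096-partition figures come straight from the text; the 8,000-open-data-set scenario is simply an illustrative assumption that reproduces the 500 MB figure.

```python
# Back-of-the-envelope math for compression dictionary storage.
# The 64 KB-per-data-set and 4,096-partition figures are from the
# text above; the 8,000-data-set scenario is an illustrative
# assumption, not a number from any particular shop.

PAGE_SIZE_KB = 4           # 4-KB pages
PAGES_PER_DICTIONARY = 16  # a dictionary occupies up to 16 pages

dict_kb = PAGE_SIZE_KB * PAGES_PER_DICTIONARY   # 64 KB per data set

# A shop with many compressed table spaces and partitions open at once:
open_data_sets = 8_000
print(f"{open_data_sets} open data sets -> "
      f"{open_data_sets * dict_kb / 1024:.0f} MB of dictionaries")  # ~500 MB

# The V8 worst case for one table: all 4,096 partitions open and compressed:
partitions = 4_096
print(f"{partitions} partitions -> "
      f"{partitions * dict_kb / 1024:.0f} MB for a single table")   # 256 MB
```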

V8 was also the first release in which compression used standard 64-bit hardware compression instructions rather than a software implementation, which further reduced CPU overhead.

DBAs needed to be careful when using DSN1COPY or DSN1COMP with compressed objects. With DSN1COPY, the source and target objects had to be defined identically. DSN1COMP, meanwhile, read each row as-is: if the table had pending version changes, no attempt was made to convert the data to the latest version before estimating the savings that compression would deliver.
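
If you'd like a feel for compression savings without running DSN1COMP against the underlying data sets, here's a rough sketch. To be clear: zlib is not Db2's hardware Ziv-Lempel dictionary compression, and this is no substitute for DSN1COMP; it's just a ballpark off-host check of how repetitive (and therefore compressible) your row data is. The unload file name, sample size and helper function are all hypothetical.

```python
# Rough, off-host compressibility check for unloaded row data.
# zlib is NOT Db2's hardware Ziv-Lempel compression and this is NOT a
# substitute for DSN1COMP; it only gives a ballpark sense of how
# repetitive (and therefore compressible) the data is. The file name,
# sample size, and function are hypothetical.

import zlib

SAMPLE_ROWS = 20_000  # sample a subset of rows, in the spirit of DSN1COMP's ROWLIMIT

def estimated_savings_pct(unload_file: str) -> float:
    """Estimate the percentage of space saved by compressing sampled rows."""
    raw = bytearray()
    with open(unload_file, "rb") as f:
        for _, line in zip(range(SAMPLE_ROWS), f):
            raw.extend(line)
    if not raw:
        return 0.0
    compressed = zlib.compress(bytes(raw), 6)
    return 100.0 * (1 - len(compressed) / len(raw))

if __name__ == "__main__":
    # "customer_table.unload" is a hypothetical sequential unload file
    pct = estimated_savings_pct("customer_table.unload")
    print(f"Estimated space saved: {pct:.1f}%")
```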

Next week, I'll look at compression capabilities in Db2 Version 9.
