AIX > Administrator > DB2

What's New with DB2

In this article, which was originally published on IBM's DB2* Developer Domain, Matt Huras offers his perspective on DB2 Universal Database* (UDB) Version 8 for UNIX*, Windows* and Linux*. Huras is a lead architect and development manager of the data management group in DB2 development at the IBM Toronto Laboratory. His focus is on the DB2 UDB storage engine, including indexing, data management, buffering, OS services, utilities and high availability. Click here for details about DB2 v8.

Q: What are some highlights of DB2 v8?

A: There's a lot in this release, more than I can cover. The areas I'll focus on can be grouped into three major categories:

  • The first category covers high-availability features. This is an extremely important focal point for our customers as DB2 becomes the enterprise server of choice. Customers need data to be available almost constantly, so we made some enhancements to make that possible.
  • The second enhancement group focuses on making the administration task easier. It includes improved administration tools, performance advice and other usability and serviceability enhancements. Again, this is of critical importance to our customers who need to get more productivity out of their administrators by allowing them to focus on expanding the business and less on day-to-day tasks that can more easily be automated.
  • The third category includes continuous improvement in the bread-and-butter areas of performance and scalability. Included here are such things as the DB2 Common Client, key infrastructure improvements to improve scalability and memory usage, null and default value compression, a new index type and something that I think will please a lot of people, multidimensional clustering.

Q: What is multidimensional clustering?

A: Multidimensional clustering is a unique new feature that allows data to be clustered according to several different "dimensions" simultaneously. Although I'm using the term "dimension" in the data-warehousing sense, (as a way to view data from different dimensions such as by region or by year) this feature isn't limited to data warehousing. It's like defining multiple clustering indexes on the same table so that range scans on all those indexes get optimal disk access patterns. As you know, this isn't possible now but will be through multidimensional clustering with Version 8.

Q: It sounds interesting, but I'm not sure I completely understand. Could you expand on that a bit?

A: It's really a lot easier to explain this with pictures. If you look at Figure 1, you can see how clustering works today. All indexes are record-based, which means that clustering can only occur in one dimension, such as only by date or only by region. Even with the one-dimension clustering, there's no guarantee that you'll retain clustering, because it tends to degrade after the free space is exhausted (at least until you do the next reorganization of the data). These effects are shown in the diagram. The Region index is defined as the clustering index; you can see that most of its record IDs are clustered, but some aren't. The Year index can't be clustered, and this is certainly reflected in its record IDs.

As a result, a query such as SELECT * FROM T1 WHERE YEAR=99 may result in a significant amount of random and/ or sequential I/O and would typically pull in pages that can contain a mix of qualifying and unqualifying rows.


comments powered by Disqus



2019 Solutions Edition

A Comprehensive Online Buyer's Guide to Solutions, Services and Education.

Data Basics, Part 2

More tips for avoiding common DB2 performance mistakes

IBM Systems Magazine Subscribe Box Read Now Link Subscribe Now Link iPad App Google Play Store
IBMi News Sign Up Today! Past News Letters