Index-Controlled Partitioning vs. Table-Controlled Partitioning
Note: This is the final article of a two-part series. The first article appeared in the July/August issue.
In the previous article "DB2 TableSpace - Is it Time to Redesign?", I outlined the available table space and index design options associated with DB2* V8 and V9. These latest two releases offered new flexibility for storage-driven design options with features like table-controlled partitioning. It's important for database administrators to not only be comfortable with such new and improved offerings, but also how to best use them to meet business requirements.
In this article, we'll use the basic knowledge of DB2 and its new updates covered in the previous article and apply it to specific examples. Let's examine two case studies to show how the limitations in index-controlled partitioning impacted application designs and how these applications are designed today to take advantage of table-controlled partitioning to improve performance, availability and maintenance.
Before you begin the physical design you should understand the business requirements. This application will receive five billing files from five different billing systems with millions of daily transactions. Each day's data will be considered a billing cycle and you have to maintain 40 billing cycles. The billing system must process today's data and generate billing statements within 24 hours.
Meeting the requirement of processing this large data volume within 24 hours and automating the storage of 40 days of data requires teamwork.
Design for Partitioning
It was obvious that tables needed to be stored in an index-controlled partitioning table space. With the requirement of 40 days it seemed like there would be 40 partitions. What wasn't obvious is how to automate the data loading into the correct partition each day and purge the next day's partition without dropping and recreating the table space.
To support this I manufactured a partitioning key column that contained the correct partition number for the data to be loaded and processed daily. I took the current date the data was received and MOD this date by 40 to generate a partition number between 0 and 39. The MOD function is zero-based so physical partition 1 has a limit key of 0 and partition 40 had a limit key of 39. Before it starts processing each program generates a value for PARTNO and processes only the data for that date. To see how the MOD function works I've written the following SQL to simulate what these programs did internally.
SELECT DAYS(CURRENT DATE) AS CURRENT_DAYS
, DAYS(CURRENT DATE) (DAYS(CURRENT DATE)/40)*40 AS MOD_40
This design was prototyped and tested the concept of receiving and loading a day's worth of data and generating the bills. The program was written to generate the correct partition number to purge tomorrow's data and run a LOAD REPLACE with an empty dataset against the partition I wanted to clean out. The DDL located in Code Samples 1 through 3 is an example of how tables are built for this system.
The PARTNO is the generated column using the MOD 40 function. Because the MOD function is zero-based, the first partition limit key value on the create index is 00000 and partition 40 value is 00039. This design allows for the same account data to come in on different days. In this system an account can have more than one billing CYCLE.
Search our new 2013 Buyer's Guide.
Administrator | Real-world samples provide valuable lessons.
Cover Story | The latest DB2 for z/OS releases provide additional table space and index design options.