Unique partitioning scenario - roll in/roll out with selective retention
In a case like this, we can think of having a table with just 365 partitions (called P1, P2, ..., P365 say) and have a small mapping table which stores the mapping between transaction date and partition. In other words, have a mapping table such as:
Trans_Date Part_No
23-2-09 54
24-2-09 55
25-2-09 56
25-2-08 56
26-2-09 57
27-2-09 58
27-2-08 58
As we can see, for dates where we want to retain transactions, we do so by storing them in the corresponding partition for the current year. Thus, data for 25th Feb for ALL years will be stored in partition number 56, data for 26th Feb for ALL years will be stored in partition number 57 and so on.
Finally, this table has to be joined to the main transaction table (this can be hidden inside a view) for all queries to assist in partition elimination.
DB2 example follows:
Create table TP_Sales (
Trans_Date timestamp,
Partition_No int,
Trans_No int,
Trans_Amt decimal(5,2),
Retain_Flag int)
Partition by Range(Partition_No)
(Starting 1 Ending 366 every 1)
Create table Part_Hash (
Trans_Date timestamp not null primary key,
Partition_No int)
// Set up data
INSERT INTO PART_HASH
SELECT trans_date,
dayofyear(trans_date)
FROM
(SELECT to_date('31/12/2007','DD/MM/YYYY') + rn days as trans_date
FROM (SELECT row_number() over() as rn
FROM sysibm.columns
FETCH FIRST 730 ROWS ONLY) x
) y
// 10 rows for every day
INSERT INTO TP_SALES
SELECT trans_date,
dayofyear(trans_date),
trans_no,
trans_amt,
1
FROM
(SELECT to_date('31/12/2007','DD/MM/YYYY') + rn days as trans_date,
x2.trans_no,
x2.trans_amt
FROM (SELECT row_number() over() as rn
FROM sysibm.columns
FETCH FIRST 730 ROWS ONLY) x,
(SELECT row_number() over()+1000 as trans_no
,10.5 as trans_amt
FROM sysibm.columns
FETCH FIRST 10 ROWS ONLY) x2
) y
// Statistics
runstats on table tp_sales;
// We will need to create a view to hide the partition number from end users
CREATE VIEW SALES AS
SELECT A.TRANS_DATE,
TRANS_NO,
TRANS_AMT,
RETAIN_FLAG
FROM TP_SALES A JOIN PART_HASH B
ON A.TRANS_DATE = B.TRANS_DATE
AND A.PARTITION_NO = B.PARTITION_NO
Finally, a few typical queries to see if partition elimination is happening:
a. Select sum(trans_amt) From Sales where Trans_Date = current timestamp
Only some partitions being picked up (not known at compile time though)
b. Select max(trans_amt) From Sales where Trans_Date between (current date - 10 days) and current date
Only some partitions being picked up (not known at compile time though)
c. Select sum(trans_amt) From Sales where Trans_Date between (current date - 100 days ) and (current date - 40 days)
Only some partitions being picked up (not known at compile time though)
d. Select max(trans_amt) From TP_Sales where Partition_no between 45 and 55
As expected, only partitions 45 to 55 picked up.
