Tuesday, July 5, 2011

Table Partition

Table partitioning is a data organization scheme in which table data is divided across multiple data partitions or ranges according to values in a table column.

Benefits of table partitioning

There are numerous benefits of table partitioning:
  • To improve the scalability and manageability of large tables and tables in Database and Data Warehouse
  • Database and Data Warehouse that would benefit from easier roll-in and roll-out of table data
  • Furthermore, if a large table exists on a system with multiple CPUs, partitioning the table can lead to better performance through parallel operations.
  • A table with varying access patterns might be a concern for performance and availability when different sets of rows within the table have different usage patterns.
 The steps for creating a partitioned table include the following:
  1. Create a partition function to specify how a table or index that uses the function can be partitioned.
  2. Create a partition scheme to specify the placement of the partitions of a partition function on filegroups.
  3. Create a table or index using the partition scheme.
 
Below are the steps to creation Horizontal Table Partition

-- Create the partition function

CREATE PARTITION FUNCTION PartitionFunctionMonthly (int)
AS RANGE RIGHT
FOR VALUES (20100101, 20100201, 20100301)
GO

-- Add the partition scheme
CREATE PARTITION SCHEME PartitionSchemaMonthly
AS PARTITION PartitionFunctionMonthly
ALL TO ( [PRIMARY] )
GO

 -- Create a simple table
CREATE TABLE PartitionTable (
   DateKey int NOT NULL,
   CustomerKey int NOT NULL,
   SalesAmt money,
CONSTRAINT PKPartitionTable PRIMARY KEY NONCLUSTERED
   (DateKey, CustomerKey)
)
ON PartitionSchemaMonthly(DateKey)
GO

------------------------------------
-- Unit Testing of Partitions
------------------------------------
-- Add some rows
INSERT INTO PartitionTable (DateKey, CustomerKey, SalesAmt)
SELECT 20091201, 1, 5000 UNION ALL
SELECT 20100101, 2, 3000 UNION ALL
SELECT 20100215, 7, 6000 UNION ALL
SELECT 20100331, 5, 3000 UNION ALL
SELECT 20100415, 8, 6000
GO

-- A query accesses the entire table, exactly as you'd expect.
SELECT * FROM PartitionTable
GO

--Query partition contents
SELECT
$partition.PartitionFunctionMonthly(DateKey) AS [Partition#],
COUNT(*) AS RowCount,
Min(DateKey) AS MinDate,
Max(DateKey) AS MaxDate
FROM PartitionTable
GROUP BY $partition.PartitionFunctionMonthly(DateKey)
ORDER BY [Partition#]
GO