Marin Komadina

Procedure for Adding New Database Partition

This system performed very well until we doubled the data volume, which resulted in increased CPU and memory utilization. This can be solved by adding more memory and CPUs. After that, we are going to create a new database partition on the added hardware and redistribute the user data across all partitions.

Hardware upgrade

We upgraded the hardware with two additional CPUs, 4 GB of RAM, and additional disk space for the new database partition.

Add new database partition

There are both online and offline methods for adding a new database partition to an existing database. Since in both cases we need to restart the database to enable full read-write access to the new partition, we are going to use the offline method. When we add a new partition, a temporary tablespace is automatically created on it, and we need to decide where we want its containers to be located. The possibilities for the temporary tablespace are:

  • NO INDICATION will use the temporary tablespace container definition from node 0
  • LIKE NODE will use temporary tablespaces with containers similar to what we already have on another partition
  • WITHOUT TABLESPACES will create a partition without temporary tablespace containers

NO INDICATION is the default and the option we will use. Now we are ready to add the new partition (2):

$ db2stop 
02-10-2003 16:51:13     0   0   SQL1064N  DB2STOP processing was successful.
02-10-2003 16:51:14     1   0   SQL1064N  DB2STOP processing was successful.
SQL1064N  DB2STOP processing was successful.

To register the new partition (2), we need to change the partition configuration file:

$ vi db2nodes.cfg
2 ARTIST0A1 2  -> new partition (2)
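Each db2nodes.cfg entry holds the partition number, the hostname, and the logical port, with an optional netname as a fourth field. As a quick illustration only (not a DB2 tool), here is a minimal Python sketch that parses such entries:

```python
# Minimal sketch: parse db2nodes.cfg lines into (node, host, port, netname)
# tuples.  Field layout assumed: nodenum hostname logical-port [netname];
# the netname column is optional and often absent.

def parse_db2nodes(text):
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        node = int(fields[0])
        host = fields[1]
        port = int(fields[2]) if len(fields) > 2 else 0
        netname = fields[3] if len(fields) > 3 else None
        entries.append((node, host, port, netname))
    return entries

cfg = """\
0 ARTIST0A1 0
1 ARTIST0A1 1
2 ARTIST0A1 2
"""
print(parse_db2nodes(cfg))
# -> [(0, 'ARTIST0A1', 0, None), (1, 'ARTIST0A1', 1, None), (2, 'ARTIST0A1', 2, None)]
```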

On virtual node 2, we will separately start and create the partition (2):

$ export DB2NODE=2

$ db2start nodenum 2
02-10-2003 17:08:59     2   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.

$ db2 add node 
DB20000I  The ADD NODE command completed successfully.

The newly created partition has only a temporary tablespace associated with it:

db2 => list tablespaces
           Tablespaces for Current Database
 Tablespace ID                        = 1
 Name                                 = TEMPSPACE1
 Type                                 = System managed space
 Contents                             = System Temporary data
 State                                = 0x0000
   Detailed explanation:

db2 => list tablespace containers for 1
             Tablespace Containers for Tablespace 1
 Container ID                         = 0
 Name                                 = /export/home/db2inst1/db2inst1/NODE0002/SQL00001/SQLT0001.0
 Type                                 = Path

Next we will stop partition 2 and start the whole database:

$ db2stop nodenum 2
02-10-2003 17:12:16     2   0   SQL1064N  DB2STOP processing was successful.
SQL1064N  DB2STOP processing was successful.

$ export DB2NODE=0
$ db2start
02-10-2003 17:12:56     1   0   SQL1063N  DB2START processing was successful.
02-10-2003 17:12:57     0   0   SQL1063N  DB2START processing was successful.
02-10-2003 17:12:57     2   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.

The next step is to extend node group GLOBAL to the new partition:

db2 => alter nodegroup global add node (2)
SQL1759W  Redistribute nodegroup is required to change data partitioning for 
objects in nodegroup "GLOBAL" to include some added nodes or exclude some 
dropped nodes.  SQLSTATE=01618

db2 => select * from sysibm.sysnodegroupdef
NGNAME             NODENUM IN_USE
------------------ ------- ------
IBMCATGROUP              0 Y
IBMDEFAULTGROUP          1 Y
IBMDEFAULTGROUP          0 Y
LOCAL                    0 Y
GLOBAL                   2 A   -> active pending status
GLOBAL                   1 Y

Node group GLOBAL is now distributed over two partitions, 1 and 2, and has active pending status (A). Objects in node group GLOBAL are still only on partition 1.

Note: Node group IBMDEFAULTGROUP is the default node group for new database objects. It is recommended to extend this node group over all partitions; however, we are not going to do that in our test.

Redistribute data

Data redistribution over all defined partitions in the node group is the last step. It must be run from the catalog node. We are going to distribute the data uniformly using the default hash function. The other distribution methods (using a distribution file or a target partition map) require better knowledge of the application and data than we have in this test.
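Conceptually, a uniform redistribution assigns the 4096 entries of the node group's partitioning map round-robin across the member partitions, so each partition owns a nearly equal share of hash buckets. A simplified Python sketch of this idea (not DB2 internals):

```python
# Illustrative sketch only, NOT DB2's actual implementation: a uniform
# redistribution spreads the 4096 partitioning-map entries round-robin
# over the nodes in the node group.

SQL_PMAP_SIZE = 4096  # a DB2 partitioning map has 4096 entries

def uniform_partition_map(nodes):
    """Assign the 4096 map entries evenly over the given node numbers."""
    return [nodes[i % len(nodes)] for i in range(SQL_PMAP_SIZE)]

pmap = uniform_partition_map([1, 2])
print(pmap[:8])                       # [1, 2, 1, 2, 1, 2, 1, 2]
print(pmap.count(1), pmap.count(2))   # 2048 2048 -> equal shares
```

With two partitions the map simply alternates, which matches the db2gpmap output shown later in this article.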

db2 => redistribute nodegroup GLOBAL uniform
DB20000I  The REDISTRIBUTE NODEGROUP command completed successfully.

Redistribution will not succeed for tables without a partitioning key. The partitioning key itself, however, does not need to be unique.

The redistribution log is located under ~/sqllib/redist:

                      Data Redistribution Utility
The following options have been specified:
Nodegroup name                  : GLOBAL
Data Redistribution option      : U
Redistribute Nodegroup          : uniformly
No. of nodes to be added        : 1
List of nodes to be added       : 2
No. of nodes to be dropped      : 0
List of nodes to be dropped     :
Delete will be done in parallel with the insert.
The execution of the Data Redistribution operation on:
 Begun at   Ended at   Table
 17.38.07   17.38.08   DB2INST1.PART_TEST
-- All tables in the nodegroup have been successfully redistributed. --

Successfully finished data redistribution resets the status flag for the node group:

NGNAME             NODENUM IN_USE
------------------ ------- ------
IBMCATGROUP              0 Y
IBMDEFAULTGROUP          1 Y
LOCAL                    0 Y
GLOBAL                   2 Y
GLOBAL                   1 Y

The PART_TEST table has its records equally distributed over the partitions in the node group:

----------- -----------
          1        2560
          2        2560

Data distribution can also be checked by reading the partition map for table DB2INST1.PART_TEST:

>> db2gpmap -d ARTIST -m tab_distribution.txt -t DB2INST1.PART_TEST
Retrieving the partition map ID using table DB2INST1.PART_TEST
1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
1 2 1 2 1 2 1 2 1 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1
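A row finds its home partition in two steps: the partitioning key is hashed to a bucket in the range 0..4095, and that bucket indexes into the partitioning map. The sketch below illustrates the lookup with CRC32 as a stand-in hash (DB2 uses its own hashing function, so the bucket values here are purely illustrative):

```python
# Sketch of the two-step lookup: hash key -> bucket (0..4095) -> map entry.
# zlib.crc32 is used as a deterministic stand-in hash; it is NOT the hash
# function DB2 actually applies to partitioning keys.
import zlib

SQL_PMAP_SIZE = 4096

def partition_for(key, pmap):
    bucket = zlib.crc32(key.encode()) % SQL_PMAP_SIZE
    return pmap[bucket]

# Alternating map over partitions 1 and 2, like the db2gpmap output above.
pmap = [1, 2] * (SQL_PMAP_SIZE // 2)

counts = {1: 0, 2: 0}
for i in range(5120):
    counts[partition_for("row-%d" % i, pmap)] += 1
print(counts)  # a roughly even split between partitions 1 and 2
```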

For performance reasons, IBM suggests updating statistics on the redistributed objects after redistribution:

db2 => runstats on table db2inst1.part_test with distribution and detailed indexes all shrlevel change
DB20000I  The RUNSTATS command completed successfully.



The main advantage of the DB2 UDB EEE database is its scalability. When you have more users, more data, and more complex queries, just add additional hardware and DB2 will scale. Measurements for different database operations (load of all tables, index creation, runstats, single and multiple concurrent delete and insert operations) on a duplicated hardware configuration gave us an average scalability factor of 0.98. That is almost ideal; however, there are still some points IBM could improve on in the future:

  • Implement memory sharing between multiple partitions on the same physical node
  • Solve the problem of expensive selects or updates on non-partitioning-key columns
  • Limit interprocess communication between partitions in multipartition installations
  • Implement partitioning models other than hash partitioning
  • Provide real online database expansion (after adding a new partition, full read-write access to it is possible only after a database restart)
  • Enable non-blocking data redistribution after adding or dropping partitions

» See All Articles by Columnist Marin Komadina