MySQL Partitioning Notes

This is continued from the mysql/rsyslog project. The tables in question is described below. The goal is to have a partition for each day or week and then to drop old partitions after ~6 weeks as it will amazingly faster to remove old data this way. One thing to keep in mind is that this may significantly increase the number of file descritpors that MySQL will need depending on the number of partitions you create so, be sure to increase those if you start to notice any problems.

You can determine if your implementation of MySQL supports partitioning with 1 of the following statement:

SHOW VARIABLES LIKE '%partition%';
SHOW PLUGINS;

Default Rsyslog Table Structure
CREATE DATABASE Syslog;
USE Syslog;
CREATE TABLE SystemEvents
(
        ID int unsigned not null auto_increment primary key,
        CustomerID bigint,
        ReceivedAt datetime NULL,
        DeviceReportedTime datetime NULL,
        Facility smallint NULL,
        Priority smallint NULL,
        FromHost varchar(60) NULL,
        Message text,
        NTSeverity int NULL,
        Importance int NULL,
        EventSource varchar(60),
        EventUser varchar(60) NULL,
        EventCategory int NULL,
        EventID int NULL,
        EventBinaryData text NULL,
        MaxAvailable int NULL,
        CurrUsage int NULL,
        MinUsage int NULL,
        MaxUsage int NULL,
        InfoUnitID int NULL ,
        SysLogTag varchar(60),
        EventLogType varchar(60),
        GenericFileName VarChar(60),
        SystemID int NULL
);
CREATE TABLE SystemEventsProperties
(
        ID int unsigned not null auto_increment primary key,
        SystemEventID int NULL ,
        ParamName varchar(255) NULL ,
        ParamValue text NULL
);
mysql> describe SystemEvents;
+--------------------+------------------+------+-----+---------+----------------+
| Field              | Type             | Null | Key | Default | Extra          |
+--------------------+------------------+------+-----+---------+----------------+
| ID                 | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| CustomerID         | bigint(20)       | YES  |     | NULL    |                |
| ReceivedAt         | datetime         | YES  |     | NULL    |                |
| DeviceReportedTime | datetime         | YES  |     | NULL    |                |
| Facility           | smallint(6)      | YES  |     | NULL    |                |
| Priority           | smallint(6)      | YES  |     | NULL    |                |
| FromHost           | varchar(60)      | YES  |     | NULL    |                |
| Message            | text             | YES  |     | NULL    |                |
| NTSeverity         | int(11)          | YES  |     | NULL    |                |
| Importance         | int(11)          | YES  |     | NULL    |                |
| EventSource        | varchar(60)      | YES  |     | NULL    |                |
| EventUser          | varchar(60)      | YES  |     | NULL    |                |
| EventCategory      | int(11)          | YES  |     | NULL    |                |
| EventID            | int(11)          | YES  |     | NULL    |                |
| EventBinaryData    | text             | YES  |     | NULL    |                |
| MaxAvailable       | int(11)          | YES  |     | NULL    |                |
| CurrUsage          | int(11)          | YES  |     | NULL    |                |
| MinUsage           | int(11)          | YES  |     | NULL    |                |
| MaxUsage           | int(11)          | YES  |     | NULL    |                |
| InfoUnitID         | int(11)          | YES  |     | NULL    |                |
| SysLogTag          | varchar(60)      | YES  |     | NULL    |                |
| EventLogType       | varchar(60)      | YES  |     | NULL    |                |
| GenericFileName    | varchar(60)      | YES  |     | NULL    |                |
| SystemID           | int(11)          | YES  |     | NULL    |                |
| checksum           | int(11)          | NO   |     | 0       |                |
+--------------------+------------------+------+-----+---------+----------------+
25 rows in set (0.00 sec)

mysql> describe SystemEventsProperties;
+---------------+------------------+------+-----+---------+----------------+
| Field         | Type             | Null | Key | Default | Extra          |
+---------------+------------------+------+-----+---------+----------------+
| ID            | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| SystemEventID | int(11)          | YES  |     | NULL    |                |
| ParamName     | varchar(255)     | YES  |     | NULL    |                |
| ParamValue    | text             | YES  |     | NULL    |                |
+---------------+------------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

Normally, you would create the partitions when you defined the tables:


CREATE DATABASE Syslog;
USE Syslog;
CREATE TABLE SystemEvents
(
        ID int unsigned not null auto_increment primary key, <- Primary keys may be an issue with partitioning 
        CustomerID bigint,
        ReceivedAt datetime NULL,
        DeviceReportedTime datetime NULL,
	[...]
        SysLogTag varchar(60),
        EventLogType varchar(60),
        GenericFileName VarChar(60),
        SystemID int NULL
)
  PARTITION BY RANGE( DATE(DeviceReportedTime)+0 ) ( <= cannot partition CSV or MERGE tables
	PARTITION p0 VALUES LESS THAN(20101221),
	PARTITION p1 VALUES LESS THAN(20101222),
	PARTITION p2 VALUES LESS THAN(20101223),
	PARTITION p3 VALUES LESS THAN(20101224),
	PARTITION p4 VALUES LESS THAN(MAXVALUE)
);

Also see DATE, TIME, TIMESTAMP, DATETIME, QUARTER, MONTH, WEEK and YEARWEEK.
A list of Date and Time functions is available at dev.mysql.com.

Adding partitions to an existing table

If a primary key and/or unique key is defined, then the fields used for partitioning must be included in the primary/unique key.

ALTER TABLE SystemEvents DROP PRIMARY KEY; ALTER TABLE SystemEvents ADD PRIMARY KEY (ID,DeviceReportedTime);

You can only partition on integers when using RANGE, HASH and LIST (KEYS should work) with MySQL 5.1 (See Bug #13436). Conveniently, I wanted to partition by day using DATE()+0 but, MySQL would not let me partition on this (HOUR() was fine for some reason). I ended up using TO_DAYS.

This example partitions the existing Syslog table by RANGE using LESS THAN values: ALTER TABLE SystemEvents PARTITION BY RANGE(YEARWEEK(DeviceReportedTime)) ( PARTITION p201051 VALUES LESS THAN(201051), PARTITION p201052 VALUES LESS THAN(201052), PARTITION p201101 VALUES LESS THAN(MAXVALUE)
);
This example partitions the existing Syslog table based on the hour of day the event was recorded: ALTER TABLE SystemEvents PARTITION BY LIST(HOUR(DeviceReportedTime)) ( PARTITION p0 VALUES IN(0), PARTITION p1 VALUES IN(1), [...] PARTITION p22 VALUES IN(22), PARTITION p23 VALUES IN(23) ); This example was supposed to partition the table by calendar days using YYYYMMDD instead of YYYY-MM-DD format to meet the integer requirement but, it did not work: ALTER TABLE SystemEvents PARTITION BY LIST(DATE(DeviceReportedTime)+0) ( PARTITION p20100101 VALUES IN(20100101), PARTITION p20100102 VALUES IN(20100102), [...] );

Adding additonal partitions as needed

ALTER TABLE SystemEvents ADD PARTITION ( PARTITION p201102 VALUES LESS THAN(MAXVALUE)
);
ALTER TABLE SystemEvents ADD PARTITION ( PARTITION p201102 VALUES IN(201102)
);

Note: ADD PARTITION and DROP PARTITION do not currently support IF [NOT] EXISTS. It is also not possible to rename a partition or a partitioned table. Instead, if you wish to rename a partition, you must drop and re-create the partition; if you wish to rename a partitioned table, you must instead drop all partitions, rename the table, and then add back the partitions that were dropped.

Deleting partitions

The DROP PARTITION command is DDL in nature, executes almost instantly and greatly outpaces the standard DELETE command. In case you're wondering, the creation of an index on the date column in the non-partitioned SystemEvents table didn't help the DELETE run any faster (at least on my test box).

ALTER TABLE SystemEvents DROP PARTITION p20101221; The DROP PARTITION statement above is much faster than the DELETE example below: DELETE FROM SystemEvents WHERE DeviceReportedTime > DATE '2010-12-20' AND DeviceReportedTime < DATE '2010-12-22';

Re-combine partitions

You can also combine existing partitions into one partition like so:

ALTER TABLE SystemEvents PARTITION BY KEY(id) PARTITIONS 1;

Or remove partitioning completely (no data loss) from the table with:

ALTER TABLE SystemEvents REMOVE PARTITIONING;

Show partitions

You can show existing partitions with either of the following:

SELECT partition_name FROM information_schema.partitions WHERE table_name ='SystemEvents'; EXPLAIN PARTITIONS SELECT * FROM SystemEvents;

You can also check partitions with the following: ALTER TABLE SystemEvents CHECK PARTITION p1; Which will retun one of the following depending on wether or not the partition exists

+---------------------+-------+----------+----------+
| Table               | Op    | Msg_type | Msg_text |
+---------------------+-------+----------+----------+
| Syslog.SystemEvents | check | status   | OK       |
+---------------------+-------+----------+----------+
1 row in set (0.00 sec)
- OR -
+---------------------+-------+----------+----------------------------------------------------+
| Table               | Op    | Msg_type | Msg_text                                           |
+---------------------+-------+----------+----------------------------------------------------+
| Syslog.SystemEvents | check | error    | Error in list of partitions to Syslog.SystemEvents |
+---------------------+-------+----------+----------------------------------------------------+
1 row in set (0.00 sec)

This may be useful as a workaround until partitioning supports IF EXISTS.
Also see http://dev.mysql.com/doc/refman/5.1/en/partitioning-info.html

Partition management statements

ALTER TABLE SystemEvents DROP PARTITION p0;
ALTER TABLE SystemEvents ADD PARTITION (PARTITION p1);
ALTER TABLE SystemEvents REORGANIZE PARTITION ...;
ALTER TABLE SystemEvents COALESCE PARTITION p0;
ALTER TABLE SystemEvents REBUILD PARTITION p0;
ALTER TABLE SystemEvents OPTIMIZE PARTITION p0;
ALTER TABLE SystemEvents CHECK PARTITION p0;
ALTER TABLE SystemEvents ANALYZE PARTITION p0;
ALTER TABLE SystemEvents REPAIR PARTITION p0;
ALTER TABLE SystemEvents TRUNCATE PARTITION p0;

Rotate Partitions

Some products, such as LogZilla, use the MySQL event scheduler to rotate partitions. You can use the commands below to check if the event scheduler is supported and then to turn it on. Note: You will also need to enable the event scheduler in your /etc/mysql/my.cnf file so that it persists across restarts.

SELECT @@event_scheduler;
SET GLOBAL event_scheduler = 1;
Then you can create an event using the CREATE EVENT syntax (IN PROGRESS - DO NOT USE): CREATE EVENT Rotate_SyslogEvents_Partitions ON SCHEDULE EVERY 1 DAY ON COMPLETION PRESERVE DO DECLARE OldPartition VARCHAR(9); DECLARE NewPartition VARCHAR(9); SELECT DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 6 WEEK), 'p%Y%m%d') INTO Old_Partition; SELECT DATE_FORMAT(DATE_ADD(CURDATE(), INTERVAL 1 DAY), 'p%Y%m%d') INTO New_Partition; ALTER TABLE SystemEvents DROP PARTITION Old_Partition; ALTER TABLE SystemEvents ADD PARTITION (PARTITION New_Partition); Events can be altered with ALTER EVENT, events can be dropepd with DROP EVENT and events can be viewed with: SELECT * FROM mysql.event\G

Script to rotate partitions

#!/bin/sh NEW=`date +"%k" --date="next hour" | sed 's/ *//'` OLD=`date +"%k" --date="6 hours ago" | sed 's/ *//'` echo "Creating p$NEW and deleting p$OLD" /usr/bin/mysql --batch -u user -psecret -DSyslog -f << eof ALTER TABLE SystemEvents ADD PARTITION ( PARTITION p$NEW VALUES IN($NEW)); ALTER TABLE SystemEvents DROP PARTITION p$OLD ; eof

The following is a longer version for daily rotation with some error checking:

#!/bin/bash # This script rotates the MySQL partitions for the SystemEvents Table in the # Syslog Databse. # # New partitions will be created in advance by the Check_Partitions functions. # Partitions older than 6 weeks (defined by $OLD variable) will be dropped by # the Delete_Partitions function # # Changing the rotation schedule will require that the exisitng partitions on # the SystemEvents Table be removed and re-defined. #-------------------------------------# # Global Variables # #-------------------------------------# # Define MySQL varaiables DATABASE=Syslog TABLE=SystemEvents USER=user PASS=secret # Define partition to drop by age in YYYYMMDD format OLD=$(date +"%Y%m%d" --date="6 weeks ago") # Define number of days ahead to create partitions AHEAD=3 #-------------------------------------# # Functions # #-------------------------------------# Create_Partition() { # This function creates new partitions for $TABLE Table # Requires date string for VALUE definition e.g. YYYYMMDD for daily rotation # Partions will be named by the letter "p" followed by the VALUE definition e.g. pYYYYMMDD if [ -z $1 ]; then echo "No parameters passed to Create_Partition function" else local NEW=$1 Convert-DATE-TO_DAYS $NEW /usr/bin/mysql --batch -u $USER -p$PASS -D$DATABASE \ -e "ALTER TABLE $TABLE ADD PARTITION (PARTITION p$NEW VALUES IN($TO_DAYS));" if [ $? != 0 ]; then echo "ERROR: Oh-oh, create partition p$NEW failed!" /usr/bin/logger -p user.crit "$0 - Could not create MySQL $DATABASE/$TABLE p$NEW partition" else /usr/bin/logger -p user.info "$0 - Created MySQL $DATABASE/$TABLE p$NEW partition" fi fi } Delete_Partition() { # Deletes old partitions for $TABLE Table # Requires date string for VALUE definition e.g. YYYYMMDD for daily rotation # Partions will be named by the letter "p" folowed by the VALUE definition e.g. pYYYYMMDD if [ -z $1 ]; then echo "No parameters passed to Delete_Partition function" else local OLD=$1 /usr/bin/mysql --batch -u $USER -p$PASS -D$DATABASE -e "ALTER TABLE $TABLE DROP PARTITION p$OLD;" if [ $? != 0 ]; then echo "ERROR: Oh-oh, drop partition p$OLD failed!" /usr/bin/logger -p user.warn "$0 - Could NOT drop MySQL $DATABASE/$TABLE p$OLD partition" else /usr/bin/logger -p user.info "$0 - Dropped MySQL $DATABASE/$TABLE p$OLD partition" fi fi } Define_Partitions() { # This defines the $ALL_PARTITIONS variable with a list of the existing partitions in CSV format # Expected string after first grep is something like: # <field name="partitions">p20101221,p20101222,p20101223</field> # This would not be necessary in this script if MySQL partitioning supported IF [NOT] EXISTS ALL_PARTITIONS=$(/usr/bin/mysql --batch -u $USER -p$PASS -D$DATABASE -X \ -e "EXPLAIN PARTITIONS SELECT * FROM $TABLE;" \ | /bin/grep '<field name="partitions">' \ | /bin/sed 's/^.*>\(p.*\)<.*$/\1/'` echo "ALL_PARTITIONS = $ALL_PARTITIONS" } Convert-DATE-TO_DAYS() { # Convert date string into MySQL TO_DAYS (Days since 000-00-00) if [ -z $1 ]; then echo "No DATE parameter passed to Convert-DATE-TO_DAYS function." exit 1 else local DATE=$1 TO_DAYS=$(/usr/bin/mysql --batch -u $USER -p$PASS --skip-column-names \ -e "SELECT TO_DAYS('$DATE');") #echo "DATE-TO_DAYS: $DATE = $TO_DAYS" fi } Convert-TO_DAYS-DATE() { # Convert MySQL TO_DAYS fucntion (Days since 000-00-00) into YYYYMMDD date string if [ -z $1 ]; then echo "No TO_DAYS parameter passed to Convert-TO_DAYS-DATE function." exit 1 else local TO_DAYS=$1 DATE=$(date +%Y%m%d --date="0000-01-01 $TO_DAYS days") #echo "TO_DAYS-DATE: $DATE = $TO_DAYS" fi } #-------------------------------------# # MAIN # #-------------------------------------# /bin/echo "Rotating partitions for MySQL $TABLE Table in $DATABASE Database." /bin/echo "Partitions will be created $AHEAD days in advance and old partitions" /bin/echo -e "Will be dropped.\n" Define_Partitions /bin/echo -e "Current Partitions: $ALL_PARTITIONS\n" | /usr/bin/fold if [ -z $AHEAD ]; then echo "The number of days ahead to create partitions (AHEAD) is not defined" exit 1 else # Create new partitions if they do not exist for DAYS in $(seq 0 $AHEAD); do #local NEW=`date +"%Y%m%d" --date="$DAYS days"` NEW=`/bin/date +"%k" --date="$DAYS hours"` /bin/echo $ALL_PARTITIONS | /bin/grep p$NEW > /dev/null if [ $? != 0 ]; then /bin/echo "Partition p$NEW check: CREATING" Create_Partition $NEW else /bin/echo "Partition p$NEW check: OK" fi done fi if [ -z $OLD ]; then echo "The maximum age for partitions (OLD) is not defined" exit 1 else # This deletes the old partition matching $OLD if it exists /bin/echo $ALL_PARTITIONS | /bin/grep p$OLD > /dev/null if [ $? != 0 ]; then /bin/echo "Partition p$OLD check: DROPPED (OK)" else /bin/echo "Partition p$OLD check: DROPPING" Delete_Partition $OLD fi fi Define_Partitions /bin/echo -e "\nCurrent Partitions: $ALL_PARTITIONS" | /usr/bin/fold

Partitioning and File Descriptors

A MySQL database creates a number of different files in the MySQL data directory ( typically in /var/lib/mysql or /data/mysql).

File ExtensionPurpose
.frmTable definition
.MYDTable data
.MYITable indices

Partitioning a table will create both an .MYD and a .MYI for each partition e.g. my_table#P#my_partition.MYI and my_table#P#my_partition.MYD (at least for the MyISAM engine). Consequently, you are opening 3 different files for each partition.

It is possible that you could start seeing errors similar to this if you exceed the maximum amount of file descriptors allowed on your system:

ERROR 23 (HY000) at line 1: Out of resources when opening file './Syslog/SystemEvents#P#p20110116.MYD' (Errcode: 24)

You can check the number of open files with "lsof | grep mysqld" from the command line (also see "ulimit -a" or, on gentoo, "cat /proc/sys/fs/file-nr"). From the mysql console you can also run "STATUS;" to see open files or "SHOW VARIABLES LIKE '%file%';" and check the open files limit.

Add the following line to /etc/mysql/my.cnf which did increase the maximum file descriptors in mysql after restarting the service:

set-variable = open_files_limit=4096

Then following to increase the maximum number of file descriptors on the system:

host ~ # ulimit -n 5120

You may also be able to increase the maximum amount of file handles in /etc/security/limits.conf by adding:

mysql soft nofile 4096
mysql hard nofile 4096

Other ideas

We could copy the table every week to another e.g. SystemEvents -> SystemEvents_W1, W1 -> W2, etc. and then drop the W7 table. We could also name the tables after a date string and drop ones older htan so many weeks.

mysql> DROP TABLE System_Events_W6; mysql> RENAME TABLE IF NOT EXISTS SystemEvents_W5 TO SystemEvents_W6; [...] mysql> CREATE TABLE SystemEvents_W1 SELECT DATE(DeviceReportedTime) FROM SystemEvents WHERE DATE(DeviceReportedTime) >= '2010-12-20' AND DATE(DeviceReportedTime) <= '2010-12-21';

Links