AutoPartition v4
AutoPartition allows tables to grow easily to large sizes by automatic partitioning management. This capability uses features of BDR such as low-conflict locking of creating and dropping partitions.
You can create new partitions regularly and then drop them when the data retention period expires.
BDR management is primarily accomplished by functions that can be called by SQL.
All functions in BDR are exposed in the bdr
schema. Unless you put it into
your search_path, you need to schema-qualify the name of each function.
Auto creation of partitions
bdr.autopartition()
creates or alters the definition of automatic
range partitioning for a table. If no definition exists, it's created.
Otherwise, later executions will alter the definition.
bdr.autopartition()
doesn't lock the actual table. It changes the
definition of when and how new partition maintenance actions take place.
PGD AutoPartition leverages underlying Postgres features that allow a partition to be attached or detached/dropped without locking the rest of the table (Autopartion currently only supports this when used with 2nd Quadrant Postgres 11).
An ERROR is raised if the table isn't RANGE partitioned or a multi-column partition key is used.
A new partition is added for every partition_increment
range of values, with
lower and upper bound partition_increment
apart. For tables with a partition
key of type timestamp
or date
, the partition_increment
must be a valid
constant of type interval
. For example, specifying 1 Day
causes a new
partition to be added each day, with partition bounds that are one day apart.
If the partition column is connected to a snowflakeid
, timeshard
, or ksuuid
sequence,
you must specify the partition_increment
as type interval
. Otherwise,
if the partition key is integer or numeric, then the partition_increment
must be a valid constant of the same datatype. For example, specifying
1000000
causes new partitions to be added every 1 million values.
If the table has no existing partition, then the specified
partition_initial_lowerbound
is used as the lower bound for the first
partition. If you don't specify partition_initial_lowerbound
, then the system
tries to derive its value from the partition column type and the specified
partition_increment
. For example, if partition_increment
is specified as 1 Day
,
then partition_initial_lowerbound
is set to CURRENT
DATE. If partition_increment
is specified as 1 Hour
, then
partition_initial_lowerbound
is set to the current hour of the current
date. The bounds for the subsequent partitions are set using the
partition_increment
value.
The system always tries to have a certain minimum number of advance partitions.
To decide whether to create new partitions, it uses the
specified partition_autocreate_expression
. This can be an expression that can be evaluated by SQL,
which is evaluated every time a check is performed. For example,
for a partitioned table on column type date
, if
partition_autocreate_expression
is specified as DATE_TRUNC('day',CURRENT_DATE)
,
partition_increment
is specified as 1 Day
and
minimum_advance_partitions
is specified as 2, then new partitions are
created until the upper bound of the last partition is less than
DATE_TRUNC('day', CURRENT_DATE) + '2 Days'::interval
.
The expression is evaluated each time the system checks for new partitions.
For a partitioned table on column type integer
, you can specify the
partition_autocreate_expression
as SELECT max(partcol) FROM
schema.partitioned_table
. The system then regularly checks if the maximum value of
the partitioned column is within the distance of minimum_advance_partitions * partition_increment
of the last partition's upper bound. Create an index on the partcol
so that the query runs efficiently.
If the partition_autocreate_expression
isn't specified for a partition table
on column type integer
, smallint
, or bigint
, then the system
sets it to max(partcol)
.
If the data_retention_period
is set, partitions are
dropped after this period. Partitions are dropped at the same time as new
partitions are added, to minimize locking. If this value isn't set, you must drop the partitions manually.
The data_retention_period
parameter is supported only for timestamp (and
related) based partitions. The period is calculated by considering the upper
bound of the partition. The partition is either migrated to the secondary
tablespace or dropped if either of the given period expires, relative to the
upper bound.
By default, AutoPartition manages partitions globally. In other words, when a
partition is created on one node, the same partition is also created on all
other nodes in the cluster. So all partitions are consistent and guaranteed to
be available. For this, AutoPartition makes use of Raft. You can change this behavior
by passing managed_locally
as true
. In that case, all partitions
are managed locally on each node. This is useful for the case when the
partitioned table isn't a replicated table and hence it might not be necessary
or even desirable to have all partitions on all nodes. For example, the
built-in bdr.conflict_history
table isn't a replicated table and is
managed by AutoPartition locally. Each node creates partitions for this table
locally and drops them once they are old enough.
You can't later change tables marked as managed_locally
to be managed
globally and vice versa.
Activities are performed only when the entry is marked enabled = on
.
You aren't expected to manually create or drop partitions for tables managed by AutoPartition. Doing so can make the AutoPartition metadata inconsistent and might cause it to fail.
Configure AutoPartition
The bdr.autopartition
function configures automatic partitioning of a table.
Synopsis
Parameters
relation
— Name or Oid of a table.partition_increment
— Interval or increment to next partition creation.partition_initial_lowerbound
— If the table has no partition, then the first partition with this lower bound andpartition_increment
apart upper bound is created.partition_autocreate_expression
— Used to detect if it's time to create new partitions.minimum_advance_partitions
— The system attempts to always have at leastminimum_advance_partitions
partitions.maximum_advance_partitions
— Number of partitions to be created in a single go once the number of advance partitions falls belowminimum_advance_partitions
.data_retention_period
— Interval until older partitions are dropped, if defined. This value must be greater thanmigrate_after_period
.managed_locally
— If true, then the partitions are managed locally.enabled
— Allows activity to be disabled or paused and later resumed or reenabled.
Examples
Daily partitions, keep data for one month:
Create five advance partitions when there are only two more partitions remaining (each partition can hold 1 billion orders):
Create one AutoPartition
Use bdr.autopartition_create_partition()
to create a standalone AutoPartition
on the parent table.
Synopsis
Parameters
relname
— Name or Oid of the parent table to attach to.partname
— Name of the new AutoPartition.lowerb
— The lower bound of the partition.upperb
— The upper bound of the partition.nodes
— List of nodes that the new partition resides on.
Stopping automatic creation of partitions
Use bdr.drop_autopartition()
to drop the auto-partitioning rule for the
given relation. All pending work items for the relation are deleted and no new
work items are created.
Parameters
relation
— Name or Oid of a table.
Drop one AutoPartition
Use bdr.autopartition_drop_partition
once a BDR AutoPartition table has been
made, as this function can specify single partitions to drop. If the partitioned
table was successfully dropped, the function returns true
.
Synopsis
Parameters
relname
— The name of the partitioned table to drop.
Notes
This places a DDL lock on the parent table, before using DROP TABLE on the chosen partition table.
Wait for partition creation
Use bdr.autopartition_wait_for_partitions()
to wait for the creation of
partitions on the local node. The function takes the partitioned table name and
a partition key column value and waits until the partition that holds that
value is created.
The function only waits for the partitions to be created locally. It doesn't guarantee that the partitions also exists on the remote nodes.
To wait for the partition to be created on all BDR nodes, use the
bdr.autopartition_wait_for_partitions_on_all_nodes()
function. This function
internally checks local as well as all remote nodes and waits until the
partition is created everywhere.
Synopsis
Parameters
relation
— Name or Oid of a table.bound
— Partition key column value.
Synopsis
Parameters
relation
— Name or Oid of a table.bound
— Partition key column value.
Find partition
Use the bdr.autopartition_find_partition()
function to find the partition for the
given partition key value. If partition to hold that value doesn't exist, then
the function returns NULL. Otherwise Oid of the partition is returned.
Synopsis
Parameters
relname
— Name of the partitioned table.searchkey
— Partition key value to search.
Enable or disable AutoPartitioning
Use bdr.autopartition_enable()
to enable AutoPartitioning on the given table.
If AutoPartitioning is already enabled, then no action occurs. Similarly, use
bdr.autopartition_disable()
to disable AutoPartitioning on the given table.
Synopsis
Parameters
relname
— Name of the relation to enable AutoPartitioning.
Synopsis
Parameters
relname
— Name of the relation to disable AutoPartitioning.
Synopsis
Return the id
of the last workitem successfully completed on all nodes in the
cluster.
Check AutoPartition workers
From using the bdr.autopartition_work_queue_check_status
function, you can
see the status of the background workers that are doing their job to maintain
AutoPartitions.
The workers can be seen through these views:
autopartition_work_queue_local_status
autopartition_work_queue_global_status
Synopsis
Parameters
workid
— The key of the AutoPartition worker.local
— Check the local status only.
Notes
AutoPartition workers are always running in the background, even before the
bdr.autopartition
function is called for the first time. If an invalid worker ID
is used, the function returns unknown
. In-progress
is the typical status.
- On this page
- Auto creation of partitions