Table partitioning is a database optimization technique that involves dividing a large database table into smaller, more manageable segments or partitions based on predefined criteria, such as a range of values in a specific column (e.g., dates or numerical ranges). Each partition is essentially a separate sub-table with its own storage, making it easier to maintain and query large datasets.

In most cases, using a partitioned table from day one is overkill. It not only makes the query optimizer’s life harder but also imposes restrictions on what you can and cannot do. There may be exceptional cases, but we won’t focus on them today. This article is not a guide on when to use partitioned tables either. Its goal is to answer a simple question: how do I partition a table that already holds lots of rows?

Problem statement

The question above is a little bit too generic, though. We need to ask ourselves a couple of additional questions:

  1. What will be the partition keys? How is data going to be accessed?
  2. Will partitions contain evenly distributed data? Will they be accessed evenly, or are there some partitions that will be accessed more often than others?
  3. Are the rows immutable? Does the table allow updates, or only create and read operations?
  4. What are the operational limitations we are facing? Is it OK to bring the database down during the migration? Do we need to support read-write operations throughout the process? Can we afford to block write operations and keep read operations working?

The answers to these and other questions will determine the right approach to partitioning: they may allow us to cut some corners here and there, or force us to use workarounds so as not to disrupt the service.

In the scope of this article, we will use an asynchronous email delivery service as a pet project. A user writes an email, specifies the recipients and a delivery time, and the system delivers the email when the time comes (up to 2 years after the scheduling date). The deliveries table, used to store this information, will be our subject throughout the article. In the dashboard, users can see a list of upcoming emails, which can be edited or deleted. They can also search delivered emails, although this feature is rarely used. Emails are destroyed 2 years after their delivery date.

We will try to answer the questions above:

  1. The information stored in our table has an inherent time component we should not ignore. We will use the deliver_at column as our partitioning key, and will have one partition per month. Data is usually going to be queried by date range, which will allow the query planner to quickly discard most of the partitions. This partitioning strategy will also allow us to drop entire partitions to remove old data (see the sketch after this list).
  2. Partitions will contain one month of emails each. Since our application became more successful over the years, the number of emails scheduled per month has grown accordingly. Emails already delivered are rarely checked by our users. 90% of the emails are scheduled to be delivered within the next 30 days.
  3. Rows can be updated or removed until the delivery date.
  4. Since this project is running in production, we should avoid service disruption as much as we can. In the best scenario, there would be no downtime, although a temporary restriction of some of the features could be considered acceptable.
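
As a taste of that last point, here is a minimal sketch of the maintenance this enables, assuming the monthly partition naming scheme we will adopt later in the article:

-- Removing a whole month of expired emails becomes a cheap metadata
-- operation instead of a slow bulk DELETE:
ALTER TABLE deliveries DETACH PARTITION deliveries_2021_09;
DROP TABLE deliveries_2021_09;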

Experiment setup

Now that the scope of the experiment has been defined, let’s create the table that will be the subject of our experiment:

CREATE TABLE deliveries (
  id UUID NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
  deliver_at TIMESTAMP NOT NULL,
  recipient VARCHAR NOT NULL,
  body VARCHAR NOT NULL
);

a function to generate random email bodies:

-- Returns a random string of uppercase letters, between 100 and 1000
-- characters long.
CREATE FUNCTION gen_string() RETURNS varchar AS $$
SELECT
  string_agg(chr(65 + floor(random() * 26) :: integer), '')
FROM
  generate_series(1, floor(random() * 901 + 100) :: integer)
$$ LANGUAGE SQL;

and some fake data:

INSERT INTO
  deliveries (deliver_at, recipient, body)
SELECT
  dd,
  CONCAT(gen_random_uuid(), '@example.com'),
  gen_string()
FROM
  generate_series(
    '2021-02-01' :: timestamp,
    '2025-08-29' :: timestamp,
    '10 seconds' :: interval
  ) dd;
-- 28 min 11 secs.

The command above generates one row every 10 seconds over a time span of about 4.5 years. Assuming that today is 2023-09-06, this means up to 2 years into the future. The row count and table size can be obtained as:

SELECT
  count(*),
  pg_size_pretty(pg_table_size('deliveries'))
FROM
  deliveries;

/*
| count     | pg_size_pretty  |
|---------- |---------------- |
| 14428801  | 9286 MB         |
*/

Planning the migration

The migration plan is, in my opinion, the most critical part of the process. It is crucial to understand what happens in each of the steps, and which operations are going to become a blocker for the running application. A safe plan assumes that any of the steps may fail, without compromising the stability of the running application.

The plan will be very different based on the nature of the data stored in the tables. The questions listed above will surely help you draft a good migration plan.

Our plan includes the following steps:

  1. Remove the primary key of the deliveries table.
  2. Create a new table with partitions.
  3. Copy immutable data.
  4. Copy mutable data and swap the table names.

Step 1. Removing the primary key

One of the constraints of partitioned tables is that any unique constraint, including the primary key, must contain the partition key, so uniqueness can only be enforced at the partition level. Our deliveries table has a primary key on id alone, whose uniqueness we just cannot guarantee once we split the table into multiple partitions.

This should not be a big deal in most cases, especially if the application is not messing with the id column (that is, it either generates truly unique UUIDs or lets Postgres do it). If your table uses a sequential ID, migrating to UUIDs before partitioning the table may be a good idea. I recently wrote an article specifically on that topic.

Another key aspect needs to be taken into consideration. Partitioning such a big table by the deliver_at column will make queries very efficient only if they include a condition on the deliver_at column. Otherwise, Postgres will still need to search through every partition looking for rows matching the query. We will need to make changes in our application to help Postgres understand where it needs to look for data.

For example, finding a record by ID (only) will make Postgres slower, as it will have to search every partition. This may still be fast if there is an index on the ID in every partition, but we are still making Postgres work more than it should. The problem becomes bigger for more complex queries that require table scans. In those cases, providing the delivery timestamp along with the ID will narrow the search to just one of the partitions, as the sketch below shows.
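
As an illustrative sketch (the UUID is a placeholder), this is the difference once the table is partitioned:

-- Scans every partition: no condition on the partition key.
SELECT * FROM deliveries
WHERE id = '8c0e7a4e-0000-0000-0000-000000000000';

-- Prunes down to the single partition holding October 2023.
SELECT * FROM deliveries
WHERE id = '8c0e7a4e-0000-0000-0000-000000000000'
  AND deliver_at >= '2023-10-01' :: timestamp
  AND deliver_at < '2023-11-01' :: timestamp;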

What should be done after removing the primary key depends heavily on how your application uses the data stored in the table. Some options may include:

  • Not creating any index on the ID column, if searches by ID are uncommon in the application.
  • Creating an index to cover the former primary key (ID), even though uniqueness will no longer be guaranteed.
  • Defining a primary key that includes the partition key, possibly built from an existing unique index.

We will also need to check if there are foreign keys referencing the primary key of our table. If that is the case, we will need to either remove the foreign key constraint or maintain it using a composite primary key that includes the partition key.
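
As a sketch of that last arrangement (not the option we will follow here), the composite key would simply include the partition column, with the caveat that uniqueness is then only enforced per (id, deliver_at) pair:

-- Hypothetical alternative, applied to the new partitioned table:
ALTER TABLE deliveries_new ADD PRIMARY KEY (id, deliver_at);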

In our specific case, we chose the second option. Since we don’t want to leave the table unindexed, we will create an index on the ID column first, and then drop the primary key constraint.

CREATE INDEX CONCURRENTLY ON deliveries(id);   -- 43 secs 622 msec.
ALTER TABLE
  deliveries DROP CONSTRAINT deliveries_pkey;  -- 49 msec.
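
One caveat: if CREATE INDEX CONCURRENTLY fails midway, it leaves an INVALID index behind. A quick sanity check before dropping the primary key does not hurt:

SELECT indexrelid :: regclass AS index_name, indisvalid
FROM pg_index
WHERE indrelid = 'deliveries' :: regclass;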

Step 2. Creating the new table with partitions

This step does not involve the original deliveries table in any way, so there is no way it could block any existing process. We will first create a deliveries_new table with the same structure as deliveries, indicating that it will be partitioned by the deliver_at column:

CREATE TABLE deliveries_new (LIKE deliveries INCLUDING ALL) 
  PARTITION BY RANGE(deliver_at); -- 48 msec.

Then, we will create an archive partition to keep removable data (older than 2 years), and one additional partition per month.

CREATE TABLE deliveries_archive PARTITION OF deliveries_new FOR VALUES FROM ('-infinity'::timestamp) TO ('2021-09-01'::timestamp);
CREATE TABLE deliveries_2021_09 PARTITION OF deliveries_new FOR VALUES FROM ('2021-09-01'::timestamp) TO ('2021-10-01'::timestamp);
CREATE TABLE deliveries_2021_10 PARTITION OF deliveries_new FOR VALUES FROM ('2021-10-01'::timestamp) TO ('2021-11-01'::timestamp);
-- Rows removed for readability
CREATE TABLE deliveries_2025_06 PARTITION OF deliveries_new FOR VALUES FROM ('2025-06-01'::timestamp) TO ('2025-07-01'::timestamp);
CREATE TABLE deliveries_2025_07 PARTITION OF deliveries_new FOR VALUES FROM ('2025-07-01'::timestamp) TO ('2025-08-01'::timestamp);
CREATE TABLE deliveries_2025_08 PARTITION OF deliveries_new FOR VALUES FROM ('2025-08-01'::timestamp) TO ('2025-09-01'::timestamp);

-- 586 msec.
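
Writing fifty-odd statements like these by hand is tedious and error-prone. Here is a minimal sketch that generates the monthly partitions in a loop instead (the archive partition is still created separately, as above):

DO $$
DECLARE
  d date := '2021-09-01';
BEGIN
  WHILE d < '2025-09-01' LOOP
    EXECUTE format(
      'CREATE TABLE %I PARTITION OF deliveries_new FOR VALUES FROM (%L) TO (%L)',
      'deliveries_' || to_char(d, 'YYYY_MM'),
      d,
      d + interval '1 month'
    );
    d := d + interval '1 month';
  END LOOP;
END $$;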

Step 3. Copying immutable data

The most straightforward approach would be to directly insert all rows from deliveries into deliveries_new. However, we would need to block updates and deletions while data is being migrated to prevent any inconsistency between the old and the new table. Otherwise, a delivery inserted after the migration started would exist only in the old table and would be missing from the new one.

The requirements of our application allow us to first migrate the big portion of rows that we know are not going to be updated. We will still need to block the table while updateable data is being moved, but we will have reduced the number of rows, and consequently the amount of time the table stays blocked.

The immutable/mutable ratio of the data in our table is very relevant here. If most of our records are mutable, this step is probably not a big win. However, if 95% of our records are immutable, this approach will reduce the write-block time to 1/20th, which is a lot.

Now, it’s time to finally copy the immutable data. Postgres will take care of checking the deliver_at column of every row and placing it in the right partition.

INSERT INTO
  deliveries_new (id, deliver_at, recipient, body)
SELECT
  id,
  deliver_at,
  recipient,
  body
FROM
  deliveries
WHERE deliver_at <= '2023-09-05' :: timestamp;
-- 2 min 11 secs.

This query takes a little while to execute, but that should not be too concerning, as it only acquires an AccessShareLock, which does not block read or write operations.
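
Before moving on, a quick sanity check (a sketch; exact plan output will vary) confirms that rows landed in the right partitions and that the planner prunes as expected:

EXPLAIN
SELECT count(*) FROM deliveries_new
WHERE deliver_at >= '2022-01-01' :: timestamp
  AND deliver_at < '2022-02-01' :: timestamp;
-- The plan should only touch the deliveries_2022_01 partition.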

Step 4. Copying mutable data and swapping table names

In this step, we will need to block parallel modifications of the table’s data while the migration is running. We cannot allow rows to be inserted or modified once the migration has started, and we will need to be explicit about that:

BEGIN TRANSACTION;
LOCK TABLE deliveries IN SHARE MODE;
INSERT INTO
  deliveries_new (id, deliver_at, recipient, body)
SELECT
  id,
  deliver_at,
  recipient,
  body
FROM
  deliveries
WHERE deliver_at > '2023-09-05' :: timestamp;
ALTER TABLE deliveries RENAME TO deliveries_old;
ALTER TABLE deliveries_new RENAME TO deliveries;
COMMIT;
-- 3 min 20 secs.

During most of the execution of the script above, the connection is holding two locks on the deliveries table:

  • An AccessShareLock used to SELECT information from the table.
  • A ShareLock, explicitly acquired to prevent concurrent data modification from other processes in the application. This lock is released automatically when the transaction is committed or rolled back.
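
If you are curious, you can watch these locks from another session while the migration runs. A sketch using the pg_locks catalog:

SELECT l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE c.relname = 'deliveries';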

During this time, rows will still be readable, so users will have most features available. Data-modification operations will wait until the ShareLock is released, potentially for a long time. If you really want to play it safe, I would recommend the following:

  • Clearly communicate to your users that a maintenance window will be needed. Explain which features will be unavailable, and pick a window long enough to react to any unforeseen issues. The immutable/mutable ratio and the time needed to copy the immutable data can give you a hint about how long this step should take.
  • Pick a maintenance window where write operations are less frequent.
  • Disable impacted features in your application. Users prefer seeing a banner explaining that a feature is temporarily disabled to clicking a submit button, keeping the request open for an abnormally long time, finally timing out, and not even knowing whether the operation succeeded.

Right at the end of the transaction, the connection will acquire an AccessExclusiveLock to rename the tables. This is the most aggressive lock, as it conflicts with every other lock. However, renaming a table does not touch the table data at all, so these two commands should complete in the blink of an eye.

Final considerations

This article is by no means a recipe that can be followed step by step in a real-world scenario. The questions analyzed in the problem statement should make it crystal clear that there is no one-size-fits-all solution to every scenario.

One of the keys to deciding on this approach was the fact that a big portion of the data is updateable. If we were instead using the table to store historical information, we could have used declarative partitioning to attach the existing table as an archive partition of a new partitioned table. That would avoid copying data between tables, which is ultimately the most time-consuming and delicate part of the process.
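
A minimal sketch of that alternative, assuming a purely historical table in which every existing row predates the new partitions:

-- A matching CHECK constraint lets Postgres skip the full validation
-- scan when attaching the partition.
ALTER TABLE deliveries ADD CONSTRAINT deliveries_archive_range
  CHECK (deliver_at < '2023-09-01' :: timestamp);
ALTER TABLE deliveries_new ATTACH PARTITION deliveries
  FOR VALUES FROM ('-infinity' :: timestamp) TO ('2023-09-01' :: timestamp);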

Another key factor here is that we chose to partition by the deliver_at column because we wanted to reduce the amount of actively used data, as well as ease maintenance tasks like removing old data (now it’s as easy as dropping the old partition). However, there are other good reasons to partition a table even if there are no stale rows we no longer need to access.

We also need to be aware that the database size will increase considerably during the migration, as the content of this big table will be duplicated between the old and the new table. It is very important to make sure our database has enough disk space for this duplicate; otherwise, we could run out of space in the middle of the process.

We should also take care of some cleanup tasks, like dropping the deliveries_old table, and make sure that our application is using the partitioned table properly.
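
Once we are confident the migration succeeded and nothing references the old table anymore, that cleanup can be as simple as:

DROP TABLE deliveries_old;   -- frees the ~9 GB held by the old copy
DROP FUNCTION gen_string();  -- only needed to generate the fake data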

Writing this article was my first approach to partitioned tables in Postgres, so please don’t take this as the word of an expert. I encourage you to replicate the experiment in your local environment and try different approaches that may work for your use case.

Happy coding!