PostgreSQL manual partitioning

Have you ever wondered how many tables we can create and use in a PostgreSQL database server? Should we call them partitions or shards? And why not use the built-in “automatic” partitioning?

Partitions or shards?

Let's first define the difference between partitions and shards. Partitions live on the same server, while shards can be spread across multiple machines. Within one server we can use inheritance or the more recent “automatic” (declarative) partitioning. However, both of these solutions tie us tightly to PostgreSQL, which in some situations we would like to avoid. Imagine the prospect of migrating our schemas to a different RDBMS such as Microsoft SQL Server: not using any vendor-specific syntax or configuration would be beneficial.
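For comparison, this is roughly what the vendor-specific declarative syntax looks like in PostgreSQL (a minimal sketch, shown only to illustrate what we are trying to avoid):

CREATE TABLE demo (id int, val int) PARTITION BY RANGE (id);
CREATE TABLE demo_1 PARTITION OF demo FOR VALUES FROM (1) TO (1001);

It works fine, but PARTITION BY and PARTITION OF are not generic SQL that we could move to another RDBMS as-is.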

Vendor agnostic partitions

So instead, we can just try to create partition-like tables manually:

sudo apt install postgresql -y
sudo -u postgres -i
psql
CREATE DATABASE partitions;
exit

Then, after installing PostgreSQL and creating the new database:

for i in `seq 1 10000`; 
do 
  echo $i;
  psql -c "create table demo_$i (id int, val int);" partitions; 
done
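A quick aside: spawning a separate psql process for every table is slow. If you want the same result in a single session, you can generate the DDL and pipe it into psql instead (a minimal variant of the loop above):

seq 1 10000 | awk '{ printf "create table demo_%d (id int, val int);\n", $1 }' | psql partitions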

This way we created 10 000 tables using only generic SQL syntax, which is compatible with virtually every other RDBMS. More importantly, we do not depend on shared memory configuration or on the limits that come from attaching too many partitions to a main table.
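Since there is no parent table, the application addresses the individual tables directly; for example (demo_42 is just one of the tables created above):

psql -c "insert into demo_42 (id, val) values (1, 100);" partitions
psql -c "select * from demo_42 where id = 1;" partitions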

How many regular partitions can I have?

With regular PostgreSQL partitions, if we attach too many tables to a single parent, we quickly start to notice it in query performance and memory consumption. So if you would like to use PostgreSQL “automatic” partitioning, keep in mind not to attach too many tables. How many is too many? I started noticing the degradation after attaching just 100 – 200 partitions, which on small and medium deployments should be treated as the upper bound.
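If you do go the declarative route and want to check how many partitions are currently attached to a parent, pg_inherits holds that information (assuming a parent table named demo, as in the earlier sketch):

select count(*) from pg_inherits where inhparent = 'demo'::regclass;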

How big can my data be?

In terms of how big a single PostgreSQL node can be, I would say that 5 – 10 TB of data, with individual tables reaching 2 TB (including TOAST), is a fairly normal situation that regular hardware will handle. If you have 512 GB of RAM on the server, shared buffers and the OS page cache will be sufficient to operate on such large tables.
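If you want to verify how big a given table really is, including its TOAST data and indexes, pg_total_relation_size() reports the on-disk total:

select pg_size_pretty(pg_total_relation_size('demo_1'));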

How many tables can I create in single node?

As mentioned before, you are restricted by storage, memory and CPU – as always. However, you should also monitor the inode count and the file descriptor count on the system, because each of these separate tables is stored in its own set of files, and that matters even more when records contain lengthy data that ends up in TOAST tables. Still, using regular tables as partitions is the most bare-bones way of achieving the goal of dividing our data physically.
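You can see for yourself that every table lives in its own file (or set of files) under the data directory; pg_relation_filepath() prints the path relative to it:

select pg_relation_filepath('demo_1');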

From my experience, 50 000 tables in a single node is just fine even on a small or mid-sized system.

But what is the actual limit? I think the only practical limit comes from hardware and operating system constraints. On an Ubuntu 22 LXC container with an 8 GB drive, 1 vCPU and 512 MB of memory, we have 524k inodes available. After adding 50k tables, inode usage grew to 77 126 entries, which is 15% of the total available.

postgres@z10-test:~$ df -i
Filesystem                        Inodes IUsed   IFree IUse% Mounted on
/dev/mapper/pve-vm--131--disk--0  524288 77126  447162   15% /
none                             4070043    25 4070018    1% /dev
tmpfs                            4070043     2 4070041    1% /dev/shm
tmpfs                             819200   149  819051    1% /run
tmpfs                            4070043     2 4070041    1% /run/lock
tmpfs                             814008    11  813997    1% /run/user/110

I think that, at least from the inode perspective, we are on the safe side, even with 50k tables.
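To cross-check from inside the database, pg_class lists every relation, so counting the ordinary tables we created is straightforward (this also counts the system catalogs, so expect a few dozen more than 50 000):

select count(*) from pg_class where relkind = 'r';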

How to design the data architecture, then?

Now let's imagine a real-world scenario of a system serving 1 million customers. At that scale I would recommend having multiple nodes in different locations to decrease latency (in the case of global services). The application architecture would also have to be distributed, so we would use both partitioning within one node and sharding per se. On the other hand, we may stick with a single shard holding all customers in one node, with the actual heavy lifting done by Cassandra nodes rather than the RDBMS…
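One way to make the combined approach concrete is a deterministic mapping from customer id to node and table. A minimal sketch, assuming 4 nodes and 10 000 tables per node (the modulo scheme and the node0 hostname are purely illustrative, not a recommendation):

# hypothetical routing for customer 123456: node = 123456 % 4 = 0, table = demo_(123456 % 10000) = demo_3456
psql -h node0 -c "select * from demo_3456 where id = 123456;" partitions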