My First Container-based Postgres Upgrade

Yesterday I did my first container-based PostgreSQL version upgrade. In my case the upgrade was from version 13 to 14. In hindsight I was quite naïve. 😅

I was always wondering why distros kept separate data directories for different versions … now I know: you can’t do in-place upgrades with PostgreSQL. You need to have separate data directories as well as both version’s binaries. 😵 Distros have their mechanisms for it, but in the container world you’re kind of on your own.

Well not really … it’s just different. I found there’s a project that specializes in exactly the tooling part of the upgrade. After a little trial an error (see below) it went quite smoothly.

Procedure

In the end it came down to the following steps:

Stop the old postgres container.
Backup the old data directory (yay ZFS snapshots).
Create the new postgres container (with a new data directory; in my case via Ansible)
Stop the new postgres container.
Run the upgrade. (see command below)
Start the new postgres container.
Run vacuumdb as suggested at the end of the upgrade. (see command below)

The Upgrade Command

I used the tianon/postgres-upgrade container for the upgrade. Since my directory layout didn’t follow the “default” structure I had to mount each version’s data directory separately.

docker run --rm \
-e POSTGRES_INITDB_ARGS="--no-locale --encoding=UTF8" \
-v /tmp/pg_upgrade:/var/lib/postgresql \
-v /tank/containers/postgres-13:/var/lib/postgresql/13/data \
-v /tank/containers/postgres-14:/var/lib/postgresql/14/data \
tianon/postgres-upgrade:13-to-14

I set the POSTGRES_INITDB_ARGS to what I used when creating the new Postgres container’s data directory. This shouldn’t be necessary because we let the new Postgres container initialize the data directory. (see below) I left it in just to be safe. 🤷

I explicitly mounted something to the container’s /var/lib/postgresql directory in order to have access to the upgrade logs which are mentioned in error messages. (see below)

The Vacuumdb Command

Upgrading finishes with a suggestion like:

Upgrade Complete
—————-
Optimizer statistics are not transferred by pg_upgrade.
Once you start the new server, consider running:
/usr/lib/postgresql/14/bin/vacuumdb –all –analyze-in-stages

We can run the command in the new Postgres container:

docker exec postgres vacuumdb -U postgres --all --analyze-in-stages

We use the postgres user, because we didn’t specify a POSTGRES_USER when creating the database container.

Pitfalls

When you’re not using the default directory structure there’re some pitfalls. Mounting the two versions’ data directories separately is easy enough … it says so in the README. It’s what it doesn’t say that makes it more difficult than necessary. 😞

Errors When Initializing the New Data Directory

The first error I encountered was that the new data directory would get initialized with the default initdb options. where I used an ~~optimized~~ cargo-culted incantation which was incompatible (in my case --no-locale --encoding=UTF8). The upgrade failed with the following error:

lc_collate values for database “postgres” do not match: old “C”, new “en_US.utf8”

So I made sure I created the new database container (with the correct initdb args) before the migration fixed this.

Extra Mounts for the Upgrade

What tripped me really up was that when something failed it said to look into a specific log file which I couldn’t find. 🤨 I had to also mount something to the /var/lib/postgres directory which then had all the upgrade log files. 😔

This also solved another of my problems where the upgrade tool wanted to start an instance of the Postgres database, but failed because it couldn’t find a specific socket … which also happens to be located in the directory mentioned above.

Authentication Errors After Upgrade

After the upgrade I had a lot of authentication errors although non of the passwords should have changed.

FATAL: password authentication failed for user “nextcloud”

After digging through the internet and comparing both the old and new data directories it looked like the password hashing method changed. It changed from md5 to scram-sha-256 (in pg_ hda.conf the line saying host all all all scram-sha-256). 😑Just re-setting (i.e. setting the same passwords again) via ALTER ROLE foo SET PASSWORD '...'; on all users fixed the issue.🤐