pgBackRest addresses many of the must-have features that you’ll want to look for in a PostgreSQL backup solution. I have been a great fan of the pgBackRest project for quite some time, and it gets better all the time. Historically, it was written in perl and now over the last year, the project is making steady progress converting into native C code. At the time of writing, the latest version is 2.13 and there remains dependency on a long list of perl libraries. In case you’ve never tried pgBackRest, now it is a great time to do it. This post should help you to set up a simple backup with a local backup repository.
The pgBackRest project packages are maintained in the PGDG repository. If you have already added the PGDG repository to package manager, installation is a breeze.
$ sudo yum install pgbackrest
$ sudo apt install pgbackrest
This will fetch all the required perl libraries too:
The backrest is a native executable now (version 2):
$ file /usr/bin/pgbackrest /usr/bin/pgbackrest: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=5e3f6123d02e0013b53f6568f99409378d43ad89, not stripped
Some of the other changes DBAs should keep in mind are:
- thread-max option is no longer valid – use process-max instead.
- archive-max-mb option is no longer valid and has been replaced with the archive-push-queue-max option which has different semantics
- The default for the backup-user (deprecated), which is a new repo-host-user, has changed from backrest to pgbackrest.
- The configuration file has changed from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf
Building from source
We may want to build pgBackRest depending on our environment and version. Building pgBackrest from source on Debian / Ubuntu is already covered in the official documentation. Below I’ve provided the steps to follow for the Red Hat family.
Get the tarball of the latest release:
curl -LO https://github.com/pgbackrest/pgbackrest/archive/release/2.13.tar.gz
In order to build from source, we need to build tools and libraries:
sudo yum install gcc make zlib-devel libxml2-devel openssl-devel
Now we can unpack the tarball and move it to the unpacked the directory:
tar -xvf 2.13.tar.gz cd pgbackrest-release-2.13
Before proceeding any further, we should make sure that perl is built with the USE_64_BIT_INT option.
$ perl -V | grep USE_64_BIT_INT ----------output------------- USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
pgBackRest depends on a few perl libaries. If they are not already present in the system, we have to install them:
sudo yum install perl-devel perl-ExtUtils-Embed.noarch perl-JSON-PP.noarch perl-Compress-Raw-Zlib.x86_64 perl-DBD-Pg.x86_64
Now we can build pgBackRest by executing “make” in the src directory
cd src make
If the build is successful we should see the pgbackrest executable in the src directory. We should move it to a standard path location like /usr/bin
sudo cp ./pgbackrest /usr/bin/
Now we can create the repository location and log directory for pgBackRest. We grant privileges on that directory to the OS user that will execute pgBackRest.
sudo mkdir /var/lib/pgbackrest sudo chown postgres:postgres /var/lib/pgbackrest sudo mkdir -p /var/log/pgbackrest sudo chown postgres:postgres /var/log/pgbackrest
The very first step is to create a configuration file /etc/pgbackrest.conf for the stanza. A stanza defines the backup configuration for a specific PostgreSQL database cluster. Any global configuration sections can be overridden to define stanza-specific settings.
sudo mkdir /etc/pgbackrest sudo bash -c 'cat << EOF > /etc/pgbackrest/pgbackrest.conf [global] repo1-path=/var/lib/pgbackrest repo1-retention-full=2 [pg0app] pg1-path=/var/lib/pgsql/11/data pg1-port=5432 EOF'
Here, we have specified the backup repository location to be /var/lib/pgbackrest.
Once we have defined the configuration file, we can proceed to create the stanza
$ pgbackrest stanza-create --stanza=pg0app --log-level-console=info
This will produce output like this:
2019-04-23 11:47:50.640 P00 INFO: stanza-create command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --stanza=pg0app 2019-04-23 11:47:51.166 P00 INFO: stanza-create command end: completed successfully (527ms)
Now we need to set up the database parameters to use pgbackrest. In particular, we should use pgbackrest for archive_command. One of the major concerns for using cp like utilities in archive_command is that they are lazy writers: they do not ensure that everything is written and that fsync is called. This is a potential hole in many backup configurations and pgbackrest can solve this issue. Here is a sample set of parameter changes:
ALTER SYSTEM SET wal_level = 'replica'; ALTER SYSTEM SET archive_mode = 'on'; ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=pg0app archive-push %p'; ALTER SYSTEM SET max_wal_senders = '10'; ALTER SYSTEM SET hot_standby = 'on';
From version 2.12 onwards, the archive-push used in the above setting is entirely coded in C, making it more lightweight and faster to execute.
Basic backup configuration is complete now and we can check the backup setup:
$ pgbackrest check --stanza=pg0app --log-level-console=info
This should produce output like this:
2019-04-23 13:57:33.241 P00 INFO: check command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --stanza=pg0app 2019-04-23 13:57:35.184 P00 INFO: WAL segment 00000001000000000000003D successfully stored in the archive at '/var/lib/pgbackrest/archive/pg0app/11-1/0000000100000000/00000001000000000000003D-9335cf780f0e1e468de62e0cbf22e7953a84c0c4.gz' 2019-04-23 13:57:35.185 P00 INFO: check command end: completed successfully (1946ms)
Taking a full backup
Now we are all set to take a full backup. We can push a backup without specifying its type like this:
$ pgbackrest backup --stanza=pg0app --log-level-console=info
In my case, I received this output:
2019-04-23 13:58:31.912 P00 INFO: backup command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --repo1-retention-full=2 --stanza=pg0app WARN: no prior backup exists, incr backup has been changed to full 2019-04-23 13:58:32.775 P00 INFO: execute non-exclusive pg_start_backup() with label "pgBackRest backup started at 2019-04-23 13:58:32": backup begins after the next regular checkpoint completes 2019-04-23 13:58:32.981 P00 INFO: backup start archive = 00000001000000000000003F, lsn = 0/3F000028 2019-04-23 13:58:36.712 P01 INFO: backup file /var/lib/pgsql/11/data/base/18126/26241 (128MB, 57%) checksum 40a6c0cc69e81b9aaa93977bc5e3f809ae336b79 2019-04-23 13:58:37.958 P01 INFO: backup file /var/lib/pgsql/11/data/base/18126/26259 (21.4MB, 66%) checksum c34d484c34431dcc757e48c709c9ae68e694fb13 2019-04-23 13:58:38.223 P01 INFO: backup file /var/lib/pgsql/11/data/base/18126/26272 (15.7MB, 73%) checksum fe3155a0c6b7f6c94e846c47557f5e1eca6a89ef
There are a couple of points to note here.
- Since we are taking a backup for the first time, it detects that there is no prior backup and switches to full database backup.
- The output says pg_start_backup() is issued in non-exclusive mode. pgBackRest handles the creation of label file.
At the end of the backup, I received these details, and you should look for something similar:
2019-04-23 13:58:56.024 P01 INFO: backup file /var/lib/pgsql/11/data/base/1/13723 (0B, 100%) 2019-04-23 13:58:56.031 P01 INFO: backup file /var/lib/pgsql/11/data/base/1/13718 (0B, 100%) 2019-04-23 13:58:56.076 P00 INFO: full backup size = 224.5MB 2019-04-23 13:58:56.076 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2019-04-23 13:58:56.200 P00 INFO: backup stop archive = 00000001000000000000003F, lsn = 0/3F000130 2019-04-23 13:58:56.675 P00 INFO: new backup label = 20190423-135832F 2019-04-23 13:58:56.727 P00 INFO: backup command end: completed successfully (24817ms) 2019-04-23 13:58:56.728 P00 INFO: expire command begin 2019-04-23 13:58:56.737 P00 INFO: full backup total < 2 - using oldest full backup for 11-1 archive retention 2019-04-23 13:58:56.739 P00 INFO: expire command end: completed successfully (11ms)
Check the backup related information
$ pgbackrest info stanza: pg0app status: ok cipher: none db (current) wal archive min/max (11-1): 000000010000000000000043/000000010000000000000043 full backup: 20190423-143526F timestamp start/stop: 2019-04-23 14:35:26 / 2019-04-23 14:35:39 wal start/stop: 000000010000000000000043 / 000000010000000000000043 database size: 224.6MB, backup size: 224.6MB repository size: 14.1MB, repository backup size: 14.1MB
As we can see, pgBackRest has compressed the backup to 14.1MB in size. This represents a great reduction compared to the size of the original database.
If a good full backup exists, then by default pgBackRest will attempt to perform an incremental backup.
Restoring the backup is as simple as a single liner. pgBackRest will find out the correct full backup/ differential backup and incremental backup and do the restore for you.
$ pgbackrest restore --stanza=pg0app --log-level-console=info
The restore will display details on screen. Towards the end of the restore you should see a few lines as follows:
2019-04-23 15:28:29.113 P01 INFO: restore file /var/lib/pgsql/11/data/base/1/13723 (0B, 100%) 2019-04-23 15:28:29.120 P01 INFO: restore file /var/lib/pgsql/11/data/base/1/13718 (0B, 100%) 2019-04-23 15:28:29.121 P00 INFO: write /var/lib/pgsql/11/data/recovery.conf 2019-04-23 15:28:29.123 P00 INFO: restore global/pg_control (performed last to ensure aborted restores cannot be started) 2019-04-23 15:28:29.127 P00 INFO: restore command end: completed successfully (22290ms)
So pgBackRest creates a recovery.conf file automatically as the log output says. pgBackRest automatically inserts the restore_command to restore archived WAL logs like:
restore_command = 'pgbackrest --log-level-console=info --stanza=pg0app archive-get %f "%p"'
As a DBA user, you can just start up PostgreSQL as usual. In my case I just needed this command:
$ sudo systemctl start postgresql-11
PostgreSQL executes the above-mentioned restore_command as it starts and this will get all the archived WALs that are needed to complete the recovery before starting up.
pgBackRest is continuously evolving and it as it matures it’s emerging as one of the best backup solutions for PostgreSQL. The new features, especially in Version 2, are impressive. It takes away the mundane tasks of backup configuration and retention policies, which is extremely helpful.
Don’t forget! At Percona Live in Austin, May 28-30 2019, we’ll have two days of PostgreSQL content in a postgres dedicated track. Please see all our PostgreSQL talks here.