pgBackRest postgresql backup solution

pgBackRest postgresql backup solutionpgBackRest addresses many of the must-have features that you’ll want to look for in a PostgreSQL backup solution. I have been a great fan of the pgBackRest project for quite some time, and it gets better all the time. Historically, it was written in perl and now over the last year, the project is making steady progress converting into native C code. At the time of writing, the latest version is 2.13 and there remains dependency on a long list of perl libraries. In case you’ve never tried pgBackRest, now it is a great time to do it. This post should help you to set up a simple backup with a local backup repository.

Installation

The pgBackRest project packages are maintained in the PGDG repository. If you have already added the PGDG repository to package manager,  installation is a breeze.

On RHEL/CentOS/OEL:

$ sudo yum install pgbackrest

On Ubuntu/Debian

$ sudo apt install pgbackrest

This will fetch all the required perl libraries too:

The backrest is a native executable now (version 2):

$ file /usr/bin/pgbackrest
/usr/bin/pgbackrest: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=5e3f6123d02e0013b53f6568f99409378d43ad89, not stripped

Some of the other changes DBAs should keep in mind are:

  1. thread-max option is no longer valid – use process-max instead.
  2. archive-max-mb option is no longer valid and has been replaced with the archive-push-queue-max option which has different semantics
  3. The default for the backup-user (deprecated), which is a new repo-host-user, has changed from backrest to pgbackrest.
  4. The configuration file has changed from /etc/pgbackrest.conf to /etc/pgbackrest/pgbackrest.conf

Building from source

We may want to build pgBackRest depending on our environment and version. Building pgBackrest from source on Debian / Ubuntu is already covered in the official documentation. Below I’ve provided the steps to follow for the Red Hat family.

Get the tarball of the latest release:

curl -LO https://github.com/pgbackrest/pgbackrest/archive/release/2.13.tar.gz

In order to build from source, we need to build tools and libraries:

sudo yum install gcc make zlib-devel libxml2-devel openssl-devel

Now we can unpack the tarball and move it to the unpacked the directory:

tar -xvf 2.13.tar.gz
cd pgbackrest-release-2.13

Before proceeding any further, we should make sure that perl is built with the  USE_64_BIT_INT option.

$ perl -V | grep USE_64_BIT_INT
----------output-------------
USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES

pgBackRest depends on a few perl libaries. If they are not already present in the system, we have to install them:

sudo yum install perl-devel perl-ExtUtils-Embed.noarch perl-JSON-PP.noarch perl-Compress-Raw-Zlib.x86_64 perl-DBD-Pg.x86_64

Now we can build pgBackRest by executing “make” in the src directory

cd src
make

If the build is successful we should see the pgbackrest executable in the src directory. We should move it to a standard path location like /usr/bin

sudo cp ./pgbackrest /usr/bin/

Now we can create the repository location and log directory for pgBackRest. We grant privileges on that directory to the OS user that will execute pgBackRest.

sudo mkdir /var/lib/pgbackrest
sudo chown postgres:postgres /var/lib/pgbackrest
sudo mkdir -p /var/log/pgbackrest
sudo chown postgres:postgres /var/log/pgbackrest

Configuration

The very first step is to create a configuration file  /etc/pgbackrest.conf for the stanza. A stanza defines the backup configuration for a specific PostgreSQL database cluster. Any global configuration sections can be overridden to define stanza-specific settings.

sudo mkdir /etc/pgbackrest
sudo bash -c 'cat << EOF  > /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
[pg0app]
pg1-path=/var/lib/pgsql/11/data
pg1-port=5432
EOF'

Here, we have specified the backup repository location to be /var/lib/pgbackrest.

Once we have defined the configuration file, we can proceed to create the stanza

$ pgbackrest stanza-create --stanza=pg0app --log-level-console=info

This will produce output like this:

2019-04-23 11:47:50.640 P00   INFO: stanza-create command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --stanza=pg0app
2019-04-23 11:47:51.166 P00   INFO: stanza-create command end: completed successfully (527ms)

Now we need to set up the database parameters to use pgbackrest. In particular, we should use pgbackrest for archive_command. One of the major concerns for using cp like utilities in archive_command is that they are lazy writers: they do not ensure that everything is written and that fsync is called. This is a potential hole in many backup configurations and pgbackrest can solve this issue. Here is a sample set of parameter changes:

ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'pgbackrest --stanza=pg0app archive-push %p';
ALTER SYSTEM SET max_wal_senders = '10';
ALTER SYSTEM SET hot_standby = 'on';

From version 2.12 onwards, the archive-push used in the above setting is entirely coded in C, making it more lightweight and faster to execute.

Basic backup configuration is complete now and we can check the backup setup:

$ pgbackrest check --stanza=pg0app --log-level-console=info

This should produce output like this:

2019-04-23 13:57:33.241 P00   INFO: check command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --stanza=pg0app
2019-04-23 13:57:35.184 P00   INFO: WAL segment 00000001000000000000003D successfully stored in the archive at '/var/lib/pgbackrest/archive/pg0app/11-1/0000000100000000/00000001000000000000003D-9335cf780f0e1e468de62e0cbf22e7953a84c0c4.gz'
2019-04-23 13:57:35.185 P00   INFO: check command end: completed successfully (1946ms)

Taking a full backup

Now we are all set to take a full backup. We can push a backup without specifying its type like this:

$ pgbackrest backup --stanza=pg0app --log-level-console=info

In my case, I received this output:

2019-04-23 13:58:31.912 P00   INFO: backup command begin 2.13: --log-level-console=info --pg1-path=/var/lib/pgsql/11/data --pg1-port=5432 --repo1-path=/var/lib/pgbackrest --repo1-retention-full=2 --stanza=pg0app
WARN: no prior backup exists, incr backup has been changed to full
2019-04-23 13:58:32.775 P00   INFO: execute non-exclusive pg_start_backup() with label "pgBackRest backup started at 2019-04-23 13:58:32": backup begins after the next regular checkpoint completes
2019-04-23 13:58:32.981 P00   INFO: backup start archive = 00000001000000000000003F, lsn = 0/3F000028
2019-04-23 13:58:36.712 P01   INFO: backup file /var/lib/pgsql/11/data/base/18126/26241 (128MB, 57%) checksum 40a6c0cc69e81b9aaa93977bc5e3f809ae336b79
2019-04-23 13:58:37.958 P01   INFO: backup file /var/lib/pgsql/11/data/base/18126/26259 (21.4MB, 66%) checksum c34d484c34431dcc757e48c709c9ae68e694fb13
2019-04-23 13:58:38.223 P01   INFO: backup file /var/lib/pgsql/11/data/base/18126/26272 (15.7MB, 73%) checksum fe3155a0c6b7f6c94e846c47557f5e1eca6a89ef

There are a couple of points to note here.

  1. Since we are taking a backup for the first time, it detects that there is no prior backup and switches to full database backup.
  2. The output says pg_start_backup() is issued in non-exclusive mode. pgBackRest handles the creation of label file.

At the end of the backup,  I received these details, and you should look for something similar:

2019-04-23 13:58:56.024 P01   INFO: backup file /var/lib/pgsql/11/data/base/1/13723 (0B, 100%)
2019-04-23 13:58:56.031 P01   INFO: backup file /var/lib/pgsql/11/data/base/1/13718 (0B, 100%)
2019-04-23 13:58:56.076 P00   INFO: full backup size = 224.5MB
2019-04-23 13:58:56.076 P00   INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive
2019-04-23 13:58:56.200 P00   INFO: backup stop archive = 00000001000000000000003F, lsn = 0/3F000130
2019-04-23 13:58:56.675 P00   INFO: new backup label = 20190423-135832F
2019-04-23 13:58:56.727 P00   INFO: backup command end: completed successfully (24817ms)
2019-04-23 13:58:56.728 P00   INFO: expire command begin
2019-04-23 13:58:56.737 P00   INFO: full backup total < 2 - using oldest full backup for 11-1 archive retention
2019-04-23 13:58:56.739 P00   INFO: expire command end: completed successfully (11ms)

Check the backup related information

$ pgbackrest info
stanza: pg0app
    status: ok
    cipher: none
    db (current)
        wal archive min/max (11-1): 000000010000000000000043/000000010000000000000043
        full backup: 20190423-143526F
            timestamp start/stop: 2019-04-23 14:35:26 / 2019-04-23 14:35:39
            wal start/stop: 000000010000000000000043 / 000000010000000000000043
            database size: 224.6MB, backup size: 224.6MB
            repository size: 14.1MB, repository backup size: 14.1MB

As we can see, pgBackRest has compressed the backup to 14.1MB in size. This represents a great reduction compared to the size of the original database.

If a good full backup exists, then by default pgBackRest will attempt to perform an incremental backup.

Restoring backup

Restoring the backup is as simple as a single liner. pgBackRest will find out the correct full backup/ differential backup and incremental backup and do the restore for you.

$ pgbackrest restore --stanza=pg0app --log-level-console=info

The restore will display details on screen. Towards the end of the restore you should see a few lines as follows:

2019-04-23 15:28:29.113 P01   INFO: restore file /var/lib/pgsql/11/data/base/1/13723 (0B, 100%)
2019-04-23 15:28:29.120 P01   INFO: restore file /var/lib/pgsql/11/data/base/1/13718 (0B, 100%)
2019-04-23 15:28:29.121 P00   INFO: write /var/lib/pgsql/11/data/recovery.conf
2019-04-23 15:28:29.123 P00   INFO: restore global/pg_control (performed last to ensure aborted restores cannot be started)
2019-04-23 15:28:29.127 P00   INFO: restore command end: completed successfully (22290ms)

So pgBackRest creates a recovery.conf file automatically as the log output says. pgBackRest automatically inserts the restore_command to restore archived WAL logs like:

restore_command = 'pgbackrest --log-level-console=info --stanza=pg0app archive-get %f "%p"'

As a DBA user, you can just start up PostgreSQL as usual. In my case I just needed this command:

$ sudo systemctl start postgresql-11

PostgreSQL executes the above-mentioned restore_command as it starts and this will get all the archived WALs that are needed to complete the recovery before starting up.

Summary

pgBackRest is continuously evolving and it as it matures it’s emerging as one of the best backup solutions for PostgreSQL. The new features, especially in Version 2, are impressive. It takes away the mundane tasks of backup configuration and retention policies, which is extremely helpful.

Don’t forget! At Percona Live in Austin, May 28-30 2019, we’ll have two days of PostgreSQL content in a postgres dedicated track. Please see all our PostgreSQL talks here.


Photo of elephant by Magda Ehlers from Pexels