Testing your PostgreSQL backups: a practical guide

It is often said that "an untested backup is not a backup", but how can we turn that saying into something actionable? And in an organization with limited engineering bandwidth, what should be prioritized first?

In this presentation, I aim to provide a practical guide to the backup testing practices that I believe give the greatest "bang for buck", and to how to fold them into your operations, modeled on how we test our PostgreSQL backups at Academia.edu.

This talk covers the background concepts needed to define measurable goals for your backups, along with just enough detail on how PostgreSQL backups work to know what to monitor:

  • Recovery Point Objective (RPO) and Recovery Time Objective (RTO)
  • Point-in-time recovery
  • Physical vs logical backups: what they are, how they differ, and what problems they solve
  • What data is included in a "physical" PostgreSQL backup, and how does the Write-Ahead Log (WAL) come into play?

In addition, we will go through:

  • What are reasonable goals to set for RPO and RTO, and how do we measure them?
  • How you can couple the testing of your backups with your operational practices
  • How to monitor and forecast costs, with a worked example for Amazon S3
  • What additional monitoring to implement, with a worked example for pgBackRest and Datadog
Room:
Ballroom B
Time:
Friday, March 15, 2024 - 15:45 to 16:45