Database - Backup and Recovery

Recovery is how a DBMS restores data to a consistent state after failures. Three types of failures exist:

Transaction failure — A transaction aborts (constraint violation, deadlock, application error). Handled by rollback using the transaction log.
System crash — Power failure, OS crash, hardware fault. The ARIES Recovery Algorithm replays committed transactions (redo) and undoes uncommitted ones (undo) from the WAL.
Media failure — Disk corruption or loss. Requires restoring from a backup and replaying the WAL from that point forward.

Backup Strategies

Strategy	Description	Recovery Time	Storage Cost
Full backup	Complete copy of the entire database	Fastest restore	Highest
Incremental	Only changes since last backup (any type)	Slower (chain of restores)	Lowest
Differential	Only changes since last full backup	Medium	Medium
Continuous archiving	WAL shipping / point-in-time recovery	Any point in time	WAL storage

PostgreSQL pg_admin Backup

Take a backup of the Database using pg admin in the custom format, in the data options select

do not save owner
do not save namespace
do not save privileges

Restore

Select the db you want to restore and choose restore, select

do not save owner
do not save namespace
do not save privileges

after sometime it should be done

[! Caution]
make sure you only restore one Database at a time or backup one Database at a time

IMPORTANT

if you have any database triggers, make sure you point to the right schema after you dump the Database

Command-Line Backup

# Logical backup (SQL dump)
pg_dump -Fc mydb > mydb.dump        # Custom format (compressed, flexible)
pg_dump -Fp mydb > mydb.sql         # Plain SQL
 
# Restore
pg_restore -d mydb mydb.dump
 
# Physical backup (file-level, for point-in-time recovery)
pg_basebackup -D /backup/dir -Ft -z -P

Backing up and Restoring between Versions

When you are making backups and restore between different versions of the postgres servers, you have to be very careful regarding this. As this might results in header error.

So this happened when i was doing a backup and restore between version 16 (local) and 15 production.

to restore this onto a 15 version you have to do the following:

make a dump from the postgres 16 remote server
make a restore of the file onto a postgres 15 version from the same backup host
make a backup file from the 15 remote server
make restore on the postgres 15 server remotely

USE IDENTICAL VERSIONS

Make sure to use the same version of postgres on development and production and test databases

Point-in-Time Recovery (PITR)

PostgreSQL supports recovering to any specific moment using continuous WAL archiving:

Take a base backup with pg_basebackup
Configure archive_command to ship WAL files to a safe location
To recover: restore the base backup, configure recovery_target_time, start PostgreSQL — it replays WAL up to that exact moment

This is the production-grade recovery strategy for any serious deployment.

Recovery in Other Systems

MySQL / InnoDB — Uses redo log (WAL) + undo log + binary log. mysqlbackup or mysqldump for backups. Binary log replay for point-in-time recovery.
SQLite — Rollback journal (default) or WAL mode. The entire database is a single file — backup is a file copy (use .backup command or SQLite Online Backup API for consistency).

Recovery in Distributed Databases

Distributed environments add complexity because failures can be partial (one node crashes while others continue):

Replication-Based Recovery

Strategy	How It Works	Failover Time
Active replication	All replicas process transactions concurrently	Immediate — any replica can take over
Passive replication	Primary processes transactions, replicates to standby	Brief delay — standby must catch up from replication log

Distributed Transaction Recovery

When a transaction spans multiple nodes, recovery must coordinate across all of them:

Two-Phase Commit (2PC) ensures all nodes either commit or roll back
Transaction logs at each site enable local redo/undo
Checkpointing across sites reduces log replay time (see ARIES Recovery Algorithm)

Data Synchronization After Recovery

After a failed node comes back online:

Log replay — Apply all missed WAL entries from the healthy replica
Snapshot + incremental — Ship a full snapshot to the recovered node, then replay only changes since the snapshot
Anti-entropy — Background process comparing data between replicas and resolving differences (used by Cassandra, DynamoDB)

Quartz 5

Explorer

Database - Backup and Recovery

Database - Backup and Recovery

Backup Strategies

PostgreSQL pg_admin Backup

Restore

Command-Line Backup

Backing up and Restoring between Versions

Point-in-Time Recovery (PITR)

Recovery in Other Systems

Recovery in Distributed Databases

Replication-Based Recovery

Distributed Transaction Recovery

Data Synchronization After Recovery

Graph View

Table of Contents