Barman backup, recovery and migration: Before you start

王老师说 2022-03-11

Before you start using Barman, it is fundamental that you become familiar with PostgreSQL and the concepts around physical backups, Point-In-Time Recovery and replication, such as base backups and WAL archiving. Below is a non-exhaustive list of resources that we recommend you read:
• PostgreSQL documentation:
– SQL Dump
– File System Level Backup
– Continuous Archiving and Point-in-Time Recovery (PITR)
– Reliability and the Write-Ahead Log
• Book: PostgreSQL 10 Administration Cookbook

Professional training on these topics is another effective way of learning these concepts. At any time of the year you can find many courses available all over the world, delivered by PostgreSQL companies such as EnterpriseDB.

Where to install Barman

One of the foundations of Barman is the ability to operate remotely from the database server, via the network. Theoretically, you could have your Barman server located in a data centre in another part of the world, thousands of miles away from your PostgreSQL server. Realistically, you do not want your Barman server to be too far from your PostgreSQL server, so that both backup and recovery times are kept under control.

Even though there is no “one size fits all” way to set up Barman, there are a few recommendations that we suggest you follow, in particular:
• Install Barman on a dedicated server
• Do not share the same storage with your PostgreSQL server
• Integrate Barman with your monitoring infrastructure
• Test everything before you deploy it to production
A reasonable way to start modelling your disaster recovery architecture is to:
• design a couple of possible architectures with respect to PostgreSQL and Barman, such as:
  1. same data centre
  2. different data centre in the same metropolitan area
  3. different data centre
• elaborate the pros and cons of each hypothesis
• evaluate the single points of failure (SPOF) of your system, with cost-benefit analysis
• make your decision and implement the initial solution

Having said this, a very common setup for Barman is to be installed in the same data centre where your PostgreSQL servers are. In this case, the single point of failure is the data centre. Fortunately, the impact of such a SPOF can be alleviated thanks to two features that Barman provides to increase the number of backup tiers:

  1. geographical redundancy (introduced in Barman 2.6)
  2. hook scripts

With geographical redundancy, you can rely on a Barman instance located in a different data centre or availability zone to synchronise the entire content of the source Barman server. There is more: given that geo-redundancy can be configured in Barman not only at global level but also at server level, you can create hybrid installations of Barman where some servers are directly connected to the local PostgreSQL servers, and others back up subsets of different Barman installations (cross-site backup). Figure 1 below shows two availability zones (one in Europe and one in the US), each with a primary PostgreSQL server that is backed up in a local Barman installation and relayed to the other Barman server (defined as passive) for multi-tier backup via rsync/SSH. Further information on geo-redundancy is available in the specific section.

[Figure 1: geo-redundant Barman architecture across two availability zones]

Hook scripts, in turn, allow Barman backups to be exported to different media, such as tape via tar, or to different locations, such as an S3 bucket in the Amazon cloud. Remember that no decision is forever. You can start this way and adapt over time to the solution that suits you best. However, try to keep it simple to start with.

One Barman, many PostgreSQL servers

Another relevant feature that was first introduced by Barman is support for multiple servers. Barman can store backup data coming from multiple PostgreSQL instances, even with different versions, in a centralised way. As a result, you can model complex disaster recovery architectures, forming a “star schema”, where PostgreSQL servers revolve around a central Barman server.
Every architecture makes sense in its own way. Choose the one that resonates with you and, most importantly, the one you trust, based on real experimentation and testing.
From this point forward, for the sake of simplicity, this guide will assume a basic architecture:
• one PostgreSQL instance (with host name pg)
• one backup server with Barman (with host name backup)

Streaming backup vs rsync/SSH

Traditionally, Barman has always operated remotely via SSH, taking advantage of rsync for physical backup operations. Version 2.0 introduced native support for PostgreSQL’s streaming replication protocol for backup operations, via pg_basebackup.
Choosing one of these two methods is a decision you will need to make.
As a general rule, starting from Barman 2.0, backup over streaming replication is the recommended setup for PostgreSQL 9.4 or higher. Moreover, if you do not make use of tablespaces, backup over streaming can be used starting from PostgreSQL 9.2.
IMPORTANT:
Because Barman transparently makes use of pg_basebackup, features such as incremental backup, parallel backup, deduplication, and network compression are currently not available. Bandwidth limitation is also more restricted in this case than with the traditional method via rsync.
Traditional backup via rsync/SSH is available for all versions of PostgreSQL starting from 8.3, and it is recommended in all cases where the pg_basebackup limitations matter (for example, a very large database that can benefit from incremental backup and deduplication).
The reason why we recommend streaming backup is that, based on our experience, it is easier to set up than the traditional one. Also, streaming backup allows you to back up a PostgreSQL server on Windows, and makes life easier when working with Docker.
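
The selection rules described in this section can be written down as a small decision function. This is only a sketch of the guidance above, not part of Barman: the `ServerProfile` class, its field names and `choose_backup_method` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerProfile:
    """Hypothetical description of one PostgreSQL server to back up."""
    pg_version: tuple                   # e.g. (9, 4) or (12, 0)
    uses_tablespaces: bool = False
    # True if incremental/parallel backup, deduplication or network
    # compression is required (rsync-only features according to the text).
    needs_rsync_features: bool = False


def choose_backup_method(profile: ServerProfile) -> str:
    """Return "streaming" or "rsync" following the guidance in the text."""
    if profile.pg_version < (8, 3):
        raise ValueError("rsync/SSH backup requires PostgreSQL 8.3 or later")
    if profile.needs_rsync_features:
        return "rsync"
    if profile.pg_version >= (9, 4):
        return "streaming"
    # 9.2 and 9.3 can use streaming backup only without tablespaces.
    if profile.pg_version >= (9, 2) and not profile.uses_tablespaces:
        return "streaming"
    return "rsync"


if __name__ == "__main__":
    print(choose_backup_method(ServerProfile((9, 6))))
    print(choose_backup_method(ServerProfile((9, 3), uses_tablespaces=True)))
```

Treat the result as a starting point: the text leaves the final choice to you, and a very large database may justify rsync even on a recent PostgreSQL version.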

Standard archiving, WAL streaming … or both

PostgreSQL’s Point-In-Time Recovery requires that transaction logs, also known as xlog or WAL files, are stored alongside base backups. Traditionally, Barman has supported standard WAL file shipping through PostgreSQL’s archive_command (usually via rsync/SSH, now via barman-wal-archive from the barman-cli package). With this method, WAL files are archived only when PostgreSQL switches to a new WAL file. To keep it simple, this normally happens every 16MB worth of data changes.
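
To get a feeling for what "every 16MB" means in time, the arithmetic below estimates how long a WAL file stays unarchived at a given write rate. It is a rough sketch: the write rates are made-up inputs, the function name is hypothetical, and the estimate ignores any timeout or manual WAL switch that would archive a file earlier.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # the 16MB WAL file size mentioned in the text


def seconds_between_wal_switches(write_rate_bytes_per_s: float) -> float:
    """Estimate the time needed to fill one WAL file at a steady write rate."""
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write rate must be positive")
    return WAL_SEGMENT_BYTES / write_rate_bytes_per_s


if __name__ == "__main__":
    # Assumed example rates: a busy server (1 MiB/s) and a quiet one (64 KiB/s).
    for rate in (1024 * 1024, 64 * 1024):
        print(rate, seconds_between_wal_switches(rate))
```

On a quiet server the current WAL file can therefore remain outside the archive for minutes, which is the gap that WAL streaming is meant to close.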

Barman 1.6.0 introduced streaming of WAL files for PostgreSQL servers 9.2 or higher, as an additional method for transaction log archiving, through pg_receivewal (known as pg_receivexlog before PostgreSQL 10). WAL streaming reduces the risk of data loss, bringing the RPO down to near-zero values.

Barman 2.0 introduced support for replication slots with PostgreSQL servers 9.4 or above, therefore allowing WAL streaming-only configurations. Moreover, you can now add Barman as a synchronous WAL receiver in your PostgreSQL 9.5 (or higher) cluster, and achieve zero data loss (RPO=0).
In some cases you have no choice and you are forced to use traditional archiving. In others, you can choose whether to use both or just WAL streaming. Unless you have strong reasons not to, we recommend using both channels, for maximum reliability and robustness.
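
The version thresholds in this section can be summarised in a lookup like the one below. It is a sketch that only restates the numbers given above; `wal_options`, `recommended_channels` and the dictionary keys are hypothetical names, not Barman or PostgreSQL settings.

```python
def wal_options(pg_version: tuple) -> dict:
    """Map a PostgreSQL version to the WAL transport options named in the text."""
    return {
        "wal_streaming": pg_version >= (9, 2),             # pg_receivewal / pg_receivexlog
        "replication_slots": pg_version >= (9, 4),         # allows streaming-only setups
        "synchronous_wal_receiver": pg_version >= (9, 5),  # RPO=0
    }


def recommended_channels(pg_version: tuple) -> list:
    """Both channels where streaming exists, otherwise archiving only."""
    channels = ["archive_command"]
    if wal_options(pg_version)["wal_streaming"]:
        channels.append("wal_streaming")
    return channels


if __name__ == "__main__":
    for version in ((9, 1), (9, 3), (9, 4), (9, 5)):
        print(version, wal_options(version), recommended_channels(version))
```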

Two typical scenarios for backups

To make life easier for you, below we summarise the two most typical scenarios for a given PostgreSQL server in Barman. Bear in mind that this is a decision that you must make for every single server that you decide to back up with Barman. This means that you can have heterogeneous setups within the same installation. As mentioned before, we will only consider the PostgreSQL server (pg) and the Barman server (backup). However, in real life, your architecture will most likely contain other technologies such as repmgr, pgBouncer, Nagios/Icinga, and so on.

Scenario 1: Backup via streaming protocol
If you are using PostgreSQL 9.4 or higher, and your database falls under a general use case scenario, you will likely end up deciding on a streaming backup installation (see figure 2 below). In this scenario, you will need to configure:

  1. a standard connection to PostgreSQL, for management, coordination, and monitoring purposes
  2. a streaming replication connection that will be used by both pg_basebackup (for base backup operations) and pg_receivewal (for WAL streaming)

[Figure 2: streaming-only backup setup]
This setup, in Barman’s terminology, is known as a streaming-only setup, as it does not require any SSH connection for backup and archiving operations. This is particularly suitable and extremely practical for Docker environments.
However, as mentioned before, you can configure standard archiving as well and implement a more robust architecture (see figure 3 below).

[Figure 3: streaming backup with additional WAL archiving via archive_command]
This alternate approach requires:
• an additional SSH connection that allows the postgres user on the PostgreSQL server to connect as the barman user on the Barman server
• the archive_command in PostgreSQL to be configured to ship WAL files to Barman
This architecture is also available to PostgreSQL 9.2/9.3 users who do not use tablespaces.
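
As an illustration of Scenario 1, the sketch below renders a Barman server section for the pg host, with the two connections listed above and optional standard archiving. The option names (conninfo, streaming_conninfo, backup_method, streaming_archiver, slot_name, archiver) are Barman configuration options; the database user names, the slot name and the helper functions are assumptions made for this example, so check them against the documentation of your Barman version.

```python
import configparser
import io


def streaming_server_config(name: str = "pg", host: str = "pg",
                            with_archiver: bool = False) -> str:
    """Render a Barman server section for a streaming backup setup."""
    cfg = configparser.ConfigParser()
    cfg[name] = {
        # 1. standard connection (user "barman" is an assumption)
        "conninfo": f"host={host} user=barman dbname=postgres",
        # 2. streaming replication connection (user name is an assumption)
        "streaming_conninfo": f"host={host} user=streaming_barman",
        "backup_method": "postgres",   # base backups via pg_basebackup
        "streaming_archiver": "on",    # WAL streaming via pg_receivewal
        "slot_name": "barman",         # replication slot, PostgreSQL 9.4+
    }
    if with_archiver:
        # Also accept WAL files shipped by PostgreSQL's archive_command.
        cfg[name]["archiver"] = "on"
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


def archive_command(barman_host: str = "backup", server_name: str = "pg") -> str:
    """Value for archive_command on the PostgreSQL side, using barman-wal-archive."""
    return f"barman-wal-archive {barman_host} {server_name} %p"


if __name__ == "__main__":
    print(streaming_server_config(with_archiver=True))
    print("archive_command = '" + archive_command() + "'")
```

Calling `streaming_server_config()` without arguments gives the streaming-only variant; passing `with_archiver=True` corresponds to the more robust architecture that also uses standard archiving.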

Scenario 2: Backup via rsync/SSH
The traditional setup of rsync over SSH is the only available option for:
• PostgreSQL servers version 8.3, 8.4, 9.0 or 9.1
• PostgreSQL servers version 9.2 or 9.3 that are using tablespaces
• incremental backup, parallel backup and deduplication
• network compression during backups
• finer control of bandwidth usage, including on a tablespace basis
In this scenario, you will need to configure:

  1. a standard connection to PostgreSQL for management, coordination, and monitoring purposes
  2. an SSH connection for base backup operations, used by rsync, that allows the barman user on the Barman server to connect as the postgres user on the PostgreSQL server
  3. an SSH connection for WAL archiving, used by the archive_command in PostgreSQL, that allows the postgres user on the PostgreSQL server to connect as the barman user on the Barman server

[Figure 4: backup via rsync/SSH]
Starting from PostgreSQL 9.2, you can add a streaming replication connection that is used for WAL streaming and significantly reduce RPO. This more robust implementation is depicted in figure 5.

[Figure 5: rsync/SSH backup with additional WAL streaming]
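
A comparable sketch for Scenario 2 is shown below. The option names (backup_method, ssh_command, archiver, reuse_backup, parallel_jobs, network_compression, streaming_archiver, streaming_conninfo) are Barman configuration options, while the user names, the chosen values and the helper function are assumptions for illustration only.

```python
import configparser
import io


def rsync_server_config(name: str = "pg", host: str = "pg",
                        with_wal_streaming: bool = False) -> str:
    """Render a Barman server section for a backup via rsync/SSH."""
    cfg = configparser.ConfigParser()
    cfg[name] = {
        # 1. standard connection (user "barman" is an assumption)
        "conninfo": f"host={host} user=barman dbname=postgres",
        # 2. SSH connection from barman@backup to postgres@pg, used by rsync
        "ssh_command": f"ssh postgres@{host}",
        "backup_method": "rsync",
        # 3. WAL files arrive through PostgreSQL's archive_command
        "archiver": "on",
        # rsync-only features named in the text (example values)
        "reuse_backup": "link",          # incremental backup with deduplication
        "parallel_jobs": "2",            # parallel backup
        "network_compression": "true",   # compress traffic during backups
    }
    if with_wal_streaming:
        # PostgreSQL 9.2+: add WAL streaming to reduce RPO.
        cfg[name]["streaming_conninfo"] = f"host={host} user=streaming_barman"
        cfg[name]["streaming_archiver"] = "on"
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(rsync_server_config(with_wal_streaming=True))
```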