Prerequisites v4

Before configuring a Failover Manager cluster, you must satisfy these prerequisites.

Install Java 11 (or later)

Before using Failover Manager, you must install Java (version 11 or later). Failover Manager is tested with OpenJDK, and we strongly recommend using that distribution of Java. Installation instructions for Java are platform specific.

Note

There's a temporary issue with OpenJDK version 11 on RHEL and its derivatives. When starting Failover Manager, you might see an error like the following:

java.lang.Error: java.io.FileNotFoundException: /usr/lib/jvm/java-11-openjdk-11.0.20.0.8-2.el8.x86_64/lib/tzdb.dat (No such file or directory)

If you see this message, the workaround is to manually install the missing package using the command sudo dnf install tzdata-java.
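
To confirm that an installed Java meets the version requirement, you can parse the output of java -version. The following is a minimal sketch and isn't part of Failover Manager: the java_major_version function name is hypothetical, and it assumes the version line has the usual form, such as openjdk version "11.0.20" 2023-07-18.

```shell
# Hypothetical helper (not shipped with Failover Manager).
# Prints the major Java version found in a `java -version` line.
# Releases before Java 9 report "1.x" (for example "1.8.0_382"),
# so an optional leading "1." is skipped.
java_major_version() {
    printf '%s\n' "$1" | sed -E -n 's/.*version "(1\.)?([0-9]+).*/\2/p'
}

# `java -version` writes to stderr, so redirect it before filtering.
if command -v java >/dev/null 2>&1; then
    version_line=$(java -version 2>&1 | grep 'version "' | head -n 1)
    major=$(java_major_version "$version_line")
    if [ "${major:-0}" -ge 11 ]; then
        echo "Java $major satisfies the Java 11 requirement"
    else
        echo "Java 11 or later is required" >&2
    fi
fi
```

If no version line can be parsed, the check treats the version as unknown and reports that Java 11 or later is required.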

Provide an SMTP server

You can receive notifications from Failover Manager as specified by a user-defined notification script, by email, or both.

  • If you're using email notifications, an SMTP server must be running on each node of the Failover Manager cluster.
  • If you provide a value in the script.notification property, you can leave the user.email property blank. In that case, an SMTP server isn't required.

If an event occurs, Failover Manager invokes the script (if provided) and can also send a notification email to any email addresses specified in the user.email parameter of the cluster properties file. For more information about using an SMTP server, see the Red Hat deployment guide.
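
A notification script can be as simple as a logger. The sketch below assumes that Failover Manager passes the message subject and the message body to the script as its first two arguments; confirm this against the script.notification description in your cluster properties file. The log_notification function, the EFM_NOTIFY_LOG variable, and the default log path are hypothetical.

```shell
#!/bin/sh
# Hypothetical notification handler (an illustration, not an EDB-supplied script).

# Append one timestamped line to the log file given as $3.
# $1 = message subject, $2 = message body (assumed argument order).
log_notification() {
    subject=$1
    body=$2
    log_file=$3
    printf '%s [%s] %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$subject" "$body" >> "$log_file"
}

# When invoked with at least two arguments, log them.
# The default path is a placeholder; choose one the efm user can write to.
if [ "$#" -ge 2 ]; then
    log_notification "$1" "$2" "${EFM_NOTIFY_LOG:-/var/tmp/efm-notifications.log}"
fi
```

A script like this would be referenced by its absolute path in the script.notification property.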

Configure streaming replication

Failover Manager requires that PostgreSQL streaming replication be configured between the primary node and the standby nodes. Failover Manager doesn't support other types of replication.

On database versions 11 or earlier, a recovery.conf file is copied from a random standby node to the stopped primary during switchover, unless you specify a source node with the -sourcenode option. Ensure that the paths in the recovery.conf file on your standby nodes are consistent before performing a switchover. For more information about the -sourcenode option, see Promoting a Failover Manager node.

On database version 12 or later, the primary_conninfo and restore_command properties are copied from a random standby node to the stopped primary during switchover unless otherwise specified with the -sourcenode option.
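
One way to check that standby settings are consistent before a switchover is to copy each standby's configuration file to a single host and compare the relevant values. This sketch assumes simple one-line name = value settings; get_setting, settings_match, and the file names in the usage note are hypothetical.

```shell
# Hypothetical helpers for comparing settings in recovery.conf-style files.

# Print the last value assigned to setting $1 in file $2.
# Prints nothing if the setting is absent.
get_setting() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Succeed only if setting $1 has the same value in files $2 and $3.
# Two files that both lack the setting also compare as equal.
settings_match() {
    [ "$(get_setting "$1" "$2")" = "$(get_setting "$1" "$3")" ]
}
```

For example, settings_match restore_command standby1-recovery.conf standby2-recovery.conf succeeds when both files define the same restore_command.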

Modify pg_hba.conf

You must modify pg_hba.conf on the primary and standby nodes, adding entries that allow communication between all of the nodes in the cluster. This example shows entries you might make to the pg_hba.conf file on the primary node:

# access for itself
host fmdb efm 127.0.0.1/32 md5
# access for standby
host fmdb efm 192.168.27.1/32 md5
# access for witness
host fmdb efm 192.168.27.34/32 md5

Where:

efm specifies the name of a valid database user.

fmdb specifies the name of a database to which the efm user can connect.
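
Because every entry follows the same pattern, you can generate the lines for a larger cluster rather than typing them. This is a sketch only: hba_entries is a hypothetical helper, and it assumes IPv4 addresses, a /32 mask, and md5 authentication as in the example above.

```shell
# Hypothetical helper: print one pg_hba.conf host entry per node address.
# Usage: hba_entries <database> <user> <ipv4-address>...
hba_entries() {
    db=$1
    user=$2
    shift 2
    for ip in "$@"; do
        printf 'host %s %s %s/32 md5\n' "$db" "$user" "$ip"
    done
}

# Print entries for the three addresses used in the example above.
hba_entries fmdb efm 127.0.0.1 192.168.27.1 192.168.27.34
```

Review the output before appending it to pg_hba.conf on each node.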

By default, the pg_hba.conf file resides in the data directory under your Postgres installation. After modifying the pg_hba.conf file, you must reload the configuration on each node for the changes to take effect. You can use the following command:

# systemctl reload edb-as-<x>

Where x specifies the Postgres version.

Using autostart for the database servers

If a primary node restarts, Failover Manager might detect that the database is down on the primary node and promote a standby node to the role of primary. If this happens, the Failover Manager agent on the restarted primary node doesn't get a chance to write the recovery.conf file, which would otherwise prevent the database server from starting. As a result, the rebooted primary node returns to the cluster as a second primary node.

To prevent this condition, ensure that the Failover Manager agent starts automatically before the database server. The agent starts in idle mode and checks whether there's already a primary in the cluster. If there's a primary node, the agent verifies that a recovery.conf or standby.signal file exists. If neither file exists, the agent creates the recovery.conf file.

Ensure communication through firewalls

If a Linux firewall (that is, iptables) is enabled on the host of a Failover Manager node, you might need to add rules to the firewall configuration that allow TCP communication between the Failover Manager processes in the cluster. For example:

# iptables -I INPUT -p tcp --dport 7800 -j ACCEPT
# /sbin/service iptables save

These commands open port 7800 and save the rule. Failover Manager connects through the port specified in the cluster properties file.
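
To avoid hard-coding the port, you can read it from the cluster properties file. The sketch below assumes the agent's address and port are stored in a bind.address property of the form address:port; the bind_port and print_firewall_rule functions and the file path in the usage note are hypothetical, so check them against your installation.

```shell
# Hypothetical helper: print the port from a line such as
#   bind.address=192.168.27.1:7800
# in the properties file given as $1. Assumes an IPv4 address or host name.
bind_port() {
    sed -n 's/^bind\.address=.*:\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Print (rather than run) the matching firewall rule for the file given as $1.
print_firewall_rule() {
    port=$(bind_port "$1")
    if [ -n "$port" ]; then
        echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
    else
        echo "no bind.address port found in $1" >&2
        return 1
    fi
}
```

For example, print_firewall_rule /path/to/efm.properties prints the rule so that you can review it before running it as root.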

Ensure that the database user has sufficient privileges

The database user specified by the db.user property in the efm.properties file must have sufficient privileges to invoke the following functions on behalf of Failover Manager:

pg_current_wal_lsn()

pg_last_wal_replay_lsn()

pg_wal_replay_resume()

pg_wal_replay_pause()

If the reconfigure.num.sync or reconfigure.sync.primary property is set to true, then:

  • For database versions 11 and later, the db.user requires the pg_read_all_stats privilege and permission to run pg_reload_conf().

For detailed information about each of these functions, see the PostgreSQL core documentation.

If the update.physical.slots.period property is used, then the db.user requires the REPLICATION privilege. A database superuser can provide the permissions needed:

ALTER USER <user_name> REPLICATION;

The user must also have permissions to read the values of configuration variables. A database superuser can use the PostgreSQL GRANT command to provide the permissions needed:

GRANT pg_read_all_settings TO <user_name>;

For more information about pg_read_all_settings, see the PostgreSQL core documentation.
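
The grants described in this section can be collected into one script for a superuser to review and run. The sketch below only prints SQL; it doesn't connect to a database. The efm_grants_sql function is hypothetical, it assumes a role name that needs no quoting, and it adds the pg_read_all_stats and pg_reload_conf() grants only on request, because they apply only when the reconfigure properties are in use.

```shell
# Hypothetical helper: print the SQL described in this section for role $1.
# The REPLICATION attribute matters only if update.physical.slots.period is used.
# Pass "sync" as $2 to add the grants needed when reconfigure.num.sync or
# reconfigure.sync.primary is set to true.
efm_grants_sql() {
    role=$1
    echo "ALTER USER $role REPLICATION;"
    echo "GRANT pg_read_all_settings TO $role;"
    if [ "${2:-}" = "sync" ]; then
        echo "GRANT pg_read_all_stats TO $role;"
        echo "GRANT EXECUTE ON FUNCTION pg_reload_conf() TO $role;"
    fi
}

# Print all four statements for a role named efm.
efm_grants_sql efm sync
```

After reviewing the output, a database superuser can run the statements, for example by piping them to psql.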