Cluster Mode
This topic describes how to install and upgrade Release as a cluster. Running Release in cluster mode gives you a highly available (HA) Release setup. Release supports the following HA mode:
- Active/Active: Three or more Release nodes running simultaneously to process all requests. A load balancer is used to distribute requests.
Prerequisites
Using Release in cluster mode requires the following:
- Release must be installed according to the system requirements. For more information, see the requirements for installing Release.
- The Release repository and archive must be stored in an external database, as described in the topic on configuring the Release repository in a database. Cluster mode is not supported for the default configuration with an embedded database.
- A load balancer. For more information, see the HAProxy load balancer documentation.
- A Java Messaging System (JMS) for the webhooks functionality. For more information, see Webhooks overview.
- The time on all Release nodes must be synchronized through an NTP server.
- The servers running Release must run on the same operating system.
- The Release servers and load balancers must be on the same network.
- All Release cluster nodes must reside in the same network segment. This is required for the clustering protocol to function correctly. For optimal performance, it is also recommended that you put the database server in the same network segment to minimize network latency.
- When you are using Release in cluster mode, you must specify a shared directory to store generated reports (see the sketch after this list). You specify the location of the shared directory in the xl-release.conf file with the xl.reporting.engine.location parameter. The default value is reports, which is the path of the reports directory relative to the XL_RELEASE installation directory.
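The following is a minimal sketch of the corresponding xl-release.conf override. The shared mount /mnt/xlr-reports is a hypothetical example; replace it with a directory that every node in the cluster can read and write.

xl {
  reporting {
    engine {
      # Hypothetical shared mount for generated reports; must be accessible from all nodes
      location = "/mnt/xlr-reports"
    }
  }
}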
Setup Procedure
The initial cluster setup is:
- A load balancer
- A database server
- Three Release servers
Important: It is recommended to use a multi-node setup with an odd number of nodes to provide fault tolerance in production environments. It is also recommended not to exceed five nodes, to prevent database latency issues. With some database configuration tuning, however, a larger cluster is possible. Contact Digital.ai Support for more information about setting up a cluster with more than five nodes.
To set up the cluster, perform the following configuration steps before starting Release.
Step 1 - Set Up External Databases
- See Configure the Release SQL repository in a database.
- If you are upgrading Release on a new cluster, set up the database server in the new cluster, back up the database from the existing Release cluster, and restore it on the new database server.
Both the xlrelease repository and the reporting archive must be configured in an external database.
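As an illustration, the database section of xl-release.conf might look like the following sketch. It assumes PostgreSQL, and the host names, database names, and credentials are hypothetical; see the database configuration topic for the exact properties that apply to your setup.

xl {
  database {
    # Repository database (hypothetical PostgreSQL instance and credentials)
    db-driver-classname = "org.postgresql.Driver"
    db-url = "jdbc:postgresql://db.example.com:5432/xlrelease"
    db-username = "xlrelease"
    db-password = "secret"
  }
  reporting {
    # Reporting archive database (hypothetical)
    db-driver-classname = "org.postgresql.Driver"
    db-url = "jdbc:postgresql://db.example.com:5432/xlarchive"
    db-username = "xlarchive"
    db-password = "secret"
  }
}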
Note: You can set up PostgreSQL streaming replication for Release to create a high availability (HA) cluster configuration with one or more standby servers ready to take over operations if the primary server fails. For more information, see PostgreSQL Streaming Replication in Release.
Step 2 - Set Up the Cluster in the Release Application Configuration File
All Active/Active configuration settings are specified in the XL_RELEASE_SERVER_HOME/conf/xl-release.conf file, which uses the HOCON format.
- Enable clustering by setting xl.cluster.mode to full (active/active).
- Define ports for the different types of incoming TCP connections in the xl.cluster.node section:
Parameter | Description
---|---
xl.cluster.mode | Possible values: default (single node, no cluster) or full (active/active). Set this property to full to turn on cluster mode.
xl.cluster.name | A label to identify the cluster.
xl.cluster.node.id | Unique ID that identifies this node in the cluster.
xl.cluster.node.hostname | IP address or host name of the machine where the node is running. Do not use a loopback address such as 127.0.0.1 or localhost.
xl.cluster.node.clusterPort | Port used for cluster-wide communications; defaults to 5531.
xl.queue.embedded | Possible values: true or false. Set this to false if you want to use the webhooks feature.
Sample configuration
This is an example of the xl-release.conf configuration for an active/active setup:
xl {
cluster {
mode = full
name = "xlr-cluster"
node {
clusterPort = 5531
hostname = "xlrelease-1.example.com"
id = "xlrelease-1"
}
}
database {
...
}
queue {
embedded = false
...
}
}
Note: If you are upgrading Release, you can use the existing Release cluster's xl-release.conf file. Copy the existing xl-release.conf file to the new server and update it with any changes to the cluster name (xl.cluster.name), host name (xl.cluster.node.hostname), and so on.
Important: If you want to use the webhooks feature in a High Availability (cluster mode) setup, the JMS queue cannot be embedded. It must be external and shared by all nodes in the Release cluster.
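For example, with an external broker, the xl.queue section might look like the following sketch. The RabbitMQ JMS connection factory, broker URL, and credentials shown here are assumptions for illustration; see the Webhooks documentation for the settings that apply to your message broker.

xl {
  queue {
    # Disable the embedded queue and point all nodes at a shared external broker
    embedded = false
    external {
      # Hypothetical RabbitMQ JMS settings; adjust for your broker
      jms-driver-classname = "com.rabbitmq.jms.admin.RMQConnectionFactory"
      jms-url = "amqp://rabbitmq.example.com:5672"
      jms-username = "xlrelease"
      jms-password = "secret"
    }
  }
}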
Step 3 - Set Up the First Node
- Open a command prompt and run the following server setup command:
Fresh Installation
./bin/run.sh -setup
Upgrade
./bin/run.sh -setup -previous-installation XL_RELEASE_SERVER_HOME_EXISTING
- Follow the on-screen instructions.
Step 4 - Prepare Another Node in the Cluster
- Zip the contents of the XL_RELEASE_SERVER_HOME/ folder from the first node.
- Copy the ZIP file to another node and unzip it.
- Edit the xl.cluster.node section of the XL_RELEASE_SERVER_HOME/conf/xl-release.conf file.
- Update the values for the specific node (see the sketch after this list).
You do not need to run the server setup command on each node.
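For example, on a second node only the node-specific values change; the host name and node ID below are hypothetical:

xl {
  cluster {
    mode = full
    name = "xlr-cluster"
    node {
      clusterPort = 5531
      # Values unique to this node (hypothetical host name and ID)
      hostname = "xlrelease-2.example.com"
      id = "xlrelease-2"
    }
  }
}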
Step 5 - Set Up the Load Balancer
When running in cluster mode, you must configure a load balancer to route the requests to the available servers.
The load balancer checks the /ha/health endpoint with a HEAD or GET request to verify that the node is up. This endpoint will return:
- A 200 OK HTTP status code if it is the currently active node
This is a sample haproxy.cfg configuration for HAProxy. Ensure that your configuration is hardened before using it in a production environment.
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
log-send-hostname
maxconn 4096
pidfile /var/run/haproxy.pid
user haproxy
group haproxy
daemon
stats socket /var/run/haproxy.stats level admin
ssl-default-bind-options no-sslv3
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES128-SHA:AES256-GCM-SHA384:AES256-SHA256:AES256-SHA:DHE-DSS-AES128-SHA
defaults
balance roundrobin
log global
mode http
option redispatch
option httplog
option dontlognull
option forwardfor
timeout connect 5000
timeout client 50000
timeout server 50000
listen stats
bind :1936
mode http
stats enable
timeout connect 10s
timeout client 1m
timeout server 1m
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
stats auth stats:stats
frontend default_port_80
bind :80
reqadd X-Forwarded-Proto:\ http
maxconn 4096
default_backend default_service
backend default_service
cookie JSESSIONID prefix
option httpchk HEAD /ha/health HTTP/1.0
server node_1 node_1:5516 cookie node_1 check inter 2000 rise 2 fall 3
server node_2 node_2:5516 cookie node_2 check inter 2000 rise 2 fall 3
Previous versions of Release included a sample haproxy.cfg configuration for HAProxy that had the following line:
option httpchk head /ha/health HTTP/1.0
Starting with version 9.5, the head value is case-sensitive and must be expressed as HEAD.
Limitation on HTTP Session Sharing and Resiliency in Cluster Setups
Release does not share HTTP sessions among nodes. If the active Release node becomes unavailable:
- All users will effectively be logged out and will lose any data that was not yet saved to the database.
- Any script tasks that were running on the previously active node will have the failed status. When a new node becomes the active node, which happens automatically, you can restart the failed tasks.
Performing a TCP check or a GET operation on / will indicate only that a node is running.
Step 6 - Start the Nodes
Beginning with the first node that you configured, start Release on each node. Ensure that each node is fully up and running before starting the next one.
Advanced Configuration
Network Split Resolution
In the case of a network split, the Release cluster has a default strategy configured to prevent the original cluster from breaking up into multiple independent cluster partitions.
The default configured strategy is the MajorityLeaderAutoDowningProvider.
This auto-downing strategy shuts down every cluster partition that is in the minority (partition size < cluster size / 2). For example, if a five-node cluster splits into partitions of three and two nodes, the two-node partition shuts down.
When the cluster is split into two equal halves (partition size == cluster size / 2), the partition containing the oldest active cluster member survives. If no partition contains a sufficient number of members, a quorum cannot be reached and the whole cluster is shut down. If this occurs, an external restart of the cluster is required.
An alternative strategy, available by default, is the OldestLeaderAutoDowningProvider. You can activate this strategy in the XL_RELEASE_SERVER_HOME/conf/xl-release.conf file by specifying:
xl {
cluster {
akka {
cluster {
downing-provider-class = "com.xebialabs.xlplatform.cluster.full.downing.OldestLeaderAutoDowningProvider"
}
}
}
...
}
This strategy keeps the partition with the oldest active node alive. It is suitable for a Release cluster that needs to stay up as long as possible, regardless of the number of members in each partition.