Version: Deploy 22.3

Set up an Active-Active Cluster

This topic describes how to set up an active-active cluster for Deploy with multiple masters and multiple external workers.

HA Active-Active Setup

An HA cluster can be set up in one of two ways: Active-Active or Active Hot Standby.
The recommended setup for Digital.ai Deploy is a multi-node, Active-Active cluster, because it provides real-time load balancing.
In a multi-node setup, you can have multiple masters and multiple workers, with the workers connected to each of the masters.
A master node is a Deploy server you log on to in order to access the Deploy user interface and create configuration items, such as infrastructure, environments, applications, and so on.
The multiple masters ensure continuous functioning in case of failover and the multiple workers ensure uninterrupted task execution and scalability.
Deployment tasks and Control tasks are assigned to the Deploy worker that performs the task execution. For more information, see Tasks, rules, and configuration Items.
The multiple masters are controlled by a load balancer.

Important: The Deploy masters and workers must have similar configuration (database, ActiveMQ, plugins, and so on). A worker cannot run the tasks assigned by the master if the configuration differs.

Requirements to Set Up an Active-Active Cluster

Three or more Deploy stateless master nodes
Three or more external Deploy worker nodes
An external database server, for example PostgreSQL
A load balancer that receives HTTP(S) traffic and forwards it to the Deploy master nodes. For more information, see the HAProxy load balancer documentation.
ActiveMQ
A shared drive location to store exported CIs and reports.

Important: For production environments, use a multi-node setup with an odd number of nodes to improve fault tolerance. Keep the cluster to five nodes or fewer to prevent database latency issues. With some database configuration tuning, you can run a cluster with more than five nodes. Contact Digital.ai Support for more information about setting up a cluster with more than five nodes.

About the Setup

Let us consider the following multi-node Deploy setup for illustrative purposes:

Three Deploy masters (Master 1, Master 2, and Master 3)
Three workers (Worker 1, Worker 2, and Worker 3)
One load balancer
One ActiveMQ server (JMS broker)
One database server—PostgreSQL

Step 1: Download and extract the latest Deploy application to Master 1 Server

Create a folder for the installation, from which you will run the installation tasks. This will be the root directory.
Download the Deploy ZIP package from the Deploy Software Distribution site (requires customer login) and extract it into the root directory.

Step 2: Install the Deploy license

Download the Deploy license file from the Deploy Software Distribution site.
Copy your license file to the conf directory of the root directory.

Step 3: Configure the external Postgres database node

You must set up a database for Deploy to store its data. Create and configure an empty database before you start the Deploy installation. During the installation, Deploy creates the database schema in the database you created. Optionally, you can use separate databases for operational data and reporting data. For more information about creating two separate databases, see Separate Databases for Reporting and Repository in Deploy.

Use an industrial-grade external database server, for example PostgreSQL, for production use. For more information, see Configure the Database and Artifacts Repository.

For more information about how to create a PostgreSQL database, see the PostgreSQL documentation.
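The following is a minimal sketch of creating the empty database, assuming PostgreSQL is already installed and `psql` can connect as a superuser. The function name `create_db_sql`, the database name `xldeploy`, the role name, and the password are illustrative placeholders, not values that Deploy requires.

```shell
#!/usr/bin/env bash
# Print the SQL that creates a login role and an empty database owned by it.
# All three arguments are hypothetical placeholder values.
create_db_sql() {
  local db_name="$1" db_user="$2" db_password="$3"
  cat <<SQL
CREATE ROLE "${db_user}" WITH LOGIN PASSWORD '${db_password}';
CREATE DATABASE "${db_name}" OWNER "${db_user}";
SQL
}

# Example (run on the database server; not executed here):
#   create_db_sql xldeploy sample-user samplepassword | sudo -u postgres psql
```

Whichever database name you choose must match the name at the end of the JDBC URL that you configure for Deploy.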

Step 4: Configure the Postgres JDBC driver

Download the Postgres JDBC driver. See PostgreSQL JDBC driver.
Copy the JAR file to the lib folder of the root directory.

Step 5: Update the `deploy-repository.yaml` file with the external database details

In the root folder, go to centralConfiguration folder and open the deploy-repository.yaml file.
Configure the parameters to point to the database schema as shown in the following sample configuration:

xl:
  repository:
    database:
      db-driver-class-name: "org.postgresql.Driver"
      db-password: "samplepassword"
      db-url: "jdbc:postgresql://<IP address of the external database server>/postgres"
      db-username: "sample-user"
      max-pool-size: 10
    artifacts.root: "build/artifacts"
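Sample configurations such as the one above contain `<...>` placeholders that must be replaced with real values. As a sketch, a small helper (the name `has_placeholders` is hypothetical) can flag a file that still contains one before you start Deploy:

```shell
#!/usr/bin/env bash
# Return success (0) if the file still contains an unreplaced <...> placeholder.
has_placeholders() {
  grep -Eq '<[^>]+>' "$1"
}

# Example use from the root directory (not executed here):
#   if has_placeholders centralConfiguration/deploy-repository.yaml; then
#     echo "Replace the <...> placeholders before starting Deploy" >&2
#   fi
```

This is a plain-text scan, so it would also flag any legitimate value that happens to contain angle brackets.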

Step 6: Download the ActiveMQ client

Download the ActiveMQ client library for the JMS message broker.
Add the JMS JAR file (org.apache.activemq:activemq-client) to the lib folder of the root directory.

Step 7: Set up the ActiveMQ node

Run ActiveMQ in a Docker container.
From the centralConfiguration folder of the root directory, open the deploy-task.yaml file.
Set the in-process-worker parameter to false.
Update the JMS details as shown in the sample configuration:

deploy:
  task:
    in-process-worker: false
    queue:
      external:
        jms-driver-classname: org.apache.activemq.ActiveMQConnectionFactory
        jms-url: tcp://<IP address of the ActiveMQ server node>:61616
        jms-username: admin
        jms-password: admin
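The step above does not name a container image. The sketch below assumes the `apache/activemq-classic` image from Docker Hub and the broker port 61616 used in the sample configuration; the helper names `start_activemq` and `port_open` are hypothetical.

```shell
#!/usr/bin/env bash
# Start an ActiveMQ broker container. The image name is an assumption;
# use whichever ActiveMQ image your organization supports.
start_activemq() {
  docker run -d --name activemq -p 61616:61616 apache/activemq-classic
}

# Return success if a TCP connection to host:port opens within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example (not executed here):
#   start_activemq
#   port_open <IP address of the ActiveMQ server node> 61616 && echo "broker reachable"
```

Running the port check from a master node confirms that the address in `jms-url` is reachable before Deploy starts.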

Step 8: Set up the Load Balancer node

To install HAProxy (the load balancer), run:
yum install -y haproxy
To enable HAProxy, run:
systemctl enable haproxy
To start HAProxy, run:
systemctl start haproxy
To check the status, run:
systemctl status haproxy

Step 9: Update the hostname of the Master 1 node

From the conf folder of the root directory, open the deployit.conf file.
Update the server.hostname parameter with the hostname of the master server.
Update the deploy.cluster.node.port parameter with the port number. The default port number is 25520, for example: deploy.cluster.node.port=25520
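If you prefer to script these two changes, the following sketch sets a key in a properties-style file. It assumes that deployit.conf uses plain `key=value` lines; the function name `set_property` and the host name in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Set key=value in a properties file: replace the line if the key exists,
# otherwise append it. Assumes plain "key=value" lines.
set_property() {
  local file="$1" key="$2" value="$3"
  local escaped_key="${key//./\\.}"   # escape dots so sed and grep match them literally
  if grep -q "^${escaped_key}=" "$file"; then
    sed -i.bak "s|^${escaped_key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example use from the root directory (not executed here):
#   set_property conf/deployit.conf server.hostname master1.example.com
#   set_property conf/deployit.conf deploy.cluster.node.port 25520
```

The `sed -i.bak` form leaves a backup copy next to the edited file, which you can delete once the values are confirmed.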

Step 10: Update the `deploy-cluster.yaml` file with the cluster mode

Digital.ai Deploy 22.3 brings you the following changes related to the high availability (HA) cluster setup:

You can no longer set mode to default in the deploy-cluster.yaml file and have Digital.ai Deploy run in HA active-active cluster mode.
You must set mode to full in the deploy-cluster.yaml file if you want to run Deploy in HA active-active cluster mode.

Here's an example deploy-cluster.yaml configuration file with mode set to full.

deploy:
  cluster:
    akka:
      actor:
        loggers:
        - akka.event.slf4j.Slf4jLogger
        loglevel: INFO
        provider: akka.cluster.ClusterActorRefProvider
      cluster:
        auto-down-unreachable-after: 15s
        custom-downing:
          down-removal-margin: 10s
          stable-after: 10s
        downing-provider-class: ''
    membership:
      heartbeat: 10 seconds
      jdbc:
        connection-timeout: 30 seconds
        idle-timeout: 10 minutes
        leak-connection-threshold: 15 seconds
        max-life-time: 30 minutes
        max-pool-size: 1
        minimum-idle: 1
        password: '{cipher}gfdqswdksahgdksahgdkas'
        pool-name: ClusterPool
        url: ''
        username: ''
      ttl: 60 seconds
    mode: full
    name: xld-active-cluster

The deploy.cluster.akka.remote object has been removed from the deploy-cluster.yaml file.

Provide database access to register active nodes to a membership table by adding a cluster.membership configuration containing the following keys:

| Parameter | Description |
| --- | --- |
| `jdbc.url` | JDBC URL that describes the database connection details; for example, `"jdbc:oracle:thin:@oracle.hostname.com:1521:SID"`. |
| `jdbc.username` | User name to use when logging into the database. |
| `jdbc.password` | Password to use when logging into the database. After you complete the setup, the password will be encrypted and stored in a secured format. |
| `jdbc.leak-connection-threshold` | The amount of time that a connection can be out of the pool before a message is logged indicating a possible connection leak. A value of 0 means leak detection is disabled. The lowest acceptable value for enabling leak detection is 2 seconds. Increase the `leak-connection-threshold` to 2 or 3 minutes if the connection fails due to JDBC connection leaks. |

You can set up Deploy to reuse the same database URL, username, and password for both the cluster membership information and for the repository configuration as set in the deploy-repository.yaml file.

Here's an example deploy-cluster.yaml file:

deploy.cluster:
  mode: full
  membership:
    jdbc:
      connection-timeout: 30 seconds
      idle-timeout: 10 minutes
      leak-connection-threshold: 2 minutes
      max-life-time: 30 minutes
      max-pool-size: 1
      minimum-idle: 1
      password: '{cipher}dxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
      pool-name: ClusterPool
      url: "jdbc:mysql://db/xldrepo?useSSL=false"
      username: <my_username>

Step 10: Start the Master 1 node

Start the Master 1 node by running the following command from the root directory:

On Unix systems:
```
bin/run.sh -setup
```
On Windows:
```
bin\run.cmd -setup
```
Follow the on-screen instructions.

Note: You do not need to run the server setup command for each master node.

Step 11: Copy the configuration to all the master nodes in the cluster

Compress the configuration you created in the root directory of the Master 1 node into a ZIP file.
Copy the ZIP file to all the other master nodes and extract it on each of them.
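The copy step can be scripted. The sketch below assumes SSH access to the other masters and that `zip`, `unzip`, `scp`, and `ssh` are available; the remote user `deploy`, the host names, and the target directory are hypothetical. With `DRY_RUN=1` the commands are printed instead of executed.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Copy an archive to each listed host and unpack it into target_dir.
distribute_config() {
  local archive="$1" target_dir="$2"; shift 2
  local host
  for host in "$@"; do
    run scp "$archive" "deploy@${host}:/tmp/"
    run ssh "deploy@${host}" "unzip -o /tmp/$(basename "$archive") -d ${target_dir}"
  done
}

# Example (not executed here), run from the Master 1 root directory:
#   zip -r /tmp/deploy-master1.zip .
#   DRY_RUN=1 distribute_config /tmp/deploy-master1.zip /opt/deploy master2 master3
```

Previewing with `DRY_RUN=1` first makes it easy to confirm the host list and target directory before anything is copied.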

Step 12: Run Deploy as a Service

After you confirm the Deploy Master 1 runs without issue, you can install Deploy as a service by running the following command:

On Unix systems:

bin/install-service.sh

On Windows:

bin\install-service.cmd

Verify the logs by running the following command:
cat log/deployit.log

Step 13: Configure the load balancer

On the load balancer server, open the /etc/haproxy/haproxy.cfg file and add the following:

global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    log-send-hostname
    maxconn 4096
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy.stats level admin
    ssl-default-bind-options no-sslv3
    ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES256-GCM-SHA384:AES256-SHA256

defaults
    # Choose one balancing algorithm: roundrobin or source
    balance roundrobin
    log global
    mode http
    option redispatch
    option httplog
    option dontlognull
    option forwardfor
    timeout connect 5000
    timeout client 50000
    timeout server 50000

listen stats
    bind :1936
    mode http
    stats enable
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /
    stats auth stats:stats

frontend load_balancer
    bind :80
    # From HAProxy 2.1 onward, replace the reqadd line with:
    #   http-request add-header X-Forwarded-Proto http
    reqadd X-Forwarded-Proto:\ http
    maxconn 4096
    default_backend default_service

backend default_service
    option httpchk HEAD /deployit/ha/health HTTP/1.0
    server <Hostname of the Master 1 node> MASTER1_IP:4516 check inter 2000 rise 2 fall 3
    server <Hostname of the Master 2 node> MASTER2_IP:4516 check inter 2000 rise 2 fall 3
    server <Hostname of the Master 3 node> MASTER3_IP:4516 check inter 2000 rise 2 fall 3

Note: The bind key defines the port number that is allotted to the load balancer node. Make sure you update the hostnames and IP addresses of the master nodes.
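After editing the file, you can probe the same health endpoint that the backend checks use. The sketch below assumes `curl` is installed and that the masters listen on port 4516 as in the configuration above; the function names `health_url` and `check_masters` are hypothetical.

```shell
#!/usr/bin/env bash
# Build the health-check URL that the HAProxy backend probes on a master node.
health_url() {
  printf 'http://%s:4516/deployit/ha/health\n' "$1"
}

# Print the HTTP status code returned by each master's health endpoint.
check_masters() {
  local host
  for host in "$@"; do
    printf '%s %s\n' "$host" \
      "$(curl -s -o /dev/null -I -w '%{http_code}' --max-time 5 "$(health_url "$host")")"
  done
}

# Example (not executed here):
#   haproxy -c -f /etc/haproxy/haproxy.cfg   # validate the configuration file
#   systemctl restart haproxy
#   check_masters MASTER1_IP MASTER2_IP MASTER3_IP
```

The `-I` option makes `curl` send a HEAD request, which matches the `option httpchk HEAD` line in the backend section.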

Step 14: Run other master nodes in the cluster

On each of the other master nodes, update the server.hostname parameter in the deployit.conf file with the hostname or IP address of that node. See Step 9.
Repeat Step 12: Run Deploy as a Service on all the master nodes.

Step 15: Configure the worker

Download the Deploy Task Engine from the Deploy Software Distribution site (requires customer login).
Extract the zip file to a worker node. This will be one of the Deploy workers.
Remove the existing plugins directory from the Deploy Task Engine folder to avoid mismatch with the plugins you will copy from the Deploy Master in the next step.
Copy the deployit.conf from the XL_DEPLOY_SERVER/conf/ directory to the DEPLOY_TASK_ENGINE/conf directory. Note: This step is not required if you have installed Central Configuration as a standalone service.

Step 16: Synchronize the master and worker nodes

Synchronize the Deploy master and worker by copying the following artifacts from the Master 1 node to the Deploy Task Engine folder:
- hotfix/plugins
- ext
- plugins/__local__
- plugins/xld_official
Note: The worker uses the rest of the configuration from the centralConfiguration directory (on the Deploy master, or on a separate Central Configuration server if one is configured).
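The copy can be sketched as a small function that takes two local directory paths. Transferring files between hosts (for example with scp or rsync) is left out, and the function name `sync_worker` is hypothetical.

```shell
#!/usr/bin/env bash
# Copy the plugin and extension directories from a master installation to a
# Deploy Task Engine installation. Existing copies on the worker are replaced.
sync_worker() {
  local master_home="$1" worker_home="$2" path
  for path in hotfix/plugins ext plugins/__local__ plugins/xld_official; do
    [ -d "${master_home}/${path}" ] || continue   # skip directories the master lacks
    mkdir -p "${worker_home}/$(dirname "$path")"
    rm -rf "${worker_home:?}/${path}"             # :? aborts if worker_home is empty
    cp -R "${master_home}/${path}" "${worker_home}/${path}"
  done
}

# Example (not executed here):
#   sync_worker /opt/deploy-master /opt/deploy-task-engine
```

Replacing each directory rather than merging into it avoids leaving stale plugin versions on the worker.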

Step 17: Add drivers to the `lib` folder

Add the database and JMS broker drivers to the lib folder of the Deploy worker.

Step 18: Start the Deploy Task Engine

Start the Deploy Task Engine by running the command shown in the following example:

On Unix systems:

DEPLOY_TASK_ENGINE_HOME/bin/run.sh -api http://xld-master-host:4516/ -master xld-master-host:8180 -port 8182 -name worker1

On Windows:

DEPLOY_TASK_ENGINE_HOME\bin\run.cmd -api http://xld-master-host:4516/ -master xld-master-host:8180 -port 8184 -name worker1

Run the above command on all the worker nodes.
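When you start several workers, it can help to generate the command line for each one. The sketch below only prints commands using the flags from the Unix example above; giving each worker its own name is an assumption made for illustration, and the function name `worker_command` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the start command for one worker.
# The host name, port, and installation path are placeholder values.
worker_command() {
  local master_host="$1" worker_port="$2" worker_name="$3"
  printf '%s -api http://%s:4516/ -master %s:8180 -port %s -name %s\n' \
    "DEPLOY_TASK_ENGINE_HOME/bin/run.sh" "$master_host" "$master_host" \
    "$worker_port" "$worker_name"
}

# Example: print the command for three workers.
for i in 1 2 3; do
  worker_command xld-master-host 8182 "worker${i}"
done
```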

Step 19: Verify the worker status in Deploy interface

Log in to Deploy.
Click Monitoring > Workers.
In the Worker Overview window, verify the task-engine and in-process worker states.
The Deploy Task Engine (task-engine) will be in Connected state, while the In-process-worker will be in Disconnected state.
