Version: Deploy 22.3

Set up an Active-Active Cluster

This topic describes how to set up an active-active cluster for Deploy with multiple masters and multiple external workers.

HA Active-Active Setup

  • In an HA cluster configuration, there are two types of cluster setups: Active-Active and Active Hot Standby.
  • We recommend setting up Digital.ai Deploy in a multi-node, Active-Active cluster, because it provides real-time load balancing.
  • In a multi-node setup, you can have multiple masters and multiple workers, with the workers connected to each of the masters.
  • A master node is a Deploy server you log on to in order to access the Deploy user interface and create configuration items, such as infrastructure, environments, applications, and so on.
  • The multiple masters ensure continuous functioning in case of failover and the multiple workers ensure uninterrupted task execution and scalability.
  • Deployment tasks and Control tasks are assigned to the Deploy worker that performs the task execution. For more information, see Tasks, rules, and configuration Items.
  • The multiple masters are controlled by a load balancer.

Important: The Deploy masters and workers must have similar configuration (database, ActiveMQ, plugins, and so on). A worker cannot run the tasks assigned by the master if the configuration differs.

Requirements to Set Up an Active-Active Cluster

  • Three or more Deploy stateless master nodes
  • Three or more external Deploy worker nodes
  • An external database server, for example PostgreSQL
  • A load balancer that receives HTTP(S) traffic and forwards it to the Deploy master nodes. For more information, see HAProxy load balancer documentation.
  • ActiveMQ
  • A shared drive location to store exported CIs and reports.

Important: For high fault tolerance in production environments, we recommend a multi-node setup with an odd number of nodes. We also recommend that a cluster have no more than five nodes, to prevent database latency issues. With some database configuration tuning, however, you can have a cluster with more than five nodes. Contact Digital.ai Support for more information about setting up a cluster with more than five nodes.
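The sizing guidance above can be turned into a quick pre-flight check. The following is a minimal sketch only: check_node_count is a hypothetical helper, not part of Deploy, and the thresholds simply restate the recommendations in this topic.

```shell
# Hypothetical helper: succeeds only for an odd node count between 3 and 5.
check_node_count() {
  count=$1
  if [ "$count" -lt 3 ]; then
    echo "too few nodes: $count (three or more are required)"
    return 1
  fi
  if [ $((count % 2)) -eq 0 ]; then
    echo "even node count: $count (an odd number is recommended)"
    return 1
  fi
  if [ "$count" -gt 5 ]; then
    echo "more than five nodes: $count (needs database tuning, contact Support)"
    return 1
  fi
  echo "ok: $count nodes"
}
```

For example, check_node_count 3 succeeds, while check_node_count 4 reports the even count and returns a non-zero status.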


About the Setup

Let us consider the following multi-node Deploy setup for illustrative purposes:

  • Three Deploy masters (Master 1, Master 2, and Master 3)
  • Three workers (Worker 1, Worker 2, and Worker 3)
  • One load balancer
  • One ActiveMQ server (JMS broker)
  • One database server—PostgreSQL

Step 1: Download and extract the latest Deploy application to Master 1 Server

  1. Create a folder for the installation, from which you will run the installation tasks. This will be the root directory.
  2. Download the Deploy ZIP package from the Deploy Software Distribution site (requires customer login).
  3. Extract the ZIP package into the root directory.

Step 2: Install the Deploy license

  1. Download the Deploy license file from the Deploy Software Distribution site.
  2. Copy your license file to the conf directory of the root directory.

Step 3: Configure the external Postgres database node

You must set up a database for Deploy to store its data. Create and configure an empty database before you start the Deploy installation. During the installation, Deploy creates the database schema in the database you created. Optionally, you can use separate databases for operational and reporting data. For more information about creating two separate databases, see Separate Databases for Reporting and Repository in Deploy.

Use an industrial-grade external database server, for example PostgreSQL, for production use. For more information, see Configure the Database and Artifacts Repository.

For more information about how to create a Postgres database, see the PostgreSQL documentation.

Step 4: Configure the Postgres JDBC driver

  1. Download the Postgres JDBC driver. See PostgreSQL JDBC driver.
  2. Copy the JAR file to the lib folder of the root directory.

Step 5: Update the deploy-repository.yaml file with the external database details

  1. In the root folder, go to centralConfiguration folder and open the deploy-repository.yaml file.
  2. Configure the parameters to point to the database schema as shown in the following sample configuration:
    xl:
      repository:
        database:
          db-driver-class-name: "org.postgresql.Driver"
          db-password: "samplepassword"
          db-url: "jdbc:postgresql://<IP address of the external database server>/postgres"
          db-username: "sample-user"
          max-pool-size: 10
        artifacts.root: "build/artifacts"

Step 6: Download the ActiveMQ client

  1. Download the ActiveMQ client (the client library for the JMS message broker).
  2. Add the JMS JAR file (org.apache.activemq:activemq-client) to the lib folder of the root directory.
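Steps 4 and 6 both place a JAR file in the lib folder, so it can be useful to confirm that both are present before you continue. The following is a minimal sketch: check_required_jars is a hypothetical helper, and the file-name patterns (postgresql*.jar and activemq-client*.jar) are assumptions about how the downloaded files are named.

```shell
# Hypothetical helper: report whether the JDBC driver and the ActiveMQ client
# JARs are present in <root>/lib. The file-name patterns are assumptions.
check_required_jars() {
  lib_dir="$1/lib"
  missing=0
  for pattern in 'postgresql*.jar' 'activemq-client*.jar'; do
    # ls fails when the glob matches no file
    if ls "$lib_dir"/$pattern >/dev/null 2>&1; then
      echo "found: $pattern"
    else
      echo "missing: $pattern"
      missing=1
    fi
  done
  return "$missing"
}
```

Call it with the root directory as the only argument, for example check_required_jars /opt/deploy, where /opt/deploy is a placeholder path.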

Step 7: Set up the ActiveMQ node

  1. Run ActiveMQ in a Docker container.

  2. From the centralConfiguration folder of the root directory, open the deploy-task.yaml file.

  3. Set the in-process-worker parameter to false.

  4. Update the JMS details as shown in the sample configuration:

    deploy:
      queue:
        external:
          jms-driver-classname: org.apache.activemq.ActiveMQConnectionFactory
          jms-url: tcp://<IP address of the ActiveMQ server node>:61616
          jms-username: admin
          jms-password: admin
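Because every master and worker must be able to reach the broker, it can help to extract the host and port from the jms-url value and test connectivity from each node. The following is a minimal sketch: jms_host_port is a hypothetical helper that handles only the tcp://host:port form shown in the sample configuration.

```shell
# Hypothetical helper: split a tcp://host:port broker URL into "host port".
jms_host_port() {
  url=$1
  case "$url" in
    tcp://*:*) ;;
    *) echo "unsupported JMS URL: $url" >&2; return 1 ;;
  esac
  hostport=${url#tcp://}
  # Everything before the first colon is the host, after the last is the port
  echo "${hostport%%:*} ${hostport##*:}"
}
```

The two values can then be passed to whatever connectivity tool is available on the node, for example a port probe with nc, if it is installed.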

Step 8: Set up the Load Balancer node

  1. To install HAProxy (the load balancer), run:
    yum install -y haproxy
  2. To enable HAProxy, run:
    systemctl enable haproxy
  3. To start HAProxy, run:
    systemctl start haproxy
  4. To check the status, run:
    systemctl status haproxy

Step 9: Update the hostname of the Master 1 node

  1. From the conf folder of the root directory, open the deployit.conf file.
  2. Update the server.hostname parameter with the hostname of the master server.
  3. Update the deploy.cluster.node.port parameter with the port number. For example, deploy.cluster.node.port=25520 (25520 is the default port number).
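If you prefer to script these edits, for example to repeat them on every master, the two parameters can be set with a small helper. The following is a minimal sketch: set_conf_property is a hypothetical function, and it assumes that deployit.conf uses plain key=value lines with no spaces around the equals sign.

```shell
# Hypothetical helper: set key=value in a properties-style file.
# Replaces the line if the key exists, otherwise appends it.
set_conf_property() {
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "="; found = 0 }
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # write back in place, keeping file permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

A typical call would be set_conf_property conf/deployit.conf server.hostname master1.example.com, where the hostname is a placeholder. Back up the file first, because the helper rewrites it in place.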

Step 10: Update the deploy-cluster.yaml file with the cluster mode

Digital.ai Deploy 22.3 brings you the following changes related to the high availability (HA) cluster setup:

  • You can no longer set mode to default in the deploy-cluster.yaml file and have Digital.ai Deploy run in HA active-active cluster mode.
  • You must set mode to full in the deploy-cluster.yaml file if you want to run Deploy in HA active-active cluster mode.

Here's an example deploy-cluster.yaml configuration file with mode set to full.

deploy:
  cluster:
    akka:
      actor:
        loggers:
        - akka.event.slf4j.Slf4jLogger
        loglevel: INFO
        provider: akka.cluster.ClusterActorRefProvider
      cluster:
        auto-down-unreachable-after: 15s
        custom-downing:
          down-removal-margin: 10s
          stable-after: 10s
        downing-provider-class: ''
    membership:
      heartbeat: 10 seconds
      jdbc:
        connection-timeout: 30 seconds
        idle-timeout: 10 minutes
        leak-connection-threshold: 15 seconds
        max-life-time: 30 minutes
        max-pool-size: 1
        minimum-idle: 1
        password: '{cipher}gfdqswdksahgdksahgdkas'
        pool-name: ClusterPool
        url: ''
        username: ''
      ttl: 60 seconds
    mode: full
    name: xld-active-cluster

  • The deploy.cluster.akka.remote object has been removed from the deploy-cluster.yaml file.

Provide database access to register active nodes to a membership table by adding a cluster.membership configuration containing the following keys:

Parameter | Description
jdbc.url | JDBC URL that describes the database connection details; for example, "jdbc:oracle:thin:@oracle.hostname.com:1521:SID".
jdbc.username | User name to use when logging into the database.
jdbc.password | Password to use when logging into the database. After you complete the setup, the password will be encrypted and stored in a secured format.
jdbc.leak-connection-threshold | This property controls the amount of time that a connection can be out of the pool before a message is logged indicating a possible connection leak. A value of 0 means leak detection is disabled. The lowest acceptable value for enabling leak detection is 2 seconds. Increase the leak-connection-threshold to 2 or 3 minutes if the connection fails due to JDBC connection leaks.

You can set up Deploy to reuse the same database URL, username, and password for both the cluster membership information and for the repository configuration as set in the deploy-repository.yaml file.

Here's an example deploy-cluster.yaml file:

deploy.cluster:
  mode: full
  membership:
    jdbc:
      connection-timeout: 30 seconds
      idle-timeout: 10 minutes
      leak-connection-threshold: 2 minutes
      max-life-time: 30 minutes
      max-pool-size: 1
      minimum-idle: 1
      password: '{cipher}dxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
      pool-name: ClusterPool
      url: "jdbc:mysql://db/xldrepo?useSSL=false"
      username: <my_username>
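Because Deploy 22.3 no longer runs in active-active mode when mode is set to default, a quick check of the file before startup can catch a leftover value. The following is a minimal sketch: cluster_mode and require_full_mode are hypothetical helpers that do a plain text match on the first mode: line, not real YAML parsing, and they assume the value is unquoted.

```shell
# Hypothetical helper: print the value of the first "mode:" key in the file.
cluster_mode() {
  sed -n 's/^[[:space:]]*mode:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: fail unless the cluster mode is "full".
require_full_mode() {
  mode=$(cluster_mode "$1")
  if [ "$mode" = "full" ]; then
    echo "cluster mode is full"
  else
    echo "cluster mode is '$mode', expected 'full'"
    return 1
  fi
}
```

For example, require_full_mode centralConfiguration/deploy-cluster.yaml returns a non-zero status when the file still contains mode: default.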

Step 10: Start the Master 1 node

  1. Start the Master 1 node by running the following command:

    On Unix systems:

    bin/run.sh -setup

    On Windows:

    bin\run.cmd -setup
  2. Follow the on-screen instructions.

Note: You do not need to run the server setup command for each master node.

Step 11: Copy the configuration to all the master nodes in the cluster

  1. Compress the configuration you created in the root directory of the Master 1 node into a ZIP file.
  2. Copy the ZIP file to all the other master nodes and extract it on each of them.

Step 12: Run Deploy as a Service

  1. After you confirm that Deploy runs without issues on Master 1, install Deploy as a service by running the following command:

    On Unix systems:

    bin/install-service.sh

    On Windows:

    bin\install-service.cmd
  2. Verify the logs by running the following command:
    cat log/deployit.log

Step 13: Configure the load balancer

  1. In the load balancer server, go to the /etc/haproxy/haproxy.cfg file, and add the following:

    global
        log 127.0.0.1 local0
        log 127.0.0.1 local1 notice
        log-send-hostname
        maxconn 4096
        pidfile /var/run/haproxy.pid
        user haproxy
        group haproxy
        daemon
        stats socket /var/run/haproxy.stats level admin
        ssl-default-bind-options no-sslv3
        ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:DHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA256:AES256-GCM-SHA384:AES256-SHA256

    defaults
        # Choose one algorithm: "balance roundrobin" or "balance source"
        balance roundrobin
        log global
        mode http
        option redispatch
        option httplog
        option dontlognull
        option forwardfor
        timeout connect 5000
        timeout client 50000
        timeout server 50000

    listen stats
        bind :1936
        mode http
        stats enable
        timeout connect 10s
        timeout client 1m
        timeout server 1m
        stats hide-version
        stats realm Haproxy\ Statistics
        stats uri /
        stats auth stats:stats

    frontend load_balancer
        bind :80
        # From HAProxy 2.1x, replace the reqadd line with:
        #   http-request add-header X-Forwarded-Proto http
        reqadd X-Forwarded-Proto:\ http
        maxconn 4096
        default_backend default_service

    backend default_service
        option httpchk HEAD /deployit/ha/health HTTP/1.0
        server <Hostname of the Master 1 node> MASTER1_IP:4516 check inter 2000 rise 2 fall 3
        server <Hostname of the Master 2 node> MASTER2_IP:4516 check inter 2000 rise 2 fall 3
        server <Hostname of the Master 3 node> MASTER3_IP:4516 check inter 2000 rise 2 fall 3

Note: The bind key defines the port number that is allotted to the load balancer node. Make sure you update the hostname and IP addresses for the master nodes.
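Before you rely on the load balancer, you can query the same health endpoint that the backend section polls, directly on each master. The following is a minimal sketch: health_url and master_status are hypothetical helpers, the default port 4516 is taken from the backend lines above, and curl is assumed to be installed.

```shell
# Hypothetical helper: build the health-check URL polled by the HAProxy backend.
health_url() {
  echo "http://$1:${2:-4516}/deployit/ha/health"
}

# Hypothetical helper: print the HTTP status code returned by one master.
# Sends a HEAD request with a 5-second limit; curl typically prints 000
# when the host cannot be reached.
master_status() {
  curl -s -o /dev/null -I -m 5 -w '%{http_code}' "$(health_url "$1" "$2")"
}
```

For example, master_status MASTER1_IP prints the status code for the first master. By default, HAProxy treats 2xx and 3xx responses to an httpchk request as healthy. You can also validate the configuration file itself with haproxy -c -f /etc/haproxy/haproxy.cfg before restarting the service.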

Step 14: Run other master nodes in the cluster

  1. On each of the other master nodes, update the server.hostname parameter in the deployit.conf file with the hostname or IP address of that node. See Step 9.
  2. Repeat Step 12: Run Deploy as a Service on all the master nodes.

Step 15: Configure the worker

  1. Download the Deploy Task Engine from the Deploy Software Distribution site (requires customer login).
  2. Extract the ZIP file to a worker node. This will be one of the Deploy workers.
  3. Remove the existing plugins directory from the Deploy Task Engine folder to avoid mismatch with the plugins you will copy from the Deploy Master in the next step.
  4. Copy the deployit.conf from the XL_DEPLOY_SERVER/conf/ directory to the DEPLOY_TASK_ENGINE/conf directory. Note: This step is not required if you have installed Central Configuration as a standalone service.

Step 16: Synchronize the master and worker nodes

  1. Synchronize the Deploy master and worker by copying the following artifacts from the Master 1 node to the Deploy Task Engine folder:

    • hotfix/plugins
    • ext
    • plugins/__local__
    • plugins/xld_official

    Note: The worker uses the rest of the configuration from the CentralConfiguration directory (Deploy Master or a separate Central Configuration server if configured).
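The copy in this step can be scripted so that it is repeatable for every worker. The following is a minimal sketch: sync_worker_dirs is a hypothetical helper that assumes both the Master 1 root directory and the Deploy Task Engine folder are reachable as local paths (for example through a mounted share). For remote hosts, a tool such as scp or rsync would be needed instead.

```shell
# Hypothetical helper: copy the four synchronized directories from the
# master root directory to the Deploy Task Engine folder.
sync_worker_dirs() {
  master_root=$1
  worker_root=$2
  for dir in hotfix/plugins ext plugins/__local__ plugins/xld_official; do
    if [ -d "$master_root/$dir" ]; then
      mkdir -p "$worker_root/$dir"
      # "dir/." copies the directory contents, including hidden files
      cp -R "$master_root/$dir/." "$worker_root/$dir/"
      echo "copied: $dir"
    else
      echo "skipped (not found on master): $dir"
    fi
  done
}
```

Call it with the two paths, for example sync_worker_dirs /opt/deploy /opt/deploy-task-engine, where both paths are placeholders. Existing files in the target directories are overwritten but not deleted, so stale plugins on the worker must still be removed by hand.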

Step 17: Add drivers to the lib folder

Add the database and JMS broker drivers to the lib folder of the Deploy worker folder.

Step 18: Start the Deploy Task Engine

  1. Start the Deploy Task Engine by running the command shown in the following example:

    On Unix systems:

    DEPLOY_TASK_ENGINE_HOME/bin/run.sh -api http://xld-master-host:4516/ -master xld-master-host:8180 -port 8182 -name worker1

    On Windows:

    DEPLOY_TASK_ENGINE_HOME\bin\run.cmd -api http://xld-master-host:4516/ -master xld-master-host:8180 -port 8184 -name worker1
  2. Run the above command on all the worker nodes.
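When you start several workers, it can help to generate the command for each one so that every worker gets a distinct name. The following is a minimal sketch: worker_command is a hypothetical helper that only prints the Unix command from the example above with the master host, worker name, and worker port filled in. It does not start anything.

```shell
# Hypothetical helper: print the Unix start command for one worker.
# Ports 4516 and 8180 are copied from the example above.
worker_command() {
  master_host=$1
  worker_name=$2
  worker_port=$3
  echo "DEPLOY_TASK_ENGINE_HOME/bin/run.sh -api http://$master_host:4516/ -master $master_host:8180 -port $worker_port -name $worker_name"
}

# Preview the commands for three workers
for i in 1 2 3; do
  worker_command xld-master-host "worker$i" 8182
done
```

Review the printed lines, replace DEPLOY_TASK_ENGINE_HOME with the actual installation path, and run each command on its own worker node.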

Step 19: Verify the worker status in Deploy interface

  1. Log in to Deploy.
  2. Click Monitoring > Workers.
  3. In the Worker Overview window, verify the task-engine and in-process worker states.
    The Deploy Task Engine (task-engine) will be in Connected state, while the In-process-worker will be in Disconnected state.