Introduction

Many automation platforms start with a single-node architecture. One process serves the UI, executes jobs, manages schedules, and handles real-time updates. While this setup is simple, it also creates a single point of failure. If that process stops, automation stops as well.

For organizations running large-scale infrastructure automation — especially in DevOps or platform engineering environments — downtime of the automation system can become a serious operational risk.

This is where Active-Active High Availability (HA) comes in. Semaphore UI supports an architecture where multiple instances run simultaneously and share the workload. This ensures that automation continues even if individual nodes fail, while also enabling horizontal scaling.

The Problem with Single-Node Automation

In a standard deployment, one Semaphore UI process performs all core responsibilities:

This architecture works well for small teams or testing environments. However, it introduces a critical limitation. If the single process goes down, the entire system becomes unavailable. For production environments with many users and automated workflows, this creates risks such as:

To eliminate this bottleneck, Semaphore UI supports active-active high availability deployments.

What Is Active-Active HA in Semaphore UI?

In an active-active HA architecture, multiple identical Semaphore UI instances run simultaneously behind a load balancer. Unlike traditional failover systems, there is no primary or standby node. Every instance is fully capable of handling UI requests, API calls, scheduled jobs and task execution.

Traffic is distributed across the cluster, and if one instance fails, the others continue serving requests without interruption. This architecture improves both system reliability and execution scalability.

High-Level Architecture

A typical active-active deployment consists of several key components.

  1. Load Balancer. Users connect through a load balancer such as NGINX, HAProxy and cloud load balancers. The load balancer distributes HTTP and WebSocket traffic across available Semaphore nodes.

  2. Semaphore Nodes. Each node runs an identical instance of Semaphore UI. Any node can receive user requests, start automation jobs, process scheduled tasks, and send real-time updates. Because all nodes are equal, the system has no single point of failure at the application layer.

  3. Shared Database. All instances connect to a shared database like PostgreSQL or MySQL. The database acts as the single source of truth for persistent data, including: projects, templates, inventories, schedules, task history, user accounts and RBAC configuration.

  4. Redis Coordination Layer. Redis provides the coordination layer that allows multiple nodes to behave as a single system. It serves three important functions.

How Job Execution Works in an HA Cluster

In a multi-node Semaphore deployment, task execution follows a coordinated flow.

  1. A User Triggers a Task. A user starts a job through the UI or API. The request can land on any Semaphore instance.
  2. Task Metadata Is Stored. The receiving node writes task metadata to the database and signals work through Redis.
  3. A Node Picks the Task. One of the available nodes retrieves the task from Redis, acquires a distributed lock, and marks the task as running in the database
  4. The Task Executes. The node executes the task locally, or via distributed runners / agents. Progress and logs are continuously written back to the database.
  5. Results Are Broadcast. Task updates propagate through Redis Pub/Sub so all nodes and UI clients remain synchronized.

Horizontal Scaling with Multiple Runners

High availability also enables horizontal scaling of task execution. Instead of running jobs only on the Semaphore nodes themselves, execution can be delegated to multiple runners or agents. For large DevOps teams, this architecture supports enterprise-scale automation workloads:

Benefits of Active-Active Semaphore Deployment

Improved Reliability. If one instance fails, others continue serving traffic and executing jobs.

Zero-Downtime Maintenance. Nodes can be updated or restarted without stopping the system.

Horizontal Scalability. Additional Semaphore nodes can be added behind the load balancer to increase capacity.

No Primary Node Dependency. All nodes are equal, removing complex failover mechanisms. Consistent State Across the Cluster. Shared database storage and Redis coordination keep all instances synchronized.

Typical Use Cases

Active-active Semaphore deployments are common in environments that require continuous automation availability, such as:

Conclusion

Active-active high availability allows Semaphore UI to evolve from a single automation server into a resilient distributed system. Instead of relying on one process to handle the UI, scheduling, and task execution, multiple Semaphore instances work together behind a load balancer while sharing state through a database and Redis. The result is an automation platform that remains available even when individual nodes fail and can scale horizontally as the workload grows.

Support for active-active HA is available in the Semaphore Enterprise edition. In addition to high availability, the Enterprise version includes capabilities designed for larger teams and production environments — such as advanced RBAC (role-based access control), which allows organizations to define granular permissions across projects, teams, and environments.

If you are evaluating a high-availability automation platform for your infrastructure, you can request a trial of the Enterprise edition to test the HA architecture and see how it fits into your DevOps workflows.

Request an Enterprise Trial

FAQ

What is active-active high availability?

Active-active high availability means multiple application instances run simultaneously, and all of them can serve requests. There is no primary node — any instance can handle traffic and execute jobs.

Why does Semaphore use Redis in HA mode?

Redis acts as a coordination layer between instances. It provides distributed locks, shared task queue state, and Pub/Sub messaging to ensure nodes do not execute the same job simultaneously.

What database does Semaphore support for HA deployments?

Semaphore supports PostgreSQL and MySQL as the shared database for storing persistent system data.

What happens if one Semaphore node fails?

The load balancer simply routes traffic to the remaining nodes. Running jobs continue, and new jobs are picked up by other instances.

Can Semaphore scale horizontally?

Yes. Additional Semaphore nodes and runners can be added to increase execution capacity and handle larger automation workloads.