3-node Galera Cluster for RDEM Systems PKIaaS

High-availability MariaDB architecture for PKI infrastructure. Synchronous multi-master replication, automatic quorum, fault tolerance. Sovereign hosting in France, CheckMK monitoring.

Cluster technical architecture

5 VMs, 3 datacenters, dedicated VXLAN, daily backup. An infrastructure designed for high availability and resilience.

3 Galera nodes + 2 PKI application nodes

5 VMs in total: 3 MariaDB Galera nodes for synchronous replication and 2 application nodes hosting the PKI service. Topology designed to guarantee quorum and automatic failover.

Dedicated VXLAN network layer

Isolated VXLAN dedicated to MariaDB communications between the 3 Galera nodes and the 2 PKI nodes. No interference with other services, encrypted traffic, dedicated bandwidth.

Proxmox hyperconverged infrastructure

All VMs are hosted on our Proxmox VE clusters, one cluster per datacenter, with the 3 Galera nodes spread across the 3 clusters. Live migration, automatic HA, shared storage, centralized management.

VPS XLarge: 8 vCPUs, 32 GB RAM, 400 GB SSD

High-performance machines (VPS XLarge spec) for optimal MariaDB performance: enough RAM to keep working set in memory, fast SSD I/O, multi-core vCPUs for parallel workload.

Backup via Nimbus (Double Drive PBS)

Daily backups of the cluster via Nimbus, RDEM Systems' backup solution based on Proxmox Backup Server. Retention across 2 geographically distributed sites, deduplication, encryption.

Debian 13 Trixie + MariaDB 10.11 LTS

Debian 13 Trixie as the base OS for long-term stability. MariaDB 10.11 (LTS release until 2028) installed via the official MariaDB repository, not Debian packages.

Why Galera Cluster for a PKI?

A PKI (Public Key Infrastructure) is critical: it issues and manages the X.509 certificates that secure your TLS communications, VPNs and authentication. If the PKI database goes down, certificates can no longer be issued or revoked, and services that depend on these certificates can no longer be provisioned.

Galera Cluster provides the necessary high availability: synchronous multi-master replication across 3 nodes, automatic fault tolerance, built-in quorum to prevent split-brain.

With this architecture, RDEM Systems PKI can continue to operate even if an entire datacenter is offline. The 2 other Galera nodes maintain quorum and the service remains available.

Synchronous multi-master replication

Each node can handle reads and writes. The wsrep protocol replicates each transaction's write-set to all nodes and certifies it before the commit is acknowledged. No asynchronous replication lag: every node holds the same committed data.

Automatic quorum & split-brain prevention

Galera requires a majority of nodes (quorum) to operate. With 3 nodes, the cluster tolerates 1 failure. If the network is partitioned, only the partition with the majority continues to serve.
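The majority rule can be sketched in a few lines of Python. This is a simplified illustration, not Galera's actual implementation: it assumes every node has a weight of 1, uses hypothetical node names, and ignores nodes that leave the cluster gracefully.

```python
def has_quorum(partition_weights, previous_weights):
    """Return True if a partition holds a strict majority of the total
    weight of the last known primary component."""
    return 2 * sum(partition_weights) > sum(previous_weights)


# Three nodes with equal weight (node names are hypothetical).
weights = {"galera-1": 1, "galera-2": 1, "galera-3": 1}


def partition_keeps_serving(alive_nodes):
    """Check whether the nodes that can still see each other keep quorum."""
    return has_quorum([weights[n] for n in alive_nodes], weights.values())


print(partition_keeps_serving(["galera-1", "galera-2"]))  # 2 of 3: keeps serving
print(partition_keeps_serving(["galera-3"]))              # 1 of 3: stops serving
```

With an even number of nodes, a clean half/half split leaves neither side with a strict majority, which is one reason a 3-node layout is preferred.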

Zero-downtime node replacement

Nodes can be taken offline for maintenance, added or removed without service interruption. State Snapshot Transfer (SST) and Incremental State Transfer (IST) handle data synchronization.
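As a rough illustration of when a rejoining node gets an incremental transfer rather than a full snapshot, the decision can be sketched as below. This is a simplified model, not Galera's code: the function name and parameters are hypothetical, and it only assumes that the donor keeps recent write-sets in its cache down to a lowest sequence number (exposed as the wsrep_local_cached_downto status variable).

```python
def choose_state_transfer(joiner_uuid, joiner_seqno, donor_uuid, donor_cached_downto):
    """Return "IST" if the donor still caches every write-set the joiner
    is missing, otherwise "SST" (full snapshot)."""
    if joiner_uuid != donor_uuid:
        # Different cluster state UUID: histories are unrelated.
        return "SST"
    if joiner_seqno < 0:
        # A seqno of -1 means the joiner's position is unknown (e.g. after a crash).
        return "SST"
    if joiner_seqno + 1 >= donor_cached_downto:
        # The first missing write-set is still in the donor's cache.
        return "IST"
    return "SST"


# A node that was offline briefly for maintenance: only the delta is sent.
print(choose_state_transfer("abc", 1500, "abc", 1000))
# A node that was offline longer than the cache covers: full snapshot.
print(choose_state_transfer("abc", 200, "abc", 1000))
```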

Detailed technical stack

OS: Debian 13 Trixie

Debian 13 Trixie (testing at deployment time, stable on release) as the base operating system. Choice of a stable, well-maintained system with long-term support. No exotic distribution, no dependency on a commercial vendor.

MariaDB 10.11 LTS (support until 2028)

MariaDB 10.11 is a Long Term Support release with support until February 2028. Installed via the official MariaDB repository (repo.mariadb.org), not via Debian packages which may lag behind in version. wsrep configuration enabled for Galera Cluster.

Galera Cluster 4.x (wsrep provider)

Galera Cluster 4.x as wsrep provider. Certification-based synchronous replication via the wsrep API. Sized to tolerate 1 failure: 3 nodes (the wsrep_cluster_size status variable reports 3 when all nodes are joined), pc.weight=1 on each node, pc.recovery=true so the primary component can be restored automatically after an outage.
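To make these settings concrete, here is a minimal sketch that renders a per-node configuration fragment. The cluster name, node names, IP addresses and provider library path are assumptions for illustration, not the values used on the RDEM Systems cluster. Note that pc.weight and pc.recovery are passed through wsrep_provider_options, while wsrep_cluster_size is reported by the server rather than set in the file.

```python
def render_galera_cnf(node_name, node_address, cluster_addresses,
                      cluster_name="pki_galera"):
    """Render a minimal Galera configuration fragment for one node.

    All names, addresses and the provider path are illustrative assumptions.
    """
    provider_options = "pc.weight=1;pc.recovery=true"
    lines = [
        "[galera]",
        "wsrep_on=ON",
        # Typical library path on Debian; verify it on the target system.
        "wsrep_provider=/usr/lib/galera/libgalera_smm.so",
        f"wsrep_cluster_name={cluster_name}",
        f"wsrep_cluster_address=gcomm://{','.join(cluster_addresses)}",
        f"wsrep_node_name={node_name}",
        f"wsrep_node_address={node_address}",
        f'wsrep_provider_options="{provider_options}"',
        # Settings Galera requires or recommends for row-based certification.
        "binlog_format=ROW",
        "default_storage_engine=InnoDB",
        "innodb_autoinc_lock_mode=2",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical addresses on the dedicated VXLAN.
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
print(render_galera_cnf("galera-1", nodes[0], nodes))
```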

Dedicated VXLAN for inter-node communications

A dedicated VXLAN (Virtual Extensible LAN) connects the 5 VMs (3 Galera + 2 PKI app). This overlay virtual network is isolated: no other client or service has access to it. Galera traffic (wsrep replication, IST, SST) transits exclusively through this VXLAN. No network contention, no traffic leaks.

Proxmox VE: hyperconverged virtualization

All VMs are hosted on our clusters. One Proxmox VE cluster per datacenter. Proxmox provides the virtualization layer (KVM/QEMU), shared storage (Ceph or ZFS over iSCSI), SDN networking (VXLAN), and high availability (HA manager). VMs can live migrate between hypervisors.

Nimbus Backup (Proxmox Backup Server)

Daily backups via Nimbus, RDEM Systems' backup solution based on Proxmox Backup Server. Retention across 2 geographically distributed sites (Double Drive PBS) for resilience. Deduplication, compression, client-side encryption. Fast restoration in case of corruption or data loss.

Monitoring via CheckMK

Cluster monitoring via CheckMK: MariaDB metrics (connections, threads, slow queries, wsrep replication), Galera node status (wsrep_cluster_status, wsrep_local_state_comment), system metrics (CPU, RAM, disk, network), alerts configured for node failures or quorum loss.
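The Galera status checks boil down to a few comparisons on the values returned by SHOW GLOBAL STATUS LIKE 'wsrep_%'. The sketch below is not the CheckMK plugin itself: the function name and the dictionary input are assumptions, and fetching the values from MariaDB is left out.

```python
EXPECTED_CLUSTER_SIZE = 3  # number of Galera nodes in this architecture


def evaluate_node(status):
    """Return a list of problems found in a node's wsrep status values.

    `status` maps status variable names to their string values.
    An empty list means the node looks healthy.
    """
    problems = []
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not in the primary component (quorum lost)")
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced with the cluster")
    if int(status.get("wsrep_cluster_size", 0)) < EXPECTED_CLUSTER_SIZE:
        problems.append("cluster has fewer members than expected")
    return problems


healthy = {
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_cluster_size": "3",
}
print(evaluate_node(healthy))  # no problems reported
```

A cluster that reports Primary and Synced but a size of 2 is still serving traffic, yet it can no longer tolerate another failure, which is why the size check is worth alerting on separately.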

Sovereign hosting: 3 datacenters in France, AS206014 network

The 3 Galera nodes are hosted in 3 independent datacenters in Île-de-France, operated by RDEM Systems. Network infrastructure operated on the autonomous BGP network AS206014, fully controlled by RDEM Systems.

No transit through a third-party American operator (AWS, GCP, Azure). No dependency on public cloud infrastructure. Your data stays in France, under French jurisdiction, subject only to European law.

This sovereign architecture guarantees native GDPR compliance and protects against the CLOUD Act and FISA 702.

Use case: RDEM Systems PKIaaS

This Galera cluster serves as the database backend for PKIaaS, the PKI (Public Key Infrastructure) operated by RDEM Systems.

The PKI issues and manages X.509 certificates for RDEM Systems clients: TLS certificates for web servers, client certificates for mutual authentication (mTLS), certificates for VPNs, certificates for code signing.

The database stores issued certificates, certificate signing requests (CSRs), certificate revocation lists (CRLs), intermediate keys and audit metadata. It must be available 24/7: a PKI outage blocks the issuance of new certificates and the publication of CRLs, which can break trust chains.

Galera Cluster ensures that even if an entire datacenter is offline, the PKI continues to operate. The 2 other nodes maintain quorum. The 2 PKI application nodes connect directly to the Galera nodes to distribute read/write queries.
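One way an application node could spread queries across the Galera nodes and skip a node that is down is a round-robin picker with a health check. This is only a sketch of the idea: the class, the node names and the health-check callable are hypothetical, and the source does not describe how the PKI application actually selects a node.

```python
from itertools import cycle


class NodePicker:
    """Round-robin over Galera nodes, skipping those reported unhealthy."""

    def __init__(self, nodes, is_healthy):
        self._nodes = list(nodes)
        self._ring = cycle(self._nodes)
        self._is_healthy = is_healthy  # callable: node name -> bool

    def next_node(self):
        # Try each node at most once per call before giving up.
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if self._is_healthy(node):
                return node
        raise RuntimeError("no healthy Galera node available")


# Simulate the loss of one datacenter (hypothetical node names).
down = {"galera-2"}
picker = NodePicker(["galera-1", "galera-2", "galera-3"],
                    is_healthy=lambda node: node not in down)
print([picker.next_node() for _ in range(4)])
```

In practice the health check would rely on the node's wsrep status (primary component, synced state) rather than a static set.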

Management with Signal18 Replication Manager

The cluster will be managed by Signal18 Replication Manager, the open source reference tool for automated MariaDB cluster management. Replication Manager monitors node status, detects failures and orchestrates failovers automatically and traceably.

Unlike homegrown scripts or makeshift solutions, Replication Manager provides centralized monitoring with a dedicated web interface, configurable alerts and a complete event history. It's the tool recommended by RDEM Systems for all our MariaDB clusters.

Learn more about Replication Manager and automated MariaDB cluster management.

Planned evolutions

  • Advanced monitoring: dedicated Grafana dashboard with real-time wsrep metrics
  • Load testing and benchmarks: validation of cluster load capacity (TPS, p95/p99 latency)
  • Complete documentation: intervention runbooks, maintenance procedures, failure scenarios
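For the planned benchmarks, p95/p99 latency can be derived from raw per-query timings with the nearest-rank method. The sketch below uses made-up sample data and a hypothetical helper name; it does not reflect any measurement taken on the cluster.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up latencies in milliseconds, for illustration only.
latencies_ms = list(range(1, 101))
print(percentile(latencies_ms, 95), percentile(latencies_ms, 99))
```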

Need a managed Galera cluster for your critical infrastructure?

RDEM Systems can deploy and operate a high-availability MariaDB Galera cluster for your critical applications. Same sovereign architecture, same technical stack, same level of service.

Start your managed MariaDB project

Let's discuss your database needs. Our DBA team advises you on the optimal architecture for your use case.

RDEM Systems SAS — SIREN 820 338 671 — 5 B rue des Noyers, 95300 Pontoise