Node Sizing
When planning the deployment of a simplyblock cluster, it is essential to size the nodes correctly. The sizing requirements below apply whether the cluster is deployed on a private or public cloud, and inside or outside of Kubernetes.
Sizing Assumptions
The following sizing information is meant for production environments.
Warning
Wherever this sizing document refers to virtual CPUs (vCPUs), one vCPU means 0.5 physical CPUs. This corresponds to one hyper-thread of a typical hyper-threaded x86-64 core, and matches how AWS EC2 counts vCPUs.
Management Nodes
An appropriately sized management node cluster is required to ensure optimal performance and scalability. The management plane oversees critical functions such as cluster topology management, health monitoring, statistics collection, and automated maintenance tasks.
The following hardware sizing specifications are recommended:
| Hardware | Requirement |
|---|---|
| CPU | Minimum 4 vCPUs, plus: |
| RAM | Minimum 8 GiB, plus: |
| Disk | Minimum 35 GiB, plus: |
| Node type | Bare metal or virtual machine with a supported Linux distribution |
| Number of nodes | For a production environment, a minimum of 3 management nodes is required. |
Storage Nodes
Warning
A storage node is not the same as a physical or virtual host. For optimal performance, deploy at least two storage nodes per two-socket system (one per NUMA socket); four storage nodes (two per socket) are recommended.
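To plan how many storage nodes to run on a given host, the socket and NUMA layout can be checked with standard Linux tooling, for example:

```shell
# Show the number of sockets and NUMA nodes on this host; one or two
# storage nodes are then planned per NUMA socket as described above.
lscpu | grep -E '^(Socket\(s\)|NUMA node\(s\)):'
```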
A suitably sized storage node cluster is required to ensure optimal performance and scalability. Storage nodes are responsible for handling all I/O operations and data services for logical volumes and snapshots.
The following hardware sizing specifications are recommended:
| Hardware | Requirement |
|---|---|
| CPU | Minimum 5 vCPUs |
| RAM | Minimum 4 GiB |
| Disk | Minimum 10 GiB free space on boot volume |
Memory Requirements
In addition to the above RAM requirements, the storage node requires additional memory based on the managed storage capacity.
While a certain amount of RAM is pre-reserved for SPDK, another part is dynamically pre-allocated. Users should ensure that the full amount of required RAM is available (reserved) from the system as long as simplyblock is running.
The exact amount of memory is calculated when adding or restarting a node based on two parameters:
- The maximum amount of storage available in the cluster
- The maximum number of logical volumes that can be created on the node
| Unit | Memory Requirement |
|---|---|
| Fixed amount | 2 GiB |
| Per logical volume | 25 MiB |
| % of max. utilized capacity on node | 0.05 |
| % of NVMe capacity on node | 0.025 |
Info
Example: A node has 10 NVMe devices with 8 TB each, i.e., 80 TB of NVMe capacity per node. The cluster has 3 nodes and a total capacity of 240 TB. Logical volumes are distributed equally across nodes, and up to 1,000 logical volumes are planned per node. This yields the following calculation:

2 GB + (0.025 GB × 1,000) + (0.05% × 240,000 GB / 3) + (0.025% × 80,000 GB) = 2 + 25 + 40 + 20 = 87 GB
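The calculation can be sketched as a small shell snippet. The variable names are illustrative, not simplyblock parameters; the coefficients are the ones from the table above, with the percentages applied as fractions of the capacity in GB:

```shell
max_lvols=1000                 # planned logical volumes on the node
utilized_gb=$((240000 / 3))    # max. utilized capacity per node, in GB
nvme_gb=80000                  # raw NVMe capacity on the node, in GB

# 2 GB fixed + 25 MB per logical volume
#   + 0.05% of utilized capacity + 0.025% of NVMe capacity
awk -v l="$max_lvols" -v u="$utilized_gb" -v n="$nvme_gb" \
    'BEGIN { printf "required RAM: %.0f GB\n", 2 + l * 0.025 + u * 0.0005 + n * 0.00025 }'
```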
If not enough memory is available, the node refuses to start. In this case, `/proc/meminfo` may be checked for total, reserved, and available system and huge-page memory on the corresponding node.
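For example, the relevant counters can be pulled from `/proc/meminfo` directly (field names as exposed by the Linux kernel):

```shell
# Total/available system memory plus total, free, and reserved huge-page counts
grep -E '^(MemTotal|MemAvailable|HugePages_(Total|Free|Rsvd)|Hugepagesize):' /proc/meminfo
```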
Info
Part of the memory is allocated as huge-page memory. With a high degree of memory fragmentation, a system may be unable to allocate enough huge-page memory even if sufficient system memory is available. If the node fails to start up, a system reboot may ensure enough free memory.
The following command temporarily allocates huge pages on a running system. It allocates 4,096 huge pages of 2 MiB each, i.e., 8 GiB. The number of huge pages must be adjusted to the actual requirements. The Huge Pages Calculator helps with calculating the required number of huge pages.

```shell
sudo sysctl vm.nr_hugepages=4096
```

Since this allocation is temporary, it disappears after a system reboot. It must be ensured that the setting is either re-applied after each reboot or persisted so it is automatically applied on system boot.
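One common way to persist the setting is a drop-in file under `/etc/sysctl.d/`; the file name below is an example:

```shell
# Persist the huge-page count so it is applied on every boot
echo 'vm.nr_hugepages = 4096' | sudo tee /etc/sysctl.d/90-hugepages.conf
sudo sysctl --system    # re-apply all persisted sysctl settings immediately
```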
Storage Planning
Simplyblock storage nodes require one or more NVMe devices to provide storage capacity to the distributed storage pool of a storage cluster.
Furthermore, simplyblock storage nodes require one additional, smaller NVMe device as a journaling device. The journaling device becomes part of the distributed record journal, which keeps track of all changes before they are persisted to their final position. This improves write performance and provides transactional behavior by using a write-ahead log structure and replaying the journal in case of an issue.
Warning
Simplyblock does not work with device partitions or claimed (mounted) devices. It must be ensured that all NVMe devices to be used by simplyblock are unmounted and not busy.
Any partition must be removed from the NVMe devices prior to installing simplyblock. Furthermore, NVMe devices must be low-level formatted with 4KB block size (lbaf: 12). More information can be found in NVMe Low-Level Format.
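As a sketch, preparing a device could look like the following; `/dev/nvme0n1` is an example device path, and the 4KB format index must be taken from the device's own LBA format list:

```shell
# WARNING: wipefs and nvme format destroy all data on the device
lsblk /dev/nvme0n1                        # verify there are no partitions or mountpoints
sudo wipefs --all /dev/nvme0n1            # remove any remaining partition-table signatures
sudo nvme id-ns /dev/nvme0n1 -H | grep -i 'lba format'   # find the format with 4 KiB data size
sudo nvme format /dev/nvme0n1 --lbaf=1 --ses=0           # example index; use the 4 KiB one listed above
```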
Info
Secondary nodes don't need NVMe storage disks.
Caching Nodes (K8s only)
In Kubernetes, simplyblock can be configured to deploy caching nodes. These nodes provide an ultra-low latency write-through cache to a disaggregated cluster, improving access latency substantially.
| Hardware | Requirement |
|---|---|
| CPU | Minimum 6 vCPUs |
| RAM | Minimum 4 GiB |