6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,12 @@

## [Unreleased]

### Added

- CPU and memory limits are now configurable ([#245]).

[#245]: https://github.com/stackabletech/hbase-operator/pull/245

## [0.4.0] - 2022-09-06

### Changed
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

203 changes: 203 additions & 0 deletions deploy/crd/hbasecluster.crd.yaml

Large diffs are not rendered by default.

854 changes: 854 additions & 0 deletions deploy/helm/hbase-operator/crds/crds.yaml

Large diffs are not rendered by default.

854 changes: 854 additions & 0 deletions deploy/manifests/crds.yaml

Large diffs are not rendered by default.

69 changes: 69 additions & 0 deletions docs/modules/ROOT/pages/usage.adoc
@@ -126,6 +126,75 @@ For a full list of configuration options we refer to the HBase https://hbase.apa

// CLI overrides are also not implemented

=== Storage for data volumes

The HBase Operator currently does not support any https://kubernetes.io/docs/concepts/storage/persistent-volumes[PersistentVolumeClaims].

=== Memory requests

You can request a certain amount of memory for each individual role or role group. Settings made at the role group level override the corresponding parts of the role configuration; anything not set there is inherited from the role:

[source,yaml]
----
regionServers:
config:
resources:
memory:
limit: '2Gi'
roleGroups:
default:
config:
resources:
memory:
limit: '3Gi'
----

In this example, each HBase region server in the `default` role group will have a memory limit of `3Gi` (gibibytes). To be precise, this limit applies to the container running HBase, but not to any sidecar containers that are part of the pod.

If no memory limit is set in the role group, the `2Gi` provided at the role level is used. If the role does not set a limit either, the operator falls back to the defaults shown further below.
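
For illustration (this is a sketch, not operator output), the effective resource configuration of the `default` role group in the example above would be:

[source,yaml]
----
resources:
  cpu:
    min: '200m' # operator default
    max: '4'    # operator default
  memory:
    limit: '3Gi' # role group value, overriding the role-level 2Gi
----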

Setting a memory limit also automatically sets the maximum Java heap size of the corresponding process to 80% of that limit. Be aware that if the memory limit is too low, the cluster might fail to start: if pods terminate with an `OOMKilled` status and the cluster does not come up, try increasing the memory limit.
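
As a worked example: a limit of `3Gi` equals 3072 MiB, and 80% of that is roughly 2457 MiB, so the operator sets `HBASE_HEAPSIZE` accordingly in the generated `hbase-env.sh`. A sketch of the resulting entry (the exact rendering of the file is up to the operator):

[source,yaml]
----
# Illustrative ConfigMap excerpt for a 3Gi memory limit
hbase-env.sh: |
  export HBASE_HEAPSIZE=2457m
----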

For more details regarding Kubernetes memory requests and limits see: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/[Assign Memory Resources to Containers and Pods].

=== CPU requests

Similar to memory resources, you can also configure CPU requests and limits, as shown below:

[source,yaml]
----
regionServers:
roleGroups:
default:
config:
resources:
cpu:
max: '500m'
min: '250m'
----

=== Defaults

If nothing is specified, the operator will automatically set the following default values for resources:

[source,yaml]
----
regionServers:
roleGroups:
default:
config:
resources:
cpu:
min: '200m'
max: "4"
memory:
limit: '2Gi'
----

WARNING: The default values are _most likely_ not sufficient to run a proper cluster in production. Please adjust them to your requirements.

For more details regarding Kubernetes CPU limits see: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/[Assign CPU Resources to Containers and Pods].
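
On the Kubernetes side these values end up as requests and limits on the HBase container. For the defaults above, the outcome would look roughly like the following sketch (it assumes `cpu.min`/`cpu.max` map to the CPU request/limit and that the memory limit is applied as both request and limit):

[source,yaml]
----
resources:
  requests:
    cpu: 200m
    memory: 2Gi
  limits:
    cpu: "4"
    memory: 2Gi
----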

== Phoenix

The Apache Phoenix project provides the ability to interact with HBase over JDBC using familiar SQL syntax. The Phoenix dependencies are bundled with the Stackable HBase image and do not need to be installed separately (clients still need to ensure that the correct client-side libraries are available). Information about client-side installation can be found https://phoenix.apache.org/installation.html[here].
7 changes: 7 additions & 0 deletions docs/modules/getting_started/examples/code/hbase.yaml
@@ -16,6 +16,13 @@ spec:
regionServers:
roleGroups:
default:
config:
resources:
cpu:
min: 300m
max: "3"
memory:
limit: 3Gi
replicas: 1
restServers:
roleGroups:
5 changes: 4 additions & 1 deletion rust/crd/Cargo.toml
@@ -8,7 +8,10 @@ repository = "https://github.com/stackabletech/hbase-operator"
version = "0.5.0-nightly"

[dependencies]
stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag = "0.24.0" }

serde = "1.0"
serde_json = "1.0"
stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag = "0.24.0" }
snafu = "0.7"
strum = { version = "0.24", features = ["derive"] }
tracing = "0.1"
85 changes: 79 additions & 6 deletions rust/crd/src/lib.rs
@@ -1,9 +1,14 @@
use serde::{Deserialize, Serialize};
use stackable_operator::kube::runtime::reflector::ObjectRef;
use stackable_operator::kube::CustomResource;
use stackable_operator::product_config_utils::{ConfigError, Configuration};
use stackable_operator::role_utils::{Role, RoleGroupRef};
use stackable_operator::schemars::{self, JsonSchema};
use snafu::Snafu;
use stackable_operator::{
commons::resources::{CpuLimits, MemoryLimits, NoRuntimeLimits, Resources},
config::merge::Merge,
k8s_openapi::apimachinery::pkg::api::resource::Quantity,
kube::{runtime::reflector::ObjectRef, CustomResource},
product_config_utils::{ConfigError, Configuration},
role_utils::{Role, RoleGroupRef},
schemars::{self, JsonSchema},
};
use std::collections::BTreeMap;
use strum::{Display, EnumIter, EnumString};

@@ -20,6 +25,7 @@ pub const HBASE_REST_OPTS: &str = "HBASE_REST_OPTS";
pub const HBASE_CLUSTER_DISTRIBUTED: &str = "hbase.cluster.distributed";
pub const HBASE_ROOTDIR: &str = "hbase.rootdir";
pub const HBASE_ZOOKEEPER_QUORUM: &str = "hbase.zookeeper.quorum";
pub const HBASE_HEAPSIZE: &str = "HBASE_HEAPSIZE";

pub const HBASE_UI_PORT_NAME: &str = "ui";
pub const METRICS_PORT_NAME: &str = "metrics";
@@ -31,6 +37,14 @@ pub const HBASE_REGIONSERVER_UI_PORT: i32 = 16030;
pub const HBASE_REST_PORT: i32 = 8080;
pub const METRICS_PORT: i32 = 8081;

pub const JVM_HEAP_FACTOR: f32 = 0.8;

#[derive(Snafu, Debug)]
pub enum Error {
#[snafu(display("Unknown Hbase role found {role}. Should be one of {roles:?}"))]
UnknownHbaseRole { role: String, roles: Vec<String> },
}

#[derive(Clone, CustomResource, Debug, Default, Deserialize, JsonSchema, PartialEq, Serialize)]
#[kube(
group = "hbase.stackable.tech",
@@ -113,13 +127,34 @@ impl HbaseRole {
}
}

#[derive(Clone, Debug, Default, Deserialize, Eq, JsonSchema, PartialEq, Serialize)]
#[derive(Clone, Debug, Default, Deserialize, Eq, Merge, JsonSchema, PartialEq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct HbaseStorageConfig {}

#[derive(Clone, Debug, Default, Deserialize, JsonSchema, PartialEq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct HbaseConfig {
#[serde(default, skip_serializing_if = "Option::is_none")]
pub hbase_rootdir: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub hbase_opts: Option<String>,
pub resources: Option<Resources<HbaseStorageConfig, NoRuntimeLimits>>,
}

impl HbaseConfig {
fn default_resources() -> Resources<HbaseStorageConfig, NoRuntimeLimits> {
Resources {
cpu: CpuLimits {
min: Some(Quantity("200m".to_owned())),
max: Some(Quantity("4".to_owned())),
},
memory: MemoryLimits {
limit: Some(Quantity("2Gi".to_owned())),
runtime_limits: NoRuntimeLimits {},
},
storage: HbaseStorageConfig {},
}
}
}

impl Configuration for HbaseConfig {
@@ -228,4 +263,42 @@ impl HbaseCluster {
.unwrap_or("/hbase")
.to_string()
}

/// Retrieve and merge resource configs for role and role groups
pub fn resolve_resource_config_for_role_and_rolegroup(
&self,
role: &HbaseRole,
rolegroup_ref: &RoleGroupRef<HbaseCluster>,
) -> Option<Resources<HbaseStorageConfig, NoRuntimeLimits>> {
// Initialize the result with all default values as baseline
let conf_defaults = HbaseConfig::default_resources();

let role = match role {
HbaseRole::Master => self.spec.masters.as_ref()?,
HbaseRole::RegionServer => self.spec.region_servers.as_ref()?,
HbaseRole::RestServer => self.spec.rest_servers.as_ref()?,
};

// Retrieve role resource config
let mut conf_role: Resources<HbaseStorageConfig, NoRuntimeLimits> =
role.config.config.resources.clone().unwrap_or_default();

// Retrieve rolegroup specific resource config
let mut conf_rolegroup: Resources<HbaseStorageConfig, NoRuntimeLimits> = role
.role_groups
.get(&rolegroup_ref.role_group)
.and_then(|rg| rg.config.config.resources.clone())
.unwrap_or_default();

// Merge more specific configs into default config
// Hierarchy is:
// 1. RoleGroup
// 2. Role
// 3. Default
conf_role.merge(&conf_defaults);
conf_rolegroup.merge(&conf_role);

tracing::debug!("Merged resource config: {:?}", conf_rolegroup);
Some(conf_rolegroup)
}
}
52 changes: 49 additions & 3 deletions rust/operator-binary/src/hbase_controller.rs
@@ -3,9 +3,12 @@
use crate::{discovery::build_discovery_configmap, rbac};
use snafu::{OptionExt, ResultExt, Snafu};
use stackable_hbase_crd::{
HbaseCluster, HbaseConfig, HbaseRole, APP_NAME, HBASE_ENV_SH, HBASE_MASTER_PORT,
HBASE_REGIONSERVER_PORT, HBASE_REST_PORT, HBASE_SITE_XML, HBASE_ZOOKEEPER_QUORUM,
HbaseCluster, HbaseConfig, HbaseRole, HbaseStorageConfig, APP_NAME, HBASE_ENV_SH,
HBASE_HEAPSIZE, HBASE_MASTER_PORT, HBASE_REGIONSERVER_PORT, HBASE_REST_PORT, HBASE_SITE_XML,
HBASE_ZOOKEEPER_QUORUM, JVM_HEAP_FACTOR,
};
use stackable_operator::commons::resources::{NoRuntimeLimits, Resources};
use stackable_operator::memory::{to_java_heap_value, BinaryMultiple};
use stackable_operator::{
builder::{
ConfigMapBuilder, ContainerBuilder, ObjectMetaBuilder, PodBuilder,
@@ -31,6 +34,7 @@ use stackable_operator::{
};
use std::{
collections::{BTreeMap, HashMap},
str::FromStr,
sync::Arc,
time::Duration,
};
@@ -137,6 +141,20 @@ pub enum Error {
name: String,
source: stackable_operator::error::Error,
},
#[snafu(display("could not parse Hbase role [{role}]"))]
UnidentifiedHbaseRole {
source: strum::ParseError,
role: String,
},
#[snafu(display("failed to resolve and merge resource config for role and role group"))]
FailedToResolveResourceConfig,
#[snafu(display("invalid java heap config - missing default or value in crd?"))]
InvalidJavaHeapConfig,
#[snafu(display("failed to convert java heap config to unit [{unit}]"))]
FailedToConvertJavaHeap {
source: stackable_operator::error::Error,
unit: String,
},
}

type Result<T, E = Error> = std::result::Result<T, E>;
@@ -210,20 +228,30 @@ pub async fn reconcile_hbase(hbase: Arc<HbaseCluster>, ctx: Arc<Ctx>) -> Result<
})?;

for (role_name, group_config) in validated_config.iter() {
let hbase_role = HbaseRole::from_str(role_name).context(UnidentifiedHbaseRoleSnafu {
role: role_name.to_string(),
})?;
for (rolegroup_name, rolegroup_config) in group_config.iter() {
let rolegroup = hbase.server_rolegroup_ref(role_name, rolegroup_name);

let resources = hbase
.resolve_resource_config_for_role_and_rolegroup(&hbase_role, &rolegroup)
.context(FailedToResolveResourceConfigSnafu)?;

let rg_service = build_rolegroup_service(&hbase, &rolegroup, rolegroup_config)?;
let rg_configmap = build_rolegroup_config_map(
&hbase,
&rolegroup,
rolegroup_config,
&zk_connect_string,
&resources,
)?;
let rg_statefulset = build_rolegroup_statefulset(
&hbase,
&rolegroup,
rolegroup_config,
&rbac_sa.name_unchecked(),
&resources,
)?;
cluster_resources
.add(client, &rg_service)
@@ -304,6 +332,7 @@ fn build_rolegroup_config_map(
rolegroup: &RoleGroupRef<HbaseCluster>,
rolegroup_config: &HashMap<PropertyNameKind, BTreeMap<String, String>>,
zk_connect_string: &str,
resources: &Resources<HbaseStorageConfig, NoRuntimeLimits>,
) -> Result<ConfigMap, Error> {
let mut hbase_site_config = rolegroup_config
.get(&PropertyNameKind::File(HBASE_SITE_XML.to_string()))
@@ -320,11 +349,26 @@
.map(|(k, v)| (k, Some(v)))
.collect::<BTreeMap<_, _>>();

let hbase_env_config = rolegroup_config
let mut hbase_env_config = rolegroup_config
.get(&PropertyNameKind::File(HBASE_ENV_SH.to_string()))
.cloned()
.unwrap_or_default();

let heap_in_mebi = to_java_heap_value(
resources
.memory
.limit
.as_ref()
.context(InvalidJavaHeapConfigSnafu)?,
JVM_HEAP_FACTOR,
BinaryMultiple::Mebi,
)
.context(FailedToConvertJavaHeapSnafu {
unit: BinaryMultiple::Mebi.to_java_memory_unit(),
})?;

hbase_env_config.insert(HBASE_HEAPSIZE.to_string(), format!("{}m", heap_in_mebi));

let mut builder = ConfigMapBuilder::new();

builder
@@ -416,6 +460,7 @@ fn build_rolegroup_statefulset(
rolegroup_ref: &RoleGroupRef<HbaseCluster>,
_rolegroup_config: &HashMap<PropertyNameKind, BTreeMap<String, String>>,
sa_name: &str,
resources: &Resources<HbaseStorageConfig, NoRuntimeLimits>,
) -> Result<StatefulSet> {
let hbase_version = hbase_version(hbase)?;

@@ -513,6 +558,7 @@ fn build_rolegroup_statefulset(
.add_volume_mount("hbase-config", HBASE_CONFIG_TMP_DIR)
.add_volume_mount("hdfs-discovery", HDFS_DISCOVERY_TMP_DIR)
.add_container_ports(ports)
.resources(resources.clone().into())
.startup_probe(startup_probe)
.liveness_probe(liveness_probe)
.readiness_probe(readiness_probe)
12 changes: 12 additions & 0 deletions tests/templates/kuttl/resources/00-assert.yaml
@@ -0,0 +1,12 @@
---
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
timeout: 600
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: test-zk-server-default
status:
readyReplicas: 1
replicas: 1