Commit 005662e

Support for multiple storage directories (#296)
# Description

Fixes #274
1 parent 6715d75 · commit 005662e

File tree

27 files changed: +1036 −489 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -7,6 +7,7 @@ All notable changes to this project will be documented in this file.
 ### Added
 
 - Log aggregation added ([#290]).
+- Support for multiple storage directories ([#296]).
 
 ### Changed
 
@@ -21,6 +22,7 @@ All notable changes to this project will be documented in this file.
 [#281]: https://github.com/stackabletech/hdfs-operator/pull/281
 [#286]: https://github.com/stackabletech/hdfs-operator/pull/286
 [#290]: https://github.com/stackabletech/hdfs-operator/pull/290
+[#296]: https://github.com/stackabletech/hdfs-operator/pull/296
 
 ## [0.6.0] - 2022-11-07
 
```

deploy/helm/hdfs-operator/crds/crds.yaml

Lines changed: 152 additions & 134 deletions
Large diffs are not rendered by default.

deploy/manifests/crds.yaml

Lines changed: 152 additions & 134 deletions
Large diffs are not rendered by default.

docs/modules/ROOT/pages/usage.adoc

Lines changed: 46 additions & 1 deletion
```diff
@@ -192,7 +192,52 @@ dataNodes:
 
 In the above example, all data nodes in the default group will store data (the location of `dfs.datanode.data.dir`) on a `128Gi` volume.
 
-By default, if nothing is configured in the custom resource for a certain role group, each Pod will have a `1Gi` local volume mount for the data location.
+By default, if nothing is configured in the custom resource for a certain role group, each Pod will have a `5Gi` volume mount for the data location.
+
+==== Multiple storage volumes
+
+Datanodes can have multiple disks attached to increase the storage size as well as speed.
+They can be of different types, e.g. HDDs or SSDs.
+
+You can configure multiple https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims[PersistentVolumeClaims] (PVCs) for the datanodes as follows:
+
+[source,yaml]
+----
+dataNodes:
+  roleGroups:
+    default:
+      config:
+        resources:
+          storage:
+            data: # We need to overwrite the data PVCs coming from the default value
+              count: 0
+            my-disks:
+              count: 3
+              capacity: 12Ti
+              hdfsStorageType: Disk
+            my-ssds:
+              count: 2
+              capacity: 5Ti
+              storageClass: premium-ssd
+              hdfsStorageType: SSD
+----
+
+This will create the following PVCs:
+
+1. `my-disks-hdfs-datanode-default-0` (12Ti)
+2. `my-disks-1-hdfs-datanode-default-0` (12Ti)
+3. `my-disks-2-hdfs-datanode-default-0` (12Ti)
+4. `my-ssds-hdfs-datanode-default-0` (5Ti)
+5. `my-ssds-1-hdfs-datanode-default-0` (5Ti)
+
+By configuring and using a dedicated https://kubernetes.io/docs/concepts/storage/storage-classes/[StorageClass] you can configure your HDFS to use local disks attached to Kubernetes nodes.
+
+[NOTE]
+====
+You might need to re-create the StatefulSet to apply the new PVC configuration because of https://github.com/kubernetes/kubernetes/issues/68737[this Kubernetes issue].
+You can delete the StatefulSet using `kubectl delete sts --cascade=false <statefulset>`.
+The hdfs-operator will re-create the StatefulSet automatically.
+====
 
 === Resource Requests
 
```
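The naming pattern behind that PVC list: the first PVC of a storage group keeps the bare group name, each further one gets a numeric `-<index>` suffix, and all of them carry the StatefulSet tail `<cluster>-<role>-<rolegroup>-<pod ordinal>`. A minimal sketch reproducing the pattern (the helper `pvc_names` is a hypothetical illustration, not the operator's actual code):

```rust
/// Hypothetical helper reproducing the PVC naming pattern documented above:
/// the first PVC of a storage group keeps the bare name, every further one
/// gets a `-<index>` suffix, and all of them end in the StatefulSet tail
/// `<cluster>-<role>-<rolegroup>-<pod ordinal>`.
fn pvc_names(group: &str, count: usize, sts_tail: &str) -> Vec<String> {
    (0..count)
        .map(|i| match i {
            0 => format!("{group}-{sts_tail}"),
            _ => format!("{group}-{i}-{sts_tail}"),
        })
        .collect()
}

fn main() {
    let tail = "hdfs-datanode-default-0"; // <cluster>-<role>-<rolegroup>-<ordinal>
    let mut names = pvc_names("my-disks", 3, tail);
    names.extend(pvc_names("my-ssds", 2, tail));
    for name in &names {
        println!("{name}");
    }
    // my-disks-hdfs-datanode-default-0
    // my-disks-1-hdfs-datanode-default-0
    // my-disks-2-hdfs-datanode-default-0
    // my-ssds-hdfs-datanode-default-0
    // my-ssds-1-hdfs-datanode-default-0
}
```

Because the pod ordinal is part of each name, every StatefulSet pod gets its own set of PVCs, so scaling the role group up stamps out a fresh set for each new pod.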

rust/crd/src/constants.rs

Lines changed: 10 additions & 0 deletions
```diff
@@ -50,3 +50,13 @@ pub const DFS_HA_NAMENODES: &str = "dfs.ha.namenodes";
 // core-site.xml
 pub const FS_DEFAULT_FS: &str = "fs.defaultFS";
 pub const HA_ZOOKEEPER_QUORUM: &str = "ha.zookeeper.quorum";
+
+pub const STACKABLE_ROOT_DATA_DIR: &str = "/stackable/data";
+pub const NAMENODE_ROOT_DATA_DIR: &str = "/stackable/data/namenode";
+pub const JOURNALNODE_ROOT_DATA_DIR: &str = "/stackable/data/journalnode";
+
+// Will end up with something like `/stackable/data/<pvc-name>/datanode`, e.g. `/stackable/data/data/datanode` and `/stackable/data/data-1/datanode` etc.
+// We need the additional level because we don't want a user who names their PVC
+// e.g. `hadoop` to end up with a location of `/stackable/hadoop/data`.
+pub const DATANODE_ROOT_DATA_DIR_PREFIX: &str = "/stackable/data/";
+pub const DATANODE_ROOT_DATA_DIR_SUFFIX: &str = "/datanode";
```
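To make the comment above concrete, here is a sketch of how the prefix and suffix could be combined with the PVC volume names to produce `dfs.datanode.data.dir` entries. The `[DISK]`/`[SSD]` markers are HDFS's standard storage-type prefixes for that property; `datanode_data_dirs` is a hypothetical helper, and whether the operator assembles the value exactly like this is an assumption:

```rust
const DATANODE_ROOT_DATA_DIR_PREFIX: &str = "/stackable/data/";
const DATANODE_ROOT_DATA_DIR_SUFFIX: &str = "/datanode";

/// Illustrative only: build a comma-separated `dfs.datanode.data.dir` value
/// from (PVC volume name, HDFS storage type) pairs. HDFS accepts an optional
/// `[DISK]`/`[SSD]`/... storage-type prefix in front of each directory.
fn datanode_data_dirs(volumes: &[(&str, &str)]) -> String {
    volumes
        .iter()
        .map(|(name, storage_type)| {
            format!(
                "[{}]{}{}{}",
                storage_type, DATANODE_ROOT_DATA_DIR_PREFIX, name, DATANODE_ROOT_DATA_DIR_SUFFIX
            )
        })
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    let dirs = datanode_data_dirs(&[("data", "DISK"), ("data-1", "DISK"), ("my-ssds", "SSD")]);
    assert_eq!(
        dirs,
        "[DISK]/stackable/data/data/datanode,\
         [DISK]/stackable/data/data-1/datanode,\
         [SSD]/stackable/data/my-ssds/datanode"
    );
    println!("dfs.datanode.data.dir={dirs}");
}
```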
