Apache Spark K8s Operator

Apache Spark™ K8s Operator is a subproject of Apache Spark. It aims to extend the K8s resource manager to manage Apache Spark applications via the Operator Pattern.

Install Helm Chart

Apache Spark provides a Helm Chart.

helm repo add spark https://apache.github.io/spark-kubernetes-operator
helm repo update
helm install spark spark/spark-kubernetes-operator

Building Spark K8s Operator

Spark K8s Operator is built using Gradle. To build, run:

./gradlew build -x test

Running Tests

./gradlew build

Build Docker Image

./gradlew buildDockerImage

Install Helm Chart from the source code

helm install spark -f build-tools/helm/spark-kubernetes-operator/values.yaml build-tools/helm/spark-kubernetes-operator/

Run Spark Pi App

$ kubectl apply -f examples/pi.yaml
$ kubectl get sparkapp
NAME   CURRENT STATE      AGE
pi     ResourceReleased   4m10s
$ kubectl delete sparkapp/pi
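When scripting these steps, it can help to wait until the application reports the expected state before deleting it. The following is a minimal sketch, not part of the project: the helper names `sparkapp_state` and `wait_for_sparkapp` are hypothetical, and the parsing assumes `kubectl get sparkapp` keeps the column layout shown above (name first, state second).

```shell
# Hypothetical helpers; not part of the operator repository.

# Print the CURRENT STATE column for the named app, reading
# `kubectl get sparkapp` output from stdin.
# Assumes the layout shown above: NAME, CURRENT STATE, AGE.
sparkapp_state() {
  awk -v name="$1" 'NR > 1 && $1 == name { print $2 }'
}

# Poll until the app reports the wanted state or the attempts run out.
# Usage: wait_for_sparkapp NAME STATE [ATTEMPTS] [SLEEP_SECONDS]
wait_for_sparkapp() {
  name="$1"
  want="$2"
  attempts="${3:-60}"
  delay="${4:-5}"
  i=0
  state=""
  while [ "$i" -lt "$attempts" ]; do
    state="$(kubectl get sparkapp | sparkapp_state "$name")"
    if [ "$state" = "$want" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $name to reach $want (last: ${state:-unknown})" >&2
  return 1
}
```

For example, `wait_for_sparkapp pi ResourceReleased && kubectl delete sparkapp/pi` would delete the app only once it shows the state from the listing above.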

Run Spark Cluster

$ kubectl apply -f examples/prod-cluster-with-three-workers.yaml
$ kubectl get sparkcluster
NAME   CURRENT STATE    AGE
prod   RunningHealthy   10s
$ kubectl port-forward prod-master-0 6066 &
$ ./examples/submit-pi-to-prod.sh
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20250628212324-0000",
  "serverSparkVersion" : "4.0.0",
  "submissionId" : "driver-20250628212324-0000",
  "success" : true
}
$ curl http://localhost:6066/v1/submissions/status/driver-20250628212324-0000/
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FINISHED",
  "serverSparkVersion" : "4.0.0",
  "submissionId" : "driver-20250628212324-0000",
  "success" : true,
  "workerHostPort" : "10.1.0.88:34643",
  "workerId" : "worker-20250628212306-10.1.0.88-34643"
}
$ kubectl delete sparkcluster prod
sparkcluster.spark.apache.org "prod" deleted
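The submission ID changes on every run, so a script has to read it from the first response before it can query the status endpoint. The sketch below shows one way to do that without extra tools. The helper name `json_string_field` is hypothetical, and the `sed` pattern only handles simple, unescaped string values like those in the responses above; a real JSON parser is the safer choice for anything more complex.

```shell
# Hypothetical helper; not part of the operator repository.
# Print the value of a string field from a JSON response on stdin.
# Only handles plain "key" : "value" pairs without escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage sketch (assumes the port-forward from the steps above is active
# and that the submit script prints only the JSON response):
#   id="$(./examples/submit-pi-to-prod.sh | json_string_field submissionId)"
#   curl -s "http://localhost:6066/v1/submissions/status/${id}/" \
#     | json_string_field driverState
```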

Run Spark Pi App on Apache YuniKorn scheduler

If you have not yet done so, follow the YuniKorn docs to install the latest version:

helm repo add yunikorn https://apache.github.io/yunikorn-release
helm repo update
helm install yunikorn yunikorn/yunikorn --namespace yunikorn --version 1.6.3 --create-namespace --set embedAdmissionController=false

Submit a Spark app to a YuniKorn-enabled cluster:

$ kubectl apply -f examples/pi-on-yunikorn.yaml
$ kubectl describe pod pi-on-yunikorn-0-driver
...
Events:
  Type    Reason             Age   From      Message
  ----    ------             ----  ----      -------
  Normal  Scheduling         14s   yunikorn  default/pi-on-yunikorn-0-driver is queued and waiting for allocation
  Normal  Scheduled          14s   yunikorn  Successfully assigned default/pi-on-yunikorn-0-driver to node docker-desktop
  Normal  PodBindSuccessful  14s   yunikorn  Pod default/pi-on-yunikorn-0-driver is successfully bound to node docker-desktop
  Normal  TaskCompleted      6s    yunikorn  Task default/pi-on-yunikorn-0-driver is completed
  Normal  Pulled             13s   kubelet   Container image "apache/spark:4.0.0" already present on machine
  Normal  Created            13s   kubelet   Created container spark-kubernetes-driver
  Normal  Started            13s   kubelet   Started container spark-kubernetes-driver
$ kubectl delete sparkapp pi-on-yunikorn
sparkapplication.spark.apache.org "pi-on-yunikorn" deleted

Clean Up

Check for existing Spark applications and clusters. If any exist, delete them.

$ kubectl get sparkapp
No resources found in default namespace.
$ kubectl get sparkcluster
No resources found in default namespace.

Remove the Helm chart and CRDs.

helm uninstall spark
kubectl delete crd sparkapplications.spark.apache.org
kubectl delete crd sparkclusters.spark.apache.org
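In an automated teardown, the check and the removal can be tied together so the chart and CRDs are removed only when nothing is left. This is a rough sketch under stated assumptions: the helper name `no_spark_resources_left` is hypothetical, and it only inspects the current namespace, as the commands above do.

```shell
# Hypothetical helper; not part of the operator repository.
# Succeeds only when no SparkApplication or SparkCluster resources remain
# in the current namespace. `kubectl get ... -o name` prints one line per
# resource and nothing on stdout when there are none.
# Returns 2 if kubectl itself fails, so an unreachable cluster is not
# mistaken for an empty one.
no_spark_resources_left() {
  remaining="$(kubectl get sparkapp,sparkcluster -o name)" || return 2
  [ -z "$remaining" ]
}

# Usage sketch:
#   if no_spark_resources_left; then
#     helm uninstall spark
#     kubectl delete crd sparkapplications.spark.apache.org
#     kubectl delete crd sparkclusters.spark.apache.org
#   fi
```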

Contributing

Please review the Contribution to Spark guide for information on how to get started contributing to the project.
