content/en/docs/components/spark-operator/developer-guide.md (6 additions, 8 deletions)
````diff
@@ -50,7 +50,7 @@ Dependencies will be automatically downloaded locally to `bin` directory as needed.
 To see the full list of available targets, run the following command:
 
 ```bash
-$ make help
+$ make help
 
 Usage:
   make <target>
````
```diff
@@ -66,16 +66,14 @@ Development
   go-clean          Clean up caches and output.
   go-fmt            Run go fmt against code.
   go-vet            Run go vet against code.
-  lint              Run golangci-lint linter.
-  lint-fix          Run golangci-lint linter and perform fixes.
+  go-lint           Run golangci-lint linter.
+  go-lint-fix       Run golangci-lint linter and perform fixes.
   unit-test         Run unit tests.
   e2e-test          Run the e2e tests against a Kind k8s instance that is spun up.
 
 Build
   build-operator    Build Spark operator.
-  build-sparkctl    Build sparkctl binary.
-  install-sparkctl  Install sparkctl binary.
-  clean             Clean spark-operator and sparkctl binaries.
+  clean             Clean binaries.
   build-api-docs    Build api documentation.
   docker-build      Build docker image with the operator.
   docker-push       Push docker image with the operator.
```
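For orientation, here is a sketch of how the renamed lint targets fit into a typical local development loop; the target names are taken from the `make help` output above, while the ordering is only a suggested workflow:

```bash
# Format and vet the Go sources, then lint with the renamed targets,
# run the unit tests, and build the operator binary.
make go-fmt go-vet
make go-lint        # formerly `make lint`
make go-lint-fix    # formerly `make lint-fix`; also applies automatic fixes
make unit-test
make build-operator
```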
```diff
@@ -90,11 +88,11 @@ Helm
 Deployment
   kind-create-cluster  Create a kind cluster for integration tests.
   kind-load-image      Load the image into the kind cluster.
-  kind-delete-custer   Delete the created kind cluster.
+  kind-delete-cluster  Delete the created kind cluster.
   install-crd          Install CRDs into the K8s cluster specified in ~/.kube/config.
   uninstall-crd        Uninstall CRDs from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
   deploy               Deploy controller to the K8s cluster specified in ~/.kube/config.
-  undeploy             Undeploy controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
+  undeploy             Uninstall spark-operator.
 
 Dependencies
   kustomize            Download kustomize locally if necessary.
```
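The deployment targets are usually chained when trying the operator against a local kind cluster. The following is only a sketch, assuming Docker and kind are available locally; all target names are taken from the listing above:

```bash
# Create a local kind cluster, build and load the operator image, deploy it,
# then tear everything down with the corrected kind-delete-cluster target.
make kind-create-cluster
make docker-build kind-load-image
make deploy
# ... develop and test ...
make undeploy
make kind-delete-cluster
```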
content/en/docs/components/spark-operator/overview/_index.md (1 addition, 8 deletions)
```diff
@@ -24,8 +24,6 @@ The Kubernetes Operator for Apache Spark currently supports the following list of features:
 - Supports automatic application re-submission for updated `SparkApplication` objects with updated specification.
 - Supports automatic application restart with a configurable restart policy.
 - Supports automatic retries of failed submissions with optional linear back-off.
-- Supports mounting local Hadoop configuration as a Kubernetes ConfigMap automatically via `sparkctl`.
-- Supports automatically staging local application dependencies to Google Cloud Storage (GCS) via `sparkctl`.
 - Supports collecting and exporting application-level metrics and driver/executor metrics to Prometheus.
 
 ## Architecture
```
```diff
@@ -37,15 +35,14 @@ The operator consists of:
 - a *submission runner* that runs `spark-submit` for submissions received from the controller,
 - a *Spark pod monitor* that watches for Spark pods and sends pod status updates to the controller,
 - a [Mutating Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) that handles customizations for Spark driver and executor pods based on the annotations on the pods added by the controller,
-- and also a command-line tool named `sparkctl` for working with the operator.
 
 The following diagram shows how different components interact and work together.
 
 <img src="architecture-diagram.png"
      alt="Spark Operator Architecture Diagram"
      class="mt-3 mb-3 border rounded">
 
-Specifically, a user uses the `sparkctl` (or `kubectl`) to create a `SparkApplication` object. The `SparkApplication` controller receives the object through a watcher from the API server, creates a submission carrying the `spark-submit` arguments, and sends the submission to the *submission runner*. The submission runner submits the application to run and creates the driver pod of the application. Upon starting, the driver pod creates the executor pods. While the application is running, the *Spark pod monitor* watches the pods of the application and sends status updates of the pods back to the controller, which then updates the status of the application accordingly.
+Specifically, a user uses `kubectl` to create a `SparkApplication` object. The `SparkApplication` controller receives the object through a watcher from the API server, creates a submission carrying the `spark-submit` arguments, and sends the submission to the *submission runner*. The submission runner submits the application to run and creates the driver pod of the application. Upon starting, the driver pod creates the executor pods. While the application is running, the *Spark pod monitor* watches the pods of the application and sends status updates of the pods back to the controller, which then updates the status of the application accordingly.
 
 ## The CRD Controller
 
```
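To make this flow concrete from the user's side, a small sketch is shown below; it assumes the operator is installed, that a manifest file named `spark-pi.yaml` exists, and that the standard Spark-on-Kubernetes `spark-role` pod labels are in use (all of these are assumptions, not details from this page):

```bash
# Create the SparkApplication object; the controller receives it through its
# watcher and hands a spark-submit submission to the submission runner.
kubectl apply -f spark-pi.yaml

# Watch the application status that the controller writes back onto the object.
kubectl get sparkapplications

# The driver pod created for the submission, and the executor pods it starts,
# are ordinary pods and can be inspected directly.
kubectl get pods -l spark-role=driver
kubectl get pods -l spark-role=executor
```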
```diff
@@ -72,7 +69,3 @@ When the operator decides to restart an application, it cleans up the Kubernetes
 ## Mutating Admission Webhook
 
 The operator comes with an optional mutating admission webhook for customizing Spark driver and executor pods based on certain annotations on the pods added by the CRD controller. The annotations are set by the operator based on the application specifications. All Spark pod customization needs except for those natively supported by Spark on Kubernetes are handled by the mutating admission webhook.
-
-## Command-line Tool: Sparkctl
-
-[sparkctl](https://github.com/kubeflow/spark-operator/blob/master/cmd/sparkctl/README.md) is a command-line tool for working with the operator. It supports creating a `SparkApplication` object from a YAML file, listing existing `SparkApplication` objects, checking status of a `SparkApplication`, forwarding from a local port to the remote port on which the Spark driver runs, and deleting a `SparkApplication` object. For more details on `sparkctl`, please refer to [README](https://github.com/kubeflow/spark-operator/blob/master/cmd/sparkctl/README.md).
```
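One quick way to see whether the optional webhook is active in a given installation is to list the cluster's mutating webhook configurations; the `spark` filter below is an assumption, since the configuration name depends on how the operator was installed:

```bash
# Look for the Spark operator's mutating webhook configuration, if any.
kubectl get mutatingwebhookconfigurations | grep -i spark
```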
```diff
-The operator runs Spark applications specified in Kubernetes objects of the `SparkApplication` custom resource type. The most common way of using a `SparkApplication` is store the `SparkApplication` specification in a YAML file and use the `kubectl` command or alternatively the `sparkctl` command to work with the `SparkApplication`. The operator automatically submits the application as configured in a `SparkApplication` to run on the Kubernetes cluster and uses the `SparkApplication` to collect and surface the status of the driver and executors to the user.
+The operator runs Spark applications specified in Kubernetes objects of the `SparkApplication` custom resource type. The most common way of using a `SparkApplication` is to store the `SparkApplication` specification in a YAML file and use the `kubectl` command to work with the `SparkApplication`. The operator automatically submits the application as configured in a `SparkApplication` to run on the Kubernetes cluster and uses the `SparkApplication` to collect and surface the status of the driver and executors to the user.
 
 As with all other Kubernetes API objects, a `SparkApplication` needs the `apiVersion`, `kind`, and `metadata` fields. For general information about working with manifests, see [object management using kubectl](https://kubernetes.io/docs/concepts/overview/object-management-kubectl/overview/).
```
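Because a `SparkApplication` is an ordinary custom resource, its required fields can be inspected directly from the API server; the sketch below assumes the operator's CRDs are already installed in the cluster:

```bash
# Show the top-level schema of the SparkApplication resource, including
# apiVersion, kind, metadata, and spec.
kubectl explain sparkapplication

# Drill into the spec to list the application-level fields the operator understands.
kubectl explain sparkapplication.spec
```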
content/en/docs/components/spark-operator/user-guide/working-with-sparkapplication.md (2 additions, 3 deletions)
```diff
@@ -6,12 +6,11 @@ weight: 30
 
 ## Creating a New SparkApplication
 
-A `SparkApplication` can be created from a YAML file storing the `SparkApplication` specification using either the `kubectl apply -f <YAML file path>` command or the `sparkctl create <YAML file path>` command. Please refer to the `sparkctl` [README](https://github.com/kubeflow/spark-operator/blob/master/cmd/sparkctl/README.md#create) for usage of the `sparkctl create` command. Once a `SparkApplication` is successfully created, the operator will receive it and submits the application as configured in the specification to run on the Kubernetes cluster. Please note, that `SparkOperator` submits `SparkApplication` in `Cluster` mode only.
+A `SparkApplication` can be created from a YAML file storing the `SparkApplication` specification using the `kubectl apply -f <YAML file path>` command. Once a `SparkApplication` is successfully created, the operator receives it and submits the application as configured in the specification to run on the Kubernetes cluster. Please note that the `SparkOperator` submits a `SparkApplication` in `Cluster` mode only.
 
 ## Deleting a SparkApplication
 
-A `SparkApplication` can be deleted using either the `kubectl delete <name>` command or the `sparkctl delete <name>` command. Please refer to the `sparkctl` [README](https://github.com/kubeflow/spark-operator/blob/master/cmd/sparkctl/README.md#delete) for usage of the `sparkctl delete`
-command. Deleting a `SparkApplication` deletes the Spark application associated with it. If the application is running when the deletion happens, the application is killed and all Kubernetes resources associated with the application are deleted or garbage collected.
+A `SparkApplication` can be deleted using the `kubectl delete <name>` command. Deleting a `SparkApplication` deletes the Spark application associated with it. If the application is running when the deletion happens, the application is killed and all Kubernetes resources associated with the application are deleted or garbage collected.
```
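Taken together, the updated create and delete instructions boil down to a short kubectl cycle; the manifest name `spark-pi.yaml` and application name `spark-pi` are placeholders, and the delete command spells out the resource type for clarity:

```bash
# Create the SparkApplication from its YAML specification; the operator
# receives it and submits the application in cluster mode.
kubectl apply -f spark-pi.yaml

# Inspect the object and the status the operator reports on it.
kubectl get sparkapplication spark-pi
kubectl describe sparkapplication spark-pi

# Delete the SparkApplication; a running application is killed and its
# Kubernetes resources are deleted or garbage collected.
kubectl delete sparkapplication spark-pi
```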