stackabletech · fhennig · Jun 23, 2022 · Jun 16, 2022 · Jun 16, 2022 · Jun 16, 2022
diff --git a/...ributor/images/service_discovery_arch.png → ...oncepts/images/service_discovery_arch.png b/...ributor/images/service_discovery_arch.png → ...oncepts/images/service_discovery_arch.png
diff --git a/modules/concepts/nav.adoc b/modules/concepts/nav.adoc
@@ -1,2 +1,3 @@
-* Concepts
+* xref:concepts:index.adoc[]
+** xref:service_discovery.adoc[]
 ** xref:s3.adoc[]
diff --git a/modules/concepts/pages/index.adoc b/modules/concepts/pages/index.adoc
@@ -0,0 +1,3 @@
+= Concepts
+
+This section of the documentation helps you gain a deeper understanding of the bigger picture and the architectural design of the platform.
diff --git a/modules/concepts/pages/s3.adoc b/modules/concepts/pages/s3.adoc
@@ -41,7 +41,7 @@ spec:
 // ---------- Referencing -------------
 
 S3Bucket(s) reference S3Connection(s) objects. Both types of objects can be referenced by other resources. For example in a DruidCluster you can specify a bucket for deep storage and an S3Connection for data ingestion.
-S3 connection objects can be defined in a standalone fashion or they can be inlined into a bucket object. Similarly, a bucket can be defined in a standalone object or inlined into an enclosing object.
+S3Connection objects can be defined in a standalone fashion or they can be inlined into a bucket object. Similarly a bucket can be defined in a standalone object or inlined into an enclosing object.
 
 [excalidraw,s3-cluster-bucket-connection-reference,svg,width=70%]
 ----

diff --git a/modules/concepts/pages/service_discovery.adoc b/modules/concepts/pages/service_discovery.adoc
@@ -0,0 +1,93 @@
+= Service discovery
+
+Most products on the Stackable platform (and possibly other software in your stack) depend on other software to run or to be useful. For example, an analytics tool delegates the storing of data to other applications and interfaces with various data storage solutions. The Stackable platform uses _service discovery_ to let users and operators easily connect different products. The service discovery mechanism serves three use cases:
+
+* A Stackable-operated product instance requires a connection to another product to run. For example: NiFi requires ZooKeeper to run.
+* You want to connect software that is not operated by Stackable to a Stackable-operated product instance. For example: you wrote your own analytics tool and want to read data from Trino.
+* You want to offer a product instance that is not operated by a Stackable operator as a dependency to Stackable-operated product instances. For example: you already run a Druid cluster and want to connect it to a Superset instance operated by a Stackable operator.
+
+This page explains the general mechanism of service discovery and how this mechanism works in the contexts described above.
+
+== The service discovery ConfigMap
+
+In the Stackable platform, every product instance is defined by a resource with a name, for example a ZookeeperCluster:
+
+[source,yaml]
+----
+apiVersion: zookeeper.stackable.tech/v1alpha1
+kind: ZookeeperCluster
+metadata:
+ name: simple-zk
+spec:
+ ...
+----
+
+The operator reads the resource and creates the necessary pods and services to get the instance running. The operator is aware of the interfaces and connections that may be consumed by other products and it also knows all the details of the actual running processes.
+
+The operator creates a ConfigMap with the same name as the product instance, in the same namespace. The ConfigMap contains the information needed to connect to the product instance. For example, for a ZooKeeper cluster named `simple-zk`, the Stackable ZooKeeper operator creates a ConfigMap similar to this:
+
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: simple-zk
+data:
+ ZOOKEEPER: simple-zk-server-default-0.simple-zk-server-default.default.svc.cluster.local:2181,simple-zk-server-default-1.simple-zk-server-default.default.svc.cluster.local:2181
+----
+
+The connection information can be a single string, as above, or for example a JDBC connection string such as `jdbc:postgresql://localhost:12345`. A ConfigMap can also contain whole configuration files, which can then be mounted into a Pod. This is the case for xref:hdfs::discovery.adoc[HDFS], where the `core-site.xml` and `hdfs-site.xml` files are put into the discovery ConfigMap.
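+
+As a minimal sketch of the file-based case, the Pod below mounts a discovery ConfigMap as a volume, so that each key of the ConfigMap appears as a file in the container. The Pod name, image and mount path are hypothetical placeholders and not part of the platform; `simple-hdfs` is assumed to be the name of an existing HdfsCluster.
+
+[source,yaml]
+----
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-hdfs-client # hypothetical name
+spec:
+  containers:
+    - name: client
+      image: example.com/my-hdfs-client:latest # hypothetical image
+      volumeMounts:
+        - name: hdfs-discovery
+          mountPath: /etc/hadoop/conf # assumed path; core-site.xml and hdfs-site.xml show up here as files
+  volumes:
+    - name: hdfs-discovery
+      configMap:
+        name: simple-hdfs # discovery ConfigMap, named like the HdfsCluster
+----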
+
+*The ConfigMap always has the same name as the cluster resource.* This allows Stackable operators as well as users to look up service connection information simply by the name of the product instance they want to connect to.
+
+== Usage of the service discovery ConfigMap
+
+The three use cases for the service discovery ConfigMap outlined above are described in more detail below.
+
+=== Stackable internal usage
+
+Stackable operators use the discovery ConfigMap to automatically connect to service dependencies. For example, HBase requires HDFS to run. Given an HdfsCluster defined like this:
+
+[source,yaml]
+----
+apiVersion: hdfs.stackable.tech/v1alpha1
+kind: HdfsCluster
+metadata:
+ name: simple-hdfs
+spec:
+ ...
+----
+The HDFS instance is referenced in HBase like this:
+
+[source,yaml]
+----
+apiVersion: hbase.stackable.tech/v1alpha1
+kind: HbaseCluster
+metadata:
+ name: simple-hbase
+spec:
+ hdfsConfigMapName: simple-hdfs
+ ...
+----
+
+With the HdfsCluster name `simple-hdfs`, the HBase operator looks up the discovery ConfigMap of the same name, retrieves the information it needs to configure HBase, and configures the `simple-hbase` instance.
+
+=== Connect third-party products
+
+You can connect your own products to Stackable-operated product instances. Use the name of the instance to retrieve the ConfigMap and use the information it contains to connect your own service. Each product documents the contents of its discovery ConfigMap on its own page. You can find links to these pages below in the <<whats-next>> section.
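+
+For a ConfigMap that holds a plain value, such as the `ZOOKEEPER` entry of the `simple-zk` ConfigMap shown earlier, one possible approach is to inject the value into your own workload as an environment variable. The following is only a sketch: the Pod name, container name and image are hypothetical, and your application is assumed to read its connection string from the `ZOOKEEPER` environment variable.
+
+[source,yaml]
+----
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-zk-client # hypothetical name
+spec:
+  containers:
+    - name: app
+      image: example.com/my-zk-client:latest # hypothetical image
+      env:
+        - name: ZOOKEEPER # variable name chosen for this example
+          valueFrom:
+            configMapKeyRef:
+              name: simple-zk # discovery ConfigMap, named like the ZookeeperCluster
+              key: ZOOKEEPER
+----
+
+The Pod must run in the same namespace as the discovery ConfigMap, because a `configMapKeyRef` cannot reference a ConfigMap in another namespace.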
+
+=== Provide custom dependencies
+
+It is not uncommon to already have some core software running in your stack, such as HDFS. Looking at xref:hdfs::discovery.adoc[the discovery documentation for HDFS], you can see that the discovery ConfigMap for HDFS contains the `core-site.xml` and `hdfs-site.xml` files.
+
+If you are already operating an HDFS instance, you can simply provide a ConfigMap containing these files. You can then use the name of this ConfigMap in the configuration of other products, such as HBase.
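+
+The following sketch shows what this could look like. The ConfigMap name `my-external-hdfs` and the HbaseCluster name are made up for illustration, and the contents of the two XML files are deliberately left out, because they depend entirely on your existing HDFS installation. The `hdfsConfigMapName` field is the one used in the HBase example above.
+
+[source,yaml]
+----
+# Hand-written discovery ConfigMap for an HDFS instance not operated by Stackable
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: my-external-hdfs # hypothetical name
+data:
+  core-site.xml: |
+    <configuration>
+      <!-- properties of your existing HDFS installation -->
+    </configuration>
+  hdfs-site.xml: |
+    <configuration>
+      <!-- properties of your existing HDFS installation -->
+    </configuration>
+---
+# HbaseCluster referencing the hand-written ConfigMap by name
+apiVersion: hbase.stackable.tech/v1alpha1
+kind: HbaseCluster
+metadata:
+  name: hbase-on-external-hdfs # hypothetical name
+spec:
+  hdfsConfigMapName: my-external-hdfs
+  ...
+----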
+
+[#whats-next]
+== What's next
+
+Consult the discovery ConfigMap documentation for specific products:
+
+* xref:hdfs::discovery.adoc[Apache Hadoop HDFS]
+* xref:hive::discovery.adoc[Apache Hive]
+* xref:kafka::discovery.adoc[Apache Kafka]
+* xref:opa::discovery.adoc[OPA]
+* xref:zookeeper::discovery.adoc[Apache ZooKeeper]
diff --git a/modules/contributor/pages/service_discovery.adoc b/modules/contributor/pages/service_discovery.adoc
@@ -1,34 +1,11 @@
 :source-highlighter: highlight.js
 :highlightjs-languages: rust
 
-= Service Discovery
+= Service Discovery Implementation Guidelines
 
-== Introduction
+For a conceptual overview of service discovery, consult the xref:concepts:service_discovery.adoc[service discovery concept page].
 
-Several products deployed by the Stackable platform depend on other (Stackable) products. This could be a product that requires an external database, high availability support or synchronization.
-
-In order to programmatically resolve this dependency, the Stackable platform uses _service discovery_. A Stackable operator is aware of interfaces and connections that have to be exposed and may be consumed by other operators to configure their products. These interfaces or connections are usually referred to as _connection string_.
-
-As a real world example, the Stackable Operator for Apache Kafka has to configure Kafka brokers with an Apache ZooKeeper connection string in order to store and share information about e.g. Kafka topics. This connection string is provided by the Stackable Operator for Apache ZooKeeper, which is aware of all the pods and services related to ZooKeeper.
-
-== Examples for connection strings
-
-- JDBC SQL connection strings: `jdbc:postgresql://localhost:12345`
-- thrift protocol: `thrift://localhost:12345`
-- spark protocol: `spark://master:7077`
-- REST API: `\http://localhost:8080`
-- HDFS: `hdfs://localhost:12345`
-- ZooKeeper ZNode: `host1:2181,host2:2181/my-chroot`
-
-== Concepts of Service Discovery
-
-=== Architecture
-
-The Operator that provides service discovery writes a `ConfigMap` with all necessary information about its exposed services. Each service has its own entry in the `ConfigMap` as can be seen with the `ZOOKEEPER` entry below:
-
-image::service_discovery_arch.png[Service Discovery]
-
-=== Best practices
+== Best practices
 
 ==== Exposing config maps for service discovery
 

diff --git a/supplemental-ui/partials/navbar.hbs b/supplemental-ui/partials/navbar.hbs
@@ -1,6 +1,6 @@
 <a class="navbar-sub-item" href="{{{ relativize "/home/index.html" }}}">Home</a>
 <a class="navbar-sub-item" href="{{{ relativize "/home/getting_started.html" }}}">Getting Started</a>
-<a class="navbar-sub-item" href="{{{ relativize "/home/concepts/s3.html" }}}">Concepts</a>
+<a class="navbar-sub-item" href="{{{ relativize "/home/concepts/index.html" }}}">Concepts</a>
 <a class="navbar-sub-item" href="{{{ relativize "/home/tutorials/end-to-end_data_pipeline_example.html" }}}">Tutorials</a>
 <a class="navbar-sub-item" href="{{{ relativize "/stackablectl/stable/index.html" }}}">stackablectl</a>
 <div class="navbar-sub-item drop-down">