stackabletech · ohessel · Sep 14, 2020 · Aug 26, 2020 · Sep 14, 2020 · Sep 14, 2020
diff --git a/adr/drafts/ADRx-choose_agent_programming_language.adoc b/adr/drafts/ADRx-choose_agent_programming_language.adoc
@@ -0,0 +1,84 @@
+= Use xxx as programming language for the agent
+Sönke Liebau <soenke.liebau@stackable.de>
+v1.0, 19.08.2020
+:status: draft
+
+* Status: {status}
+* Deciders:
+** Florian Waibel
+** Lars Francke
+** Lukas Menzel
+** Bernd Fondermann
+** Oliver Hessel
+** Sönke Liebau
+* Date: 
+
+Technical Story: https://hosting-jira.1and1.org/browse/DFBAICC-520
+
+== Context and Problem Statement
+
+Which programming language should be used in the implementation of the agent that will manage tool installations on servers?
+
+== Decision Drivers
+
+* The ability to deploy the agent as one binary with no external dependencies
+* Availability of well supported libraries for necessary operations
+** File IO
+** Network IO
+** RPC (depends on link:./ADRx-Protocol-to-use-for-communication-between-agent-and-orchestrator.html[ADRx-Protocol to use for communication between agent and orchestrator])
+** systemd
+* IDE support
+* Debugging options
+
+== Considered Options
+
+* Java
+* Go
+* Rust
+
+== Decision Outcome
+
+Chosen option: "[option 1]", because [justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force force | … | comes out best (see below)].
+
+=== Positive Consequences <!-- optional -->
+
+* [e.g., improvement of quality attribute satisfaction, follow-up decisions required, …]
+* …
+
+=== Negative Consequences <!-- optional -->
+
+* [e.g., compromising quality attribute, follow-up decisions required, …]
+* …
+
+== Pros and Cons of the Options <!-- optional -->
+
+=== Java
+
+* Good, because easy to find developers
+* Good, because team is very familiar with it
+* Bad, because it needs a JVM as a dependency and is not deployable as a single binary
+** GraalVM could produce a native binary, but it has many drawbacks and its licensing is uncertain, as it is an Oracle product
+
+=== Go
+
+* Good, because it compiles to a single binary on many platforms
+* Good, because Kubernetes also uses it
+* Good, because there is proper IDE support with debugging
+* Bad, because it is a new language for many team members to learn
+* Bad, because the lack of generics may be an issue and lead to less readable code
+* Bad, because it is still a garbage-collected language
+* Todo: check library availability
+
+=== Rust
+
+* Good, because it compiles to a single binary on many platforms
+* Good, because no garbage collection
+* Good, because its compiler enforces memory safety, which supports a high level of security
+* Bad, because it is a new language for many team members to learn
+* Bad, because it is potentially very tough to find developers - arguably not a real drawback, as people will need to be willing to learn something new anyway
+* Todo: check library availability
+
+== Links <!-- optional -->
+
+* [Link type] [Link to ADR] <!-- example: Refined by [ADR-0005](0005-example.md) -->
+* … <!-- numbers of links can vary -->
diff --git a/adr/drafts/ADRx-choose_authorization_engine.adoc b/adr/drafts/ADRx-choose_authorization_engine.adoc
@@ -0,0 +1,102 @@
+= Choose Authorization Engine
+Doc Writer <doc.writer@asciidoctor.org>
+v0.1, dd.mm.yyyy
+:status: draft
+
+* Status: {status}
+* Deciders:
+** Florian Waibel
+** Lars Francke
+** Lukas Menzel
+** Bernd Fondermann
+** Oliver Hessel
+** Sönke Liebau
+* Date: xxx
+
+== Context and Problem Statement
+
+We need some form of authorization engine, both for the products that are deployed via our stack and for our internal APIs.
+This engine should have the ability to express universal access controls, as it will need to be adapted to many different end products:
+
+* Stackable
+* Hadoop
+* Kafka
+* Airflow
+* Elasticsearch
+* ...
+
+Depending on which option is chosen, there is a second, implicit, decision that is taken as part of this record: whether or not to include an identity provider.
+Keycloak and Ranger both offer user management on top of authorization, whereas Open Policy Agent is purely an authorization engine.
+
+I'm not sure if we need to split this decision out into a separate ADR, but I suspect that it may make sense.
+If Open Policy Agent is chosen as part of this ADR, at some point we need to decide whether we also need an identity provider and if so, which one we should pick.
+
+
+== Decision Drivers <!-- optional -->
+
+* Availability of plugins for initial components or expected effort for implementation
+* Flexibility of rule engine
+
+== Considered Options
+
+* Ranger
+* Open Policy Agent
+* Keycloak
+
+
+== Decision Outcome
+
+Chosen option: "[option 1]", because [justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force force | … | comes out best (see below)].
+
+=== Positive Consequences
+
+* [e.g., improvement of quality attribute satisfaction, follow-up decisions required, …]
+* …
+
+=== Negative Consequences
+
+* [e.g., compromising quality attribute, follow-up decisions required, …]
+* …
+
+== Pros and Cons of the Options
+
+=== Ranger
+
+https://ranger.apache.org/[Ranger] is the de facto default authorization tool in the big data ecosystem.
+It offers existing integrations with a variety of tools and is used in Cloudera's offering as the central access management component.
+
+* Good, because most necessary integrations already exist
+* Good, because existing know-how applies
+* Good, because it offers identity provider functionality
+* Bad, because adding new tools is complex
+* Bad, because objects to authorize on need to be defined in code (see Open Policy Agent for comparison)
+* Bad, because user synchronization mechanisms are fairly limited
+
+=== Open Policy Agent
+
+https://www.openpolicyagent.org/[Open Policy Agent] is a universal authorization engine that has recently become popular, particularly (but not exclusively) in the Kubernetes environment.
+OPA defines ACLs in an abstract language called https://www.openpolicyagent.org/docs/latest/policy-language/[Rego] which allows keeping
+
+
+
+* Good, because implementing new tools takes relatively little effort
+* Good, because it offers a very flexible system for defining ACLs
+* Bad, because it has no real HA concept
+* Bad, because only one authorizer (Kafka) is already implemented
+* Bad, because it would require an additional identity provider
+
+=== Keycloak
+
+https://www.keycloak.org/[Keycloak] is based on a WildFly application server and is probably the most fully featured alternative of those discussed.
+It allows integration with LDAP and AD, offers authorization, a clustered mode for high availability and much more.
+
+* Good, because it gives a high degree of flexibility in adapting to customers' identity solutions
+* Good, because it is well established and widely used (GAIA-X, SCS)
+* Bad, because there are no existing authorization plugins
+* Bad, because objects to authorize on need to be defined in code (see Open Policy Agent for comparison)
+
+
+== Links
+
+* [Link type] [Link to ADR] <!-- example: Refined by [ADR-0005](0005-example.md) -->
+* … <!-- numbers of links can vary -->
diff --git a/adr/drafts/ADRx-choose_orchestrator_storage_backend.adoc b/adr/drafts/ADRx-choose_orchestrator_storage_backend.adoc
@@ -0,0 +1,84 @@
+= Use xxx as storage backend for the orchestrator
+Sönke Liebau <soenke.liebau@stackable.de>
+v0.1, 19.08.2020
+:status: draft
+
+* Status: {status}
+* Deciders:
+** Florian Waibel
+** Lars Francke
+** Lukas Menzel
+** Bernd Fondermann
+** Oliver Hessel
+** Sönke Liebau
+* Date:
+
+Technical Story: [description | ticket/issue URL] <!-- optional -->
+
+== Context and Problem Statement
+
+The orchestrator will need some form of persistent storage backend, for which a decision on the technology to be used has to be taken.
+Our usage of this storage will most probably be extremely simple: even if a SQL database is chosen, we expect to use it much like a key-value store.
+
+== Decision Drivers
+
+* Availability of libraries for the programming language chosen for the orchestrator
+* How established is the backend at potential customers? Will we need to deploy it?
+
+
+== Considered Options
+
+* etcd
+* Zookeeper
+* SQL Database
+
+== Decision Outcome
+
+
+
+=== Positive Consequences
+
+*
+
+=== Negative Consequences
+
+*
+== Pros and Cons of the Options
+
+=== etcd
+
+https://etcd.io/
+
+* Good, because etcd is used by Kubernetes
+** It is likely to be deployed already
+** Kubernetes admins' existing etcd expertise can be reused
+* Good, because it offers watch functionality
+* Good, because it offers consensus mechanisms
+* Bad, because it has a hard size limit
+* Bad, because it does not work well with large numbers of requests
+
+=== Zookeeper
+
+https://zookeeper.apache.org/
+
+* Good, because it is well established and understood
+* Good, because it offers watch functionality
+* Good, because it offers consensus mechanisms
+* Bad, because it offers no real benefit over etcd
+* Bad, because it is known to have trouble with a high volume of changes
+
+=== SQL Database
+
+* Good, because expertise and processes for some form of database will be present at pretty much any customer
+** Backup
+** HA
+* Good, because deploying in an integrated test/dev environment is easy with SQLite
+* Bad, because we would potentially need to support multiple database vendors
+** Postgres
+** MS SQL
+** Oracle
+** …
+
+
+
+== Links
diff --git a/adr/drafts/ADRx-decide_reuse_of_operators.adoc b/adr/drafts/ADRx-decide_reuse_of_operators.adoc
@@ -0,0 +1,70 @@
+= Allow Reuse of Existing Kubernetes Operators
+Sönke Liebau <soenke.liebau@stackable.de>
+v0.1, 19.08.2020
+:status: draft
+
+* Status: {status}
+* Deciders:
+** Florian Waibel
+** Lars Francke
+** Lukas Menzel
+** Bernd Fondermann
+** Oliver Hessel
+** Sönke Liebau
+* Date:
+
+
+== Context and Problem Statement
+
+For some of the tools we plan to integrate there are existing operators that deploy these tools on Kubernetes.
+Most notably these tools are:
+
+* Spark
+* Kafka
+
+Some implementation effort may be avoided by reusing these operators instead of recreating the tool-specific functionality that is already implemented.
+Since these operators are designed to work with Kubernetes and are thus exclusively focused on containers, some translation of data structures and processes would be necessary.
+
+== Decision Drivers <!-- optional -->
+
+* Keeping the implementation effort as low as realistically possible
+* Keeping compatibility with Kubernetes as far as possible to ease a later move towards Kubernetes deployments
+* Avoiding hard dependencies on external projects that may force us to fork if they break compatibility with us
+
+== Considered Options
+
+* Allow reuse of Kubernetes operators (would need to be decided individually for every tool)
+* Don't reuse operators
+
+== Decision Outcome
+
+Chosen option: "[option 1]", because [justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force force | … | comes out best (see below)].
+
+=== Positive Consequences
+
+* [e.g., improvement of quality attribute satisfaction, follow-up decisions required, …]
+* …
+
+=== Negative Consequences
+
+* [e.g., compromising quality attribute, follow-up decisions required, …]
+* …
+
+== Pros and Cons of the Options
+
+=== Allow reuse of Kubernetes operators
+
+* Good, because it saves implementation effort
+* Good, because this forces us to consider Kubernetes compatibility repeatedly
+* Bad, because we create a dependency on another project that may at some point break compatibility
+* Bad, because we need to adapt to interfaces that have been designed specifically with containers in mind
+
+=== Don't reuse operators
+
+* Good, because it allows us to build our operators the way that works best for us
+* Good, because we do not depend on the quality of external projects that may have implemented only partial functionality (e.g. security)
+* Bad, because we repeat work that has already been done
+
+== Links
+
+* https://kubernetes.io/docs/concepts/extend-kubernetes/operator/[Operator pattern description]