Commit 9d0662d

authored
Add questions on SRE and Chaos Engineering (bregman-arie#325)
1 parent 407533d commit 9d0662d

2 files changed

+44
-2
lines changed

topics/chaos_engineering/README.md

Lines changed: 14 additions & 0 deletions
@@ -28,4 +28,18 @@ According to [Gremlin](gremlin.com) there are three steps:
2828

2929
The process then repeats itself either with same scenario or a new one.
3030

31+
</b></details>
32+
33+
<details>
34+
<summary>Cite a few tools used to operate Chaos exercises</summary><br><b>
35+
36+
- AWS Fault Injection Simulator: inject failures in AWS resources
37+
- Azure Chaos Studio: inject failures in Azure resources
38+
- Chaos Monkey: one of the most famous tools to orchestrate Chaos on diverse Cloud providers
39+
- Litmus: a chaos engineering framework for Kubernetes
40+
- Chaos Mesh: for Cloud Kubernetes platforms
41+
42+
43+
See an extensive list [here](https://github.com/dastergon/awesome-chaos-engineering)
44+
3145
</b></details>

topics/devops/README.md

Lines changed: 30 additions & 2 deletions
@@ -393,10 +393,10 @@ This situation might lead to bugs which hard to identify and reproduce.
393393
<details>
394394
<summary>Explain Declarative and Procedural styles. The technologies you are familiar with (or using) are using procedural or declarative style?</summary><br><b>
395395

396-
Declarative - You write code that specifies the desired end state<br><b>
396+
Declarative - You write code that specifies the desired end state<br>
397397
Procedural - You describe the steps to get to the desired end state
398398

399-
Declarative Tools - Terraform, Puppet, CloudFormation, Ansible<br><b>
399+
Declarative Tools - Terraform, Puppet, CloudFormation, Ansible<br>
400400
Procedural Tools - Chef
401401

402402
To better emphasize the difference, consider creating two virtual instances/servers.
@@ -506,3 +506,31 @@ Google: "Monitoring is one of the primary means by which service owners keep tra
506506

507507
Read more about it [here](https://sre.google/sre-book/introduction)
508508
</b></details>
509+
510+
<details>
511+
<summary>What are the two main SRE KPIs?</summary><br><b>
512+
513+
Service Level Indicators (SLI) and Service Level Objectives (SLO).
514+
</b></details>
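
To make the SLI/SLO relationship concrete, here is a minimal Python sketch. It assumes an availability SLI measured as the ratio of successful requests to total requests, and an SLO expressed as a target for that ratio. The function names, the sample request counts, and the 99.5% target are hypothetical and chosen only for illustration; the "error budget" calculation is a common SRE companion concept that the answer above does not mention.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully (the measured indicator)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """True when the measured SLI is at or above the objective."""
    return sli >= slo_target


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent (negative once overspent)."""
    allowed_failure = 1.0 - slo_target
    if allowed_failure <= 0:
        raise ValueError("slo_target must be below 1.0")
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical numbers: 9,990 successful requests out of 10,000.
    sli = availability_sli(good_requests=9_990, total_requests=10_000)
    slo = 0.995  # hypothetical objective: 99.5% of requests succeed
    print(f"SLI: {sli:.4f}")
    print(f"Meets SLO: {meets_slo(sli, slo)}")
    print(f"Error budget remaining: {error_budget_remaining(sli, slo):.0%}")
```

The SLI is what is measured, while the SLO is the target the measurement is compared against; the two are kept as separate inputs here so that either can change independently.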
515+
516+
<details>
517+
<summary>What is Toil?</summary><br><b>
518+
519+
Google: "Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows."
520+
521+
Read more about it [here](https://sre.google/sre-book/eliminating-toil/)
522+
</b></details>
523+
524+
525+
<details>
526+
<summary>What is a postmortem?</summary><br><b>
527+
528+
The postmortem is a process that should take place following an incident. Its purpose is to identify the root cause of the incident and the actions that should be taken to prevent this kind of incident from happening again.</b></details>
529+
530+
531+
<details>
532+
<summary>What is the core value often put forward when talking about postmortem?</summary><br><b>
533+
534+
Blamelessness.
535+
Postmortems need to be blameless, and this value should be restated at the beginning of every postmortem. This is the best way to ensure that people work together to find the root cause rather than trying to hide their own possible mistakes.</b></details>
536+
