Teams (DevOps, SREs) Flashcards
(21 cards)
The CALMS acronym describes the DevOps culture as a list of objectives:
Culture, automation, lean, measurement and sharing
The C part of the CALMS acronym stands for […], and states that…
Culture, and states that everyone should work together with shared values
The A part of the CALMS acronym stands for […], and states that…
Automation, and states that everyone should strive to automate as many manual tasks as possible
The L part of the CALMS acronym stands for […], and states that…
Lean, and states that everyone must seek to eliminate waste, as it delays a product without improving it
The M part of the CALMS acronym stands for […], and states that…
Measurement, and states that metrics and logs must be monitored obsessively so that problems become apparent immediately
The S part of the CALMS acronym stands for […], and states that…
Sharing, and states that everyone must share information to enable collaboration between dev and operations teams
Developers are concerned with [agility/stability], while operators are concerned with [agility/stability].
Agility, stability
DevOps’ goal is to…
Break down the wall between developers and operators
One should reduce organisational silos, because…
Success comes from interoperation between cross-functional teams
Service-Level Objectives (SLOs) are…
Measurable targets for a specific aspect of service performance
We can help teams accept failure as normal by using…
Postmortem analysis and SLOs
Site Reliability Engineering (SRE) teams work alongside DevOps by…
Focusing mostly on the availability and reliability of the software product itself
Site Reliability Engineering teams implement the C part of the CALMS acronym, which stands for […], by…
Culture, by having a separate SRE team, though SREs are often embedded in development teams as consultants
Site Reliability Engineering teams implement the A part of the CALMS acronym, which stands for […], by…
Automation, by using software to eliminate toil, which is repetitive and boring operations work necessary to run a service
Site Reliability Engineering teams implement the L part of the CALMS acronym, which stands for […], by…
Lean, by limiting the amount of ‘work in progress’ via a control loop driven by error budget
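As a rough illustration of the error-budget control loop in the card above, the sketch below allows new releases only while some of the budget remains. The function names, the 99.9% target and the request counts are assumptions made up for this example; real teams may compute and spend their budgets differently.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    slo_target is assumed to lie strictly between 0 and 1, e.g. 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    # The budget is the number of failures the SLO tolerates for this traffic.
    allowed_failures = (1.0 - slo_target) * total
    return 1.0 - failed / allowed_failures


def may_release(slo_target: float, total: int, failed: int) -> bool:
    """Control-loop decision: take on new work only while budget remains."""
    return error_budget_remaining(slo_target, total, failed) > 0.0
```

With a 99.9% target and 1,000,000 requests, about 1,000 failures are tolerated: 400 failures leave roughly 60% of the budget, so releases continue, while 1,500 failures overspend it and releases pause until reliability recovers.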
Site Reliability Engineering teams implement the M part of the CALMS acronym, which stands for […], by…
Measurement, by choosing a small number of metrics and monitoring them obsessively
Site Reliability Engineering teams implement the S part of the CALMS acronym, which stands for […], by…
Sharing, by sharing knowledge, tools and techniques
An SRE is supposed to spend a significant amount of time on engineering work, and the rest on…
On-call duties
A service-level indicator (SLI) is a metric which is…
A quantitative measure of some aspect of the level of service provided
A service-level objective (SLO) is…
A target value or range of values for an SLI
A service-level agreement (SLA) is…
A contract which sets out the consequences for meeting or missing an SLO
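The last three cards can be tied together with a small sketch: an SLI is measured, an SLO is the target it is compared against, and an SLA attaches a consequence to the result. Everything named here (`availability_sli`, `AvailabilitySLO`, `sla_service_credit`, the 99.5% target and the 10% credit) is a hypothetical chosen for illustration, not taken from any particular service or contract.

```python
from dataclasses import dataclass


def availability_sli(successful: int, total: int) -> float:
    """SLI: the measured fraction of requests that succeeded."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


@dataclass
class AvailabilitySLO:
    """SLO: a target value for the availability SLI."""
    target: float  # e.g. 0.995 means 99.5% of requests should succeed

    def is_met(self, sli: float) -> bool:
        return sli >= self.target


def sla_service_credit(slo_met: bool, monthly_fee: float,
                       credit_rate: float = 0.10) -> float:
    """SLA: a hypothetical contractual consequence of missing the SLO."""
    return 0.0 if slo_met else monthly_fee * credit_rate


sli = availability_sli(successful=9_940, total=10_000)  # measured value
slo = AvailabilitySLO(target=0.995)                     # agreed target
credit = sla_service_credit(slo.is_met(sli), monthly_fee=200.0)
```

In this made-up month the measured availability (99.4%) falls short of the 99.5% target, so the hypothetical SLA owes the customer a credit of 10% of the monthly fee.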