L7 - Cloud Monitoring Flashcards

1
Q

Why monitor?

A

To make the best use of your rented resources: reduce your costs and increase the satisfaction of your service's users

2
Q

observable system

A

one that exposes enough data about itself that generating information (finding answers to questions yet to be formulated) and accessing that information is simple

3
Q

monitoring

A

process of collecting status information about applications and resources; the data can be used to observe the application and the infrastructure

4
Q

monitoring system

A

consists of all components for gathering monitoring data at runtime

5
Q

2 ways to create information

A
  • proactively: through continuous analysis for triggering alarms or to give an overview of the status of the system
  • reactively: triggered by events such as incidents (e.g. root cause analysis and autoscaling)
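
The two modes can be sketched as follows. This is only an illustration: the CPU series, the threshold, and the function names are hypothetical. The proactive path checks every sample as it arrives, while the reactive path looks at history only once an incident has been reported.

```python
from statistics import mean

CPU_ALARM_THRESHOLD = 0.9  # hypothetical threshold, as a fraction of capacity


def proactive_check(samples, threshold=CPU_ALARM_THRESHOLD):
    """Continuous analysis: indices of samples that would trigger an alarm."""
    return [i for i, value in enumerate(samples) if value > threshold]


def reactive_analysis(samples, incident_index, window=3):
    """Event-triggered analysis: summarize the samples leading up to an incident."""
    start = max(0, incident_index - window)
    recent = samples[start:incident_index + 1]
    return {"window": recent, "mean": mean(recent), "peak": max(recent)}


cpu = [0.35, 0.40, 0.55, 0.80, 0.95, 0.97, 0.60]
alarms = proactive_check(cpu)                      # runs all the time
report = reactive_analysis(cpu, incident_index=5)  # runs after an incident
```
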
6
Q

What is the purpose of monitoring at the infrastructure level?

A
  • resource management
  • incident detection
  • root cause analysis
  • metering for payment
  • auditing
7
Q

What is the purpose of monitoring at application level?

A
  • performance analysis
  • resource management
  • failure detection and resolution
  • SLA verification
  • auditing
8
Q

How does monitoring take place in a parallel system?

A
  • batch system
  • data are collected during an application run
  • analysis happens post mortem
  • execution is reproducible
9
Q

How does monitoring take place in the cloud?

A
  • interactive system
  • data are continuously produced - realtime data
  • realtime analysis
  • data used for immediate action or to study past system behavior
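
A rough way to contrast this with the batch style of the previous card, using an assumed trace of measurements: post-mortem analysis needs the complete trace, while the online variant updates its result with every new sample and can be read at any time. The class and function names are hypothetical.

```python
def post_mortem_mean(trace):
    """Batch style: analyze the complete trace after the run has finished."""
    return sum(trace) / len(trace)


class RunningMean:
    """Cloud style: update the result incrementally as each sample arrives."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


trace = [10.0, 20.0, 30.0, 40.0]
online = RunningMean()
live_view = [online.update(v) for v in trace]  # available while data is produced
final = post_mortem_mean(trace)                # available only after the run
```
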
10
Q

3 pillars of monitoring

A
  • metrics
  • logs
  • traces
11
Q

4 important metrics in monitoring

A
  • latency
  • throughput or traffic
  • error rate
  • utilization or saturation
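
As a small sketch of how the four metrics relate, the snippet below derives them from one observation window. The `Request` record, the window length, and the busy time are assumptions made up for the example, not part of any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float  # time taken to service the request
    ok: bool           # True if the request succeeded


def summarize(requests, window_s, busy_s):
    """Derive the four metrics from one observation window.

    window_s: length of the window in seconds
    busy_s:   seconds the (hypothetical) server spent doing work
    """
    total = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "latency_s": sum(r.duration_s for r in requests) / total,
        "throughput_rps": total / window_s,
        "error_rate": failed / total,
        "utilization": busy_s / window_s,
    }


requests = [Request(0.10, True), Request(0.30, True),
            Request(0.20, False), Request(0.20, True)]
metrics = summarize(requests, window_s=2.0, busy_s=0.8)
```
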
12
Q

What is latency?

A
  • time it takes to service a request
  • measured separately for successful and failed requests
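
One reason to keep the two groups apart is that failures can be very fast (immediate rejection) or very slow (timeouts), and either would distort a combined figure. A minimal sketch, assuming samples arrive as hypothetical `(duration_ms, succeeded)` pairs:

```python
from statistics import median


def latency_by_outcome(samples):
    """samples: iterable of (duration_ms, succeeded) pairs."""
    ok = [d for d, succeeded in samples if succeeded]
    failed = [d for d, succeeded in samples if not succeeded]
    return {
        "success_median_ms": median(ok) if ok else None,
        "error_median_ms": median(failed) if failed else None,
    }


samples = [(120, True), (95, True), (110, True), (5, False), (3000, False)]
result = latency_by_outcome(samples)
```
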
13
Q

What is throughput or traffic?

A
  • web services: requests/second
  • streaming system: network I/O rate or concurrent sessions
  • database: transactions/second or retrievals/second
14
Q

What is the error rate?

A
  • rate of requests that fail
15
Q

What is utilization or saturation?

A
  • percentage of capacity
  • CPU, memory, I/O
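
A minimal sketch of "percentage of capacity". Disk space stands in for CPU, memory, and I/O here only because Python's standard library exposes it portably through `shutil.disk_usage`; the 0.9 saturation limit is an arbitrary assumption.

```python
import shutil


def disk_utilization(path="."):
    """Used fraction of the filesystem that contains `path` (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0


def is_saturated(utilization, limit=0.9):
    """Treat a resource as saturated once it reaches the (assumed) limit."""
    return utilization >= limit


current = disk_utilization()
```
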
16
Q

What are metrics collected for at the microservice level?

A

Autoscaling, performance tuning

17
Q

What are metrics collected for at the platform level (e.g. K8s or Docker)?

A

container distribution, autoscaling of the VM cluster

18
Q

What are metrics collected for at the infrastructure level?

A
  • root cause analysis
19
Q

What are metrics collected for at the hardware level?

A

management of VMs

20
Q

Monitoring system requirements

A
  • comprehensive
  • low intrusion
  • extensibility
  • scalability
  • elasticity
  • accuracy
21
Q

What is Blackbox Monitoring?

A
  • the monitored system is treated as a black box
  • no data is obtained from inside the system
  • e.g. only the request interface of a service is visible. Nothing about the internal structure

From the internet:
Black box monitoring refers to the monitoring of servers with a focus on areas such as disk space, CPU usage, memory usage, load averages, etc.

22
Q

What is whitebox monitoring?

A
  • data also comes from inside the system
  • this gives more context and more detailed insights
  • e.g. internal organization of a service is visible

e.g. detecting behavior we don’t expect to see, such as a user not going through the normal steps you’d expect when signing in to your application or resetting a password.
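
The difference between the two views can be sketched with a toy service. Everything here (the `Service` class, its cache counters, the probe functions) is hypothetical: the blackbox probe only uses the request interface, while the whitebox snapshot reads counters the service keeps about its own internals.

```python
import time


class Service:
    """Hypothetical service with a public interface and internal state."""

    def __init__(self):
        self._cache = {}
        self.cache_hits = 0    # internal counters, only visible
        self.cache_misses = 0  # to whitebox monitoring

    def handle(self, key):
        if key in self._cache:
            self.cache_hits += 1
        else:
            self.cache_misses += 1
            self._cache[key] = key.upper()
        return self._cache[key]


def blackbox_probe(service, key):
    """Sees only the request interface: did it answer, and how fast?"""
    start = time.perf_counter()
    try:
        service.handle(key)
        ok = True
    except Exception:
        ok = False
    return {"ok": ok, "latency_s": time.perf_counter() - start}


def whitebox_snapshot(service):
    """Reads metrics the service exposes about its own internals."""
    total = service.cache_hits + service.cache_misses
    return {"cache_hit_ratio": service.cache_hits / total if total else 0.0}


svc = Service()
probes = [blackbox_probe(svc, k) for k in ["a", "b", "a", "a"]]
snapshot = whitebox_snapshot(svc)
```

The probe results say nothing about why a request was fast or slow; the cache hit ratio is the kind of extra context that only the whitebox view provides.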

23
Q

Why is there overhead in monitoring?

A
  • instrumentation
  • computation for aggregations
  • memory overhead for buffering
  • time to push to disk or transfer to collector
  • storage overhead for long-term storage
24
Q

What is instrumentation?

A

Instrumentation is the process of adding code to your application so you can understand its inner state
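
A minimal sketch of instrumentation in Python, assuming a hand-rolled decorator rather than any specific monitoring library. The `instrumented` decorator, the two dictionaries, and `resize_image` are names invented for the example.

```python
import functools
import time
from collections import defaultdict

call_counts = defaultdict(int)
total_seconds = defaultdict(float)


def instrumented(func):
    """Wrap a function so each call records a count and its duration."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even if the call raises, so failures are counted too.
            call_counts[func.__name__] += 1
            total_seconds[func.__name__] += time.perf_counter() - start
    return wrapper


@instrumented
def resize_image(width, height):
    return width * height  # stand-in for real work


resize_image(4, 3)
resize_image(2, 2)
```

The extra bookkeeping on every call is itself a small example of the monitoring overhead discussed in the previous card.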

25
Q

What does overhead lead to?

A

intrusion

26
Q

How can overhead in monitoring be reduced?

A
  • number of metrics
  • measurement frequency
  • representation
  • batching
  • sampling
  • long-term coarsening
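
Three of these levers (sampling, batching, long-term coarsening) can be sketched in a few lines. The reporter class, its parameters, and the `coarsen` helper are hypothetical; real collectors implement the same ideas with their own interfaces.

```python
import random


class BatchingReporter:
    """Buffers measurements and hands them to `send` in batches."""

    def __init__(self, send, batch_size=3, sample_rate=1.0, rng=None):
        self.send = send
        self.batch_size = batch_size
        self.sample_rate = sample_rate  # fraction of measurements to keep
        self.rng = rng or random.Random()
        self.buffer = []

    def record(self, value):
        if self.rng.random() >= self.sample_rate:
            return  # sampling: drop this measurement
        self.buffer.append(value)
        if len(self.buffer) >= self.batch_size:
            self.flush()  # batching: one transfer for many measurements

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()


def coarsen(samples, factor):
    """Long-term coarsening: replace each group of `factor` samples by its mean."""
    chunks = [samples[i:i + factor] for i in range(0, len(samples), factor)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


sent = []
reporter = BatchingReporter(sent.append, batch_size=3)
for v in range(7):
    reporter.record(v)
reporter.flush()  # push the remaining partial batch

archived = coarsen([1, 3, 5, 7, 9], factor=2)
```
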
27
Q

What is a log?

A

sequence of immutable records of discrete events
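
This definition maps naturally onto an append-only structure. A minimal sketch, assuming a hypothetical `LogRecord` type and using a frozen dataclass so that records cannot be changed once written:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One discrete event; frozen, so it cannot be modified after creation."""
    timestamp: float
    message: str


class AppendOnlyLog:
    """A sequence of records that can only grow."""

    def __init__(self):
        self._records = []

    def append(self, timestamp, message):
        record = LogRecord(timestamp, message)
        self._records.append(record)
        return record

    def records(self):
        return tuple(self._records)  # read-only view of the sequence


log = AppendOnlyLog()
first = log.append(1.0, "service started")
log.append(2.5, "request failed")
```
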

28
Q

What can an event log be composed of?

A
  • plaintext = most common format of logs
  • structured = much evangelized, typically JSON
  • binary = think logs in the Protobuf format
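
To make the three formats concrete, the sketch below writes one made-up event in each of them. Protobuf is not in Python's standard library, so the binary variant uses a fixed `struct` layout purely as a stand-in; the event fields are assumptions.

```python
import json
import struct

event = {"ts": 1700000000, "level": "ERROR", "status": 503,
         "msg": "upstream timeout"}

# Plaintext: easy to read, but parsing depends on ad hoc conventions.
plaintext = f'{event["ts"]} {event["level"]} status={event["status"]} {event["msg"]}'

# Structured: self-describing fields, typically JSON.
structured = json.dumps(event, sort_keys=True)

# Binary: compact fixed layout (unsigned 32-bit timestamp followed by an
# unsigned 16-bit status, network byte order); needs the schema to decode.
binary = struct.pack("!IH", event["ts"], event["status"])
decoded_ts, decoded_status = struct.unpack("!IH", binary)
```
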
29
Q

What is the ELK Stack?

A

ELK is the acronym for three open-source projects:
- Elasticsearch: search and analytics engine
- Logstash: server-side data processing pipeline
- Kibana: lets users visualize data with charts and graphs

30
Q

What is the Elastic Stack?

A

The next evolution of the ELK Stack: ELK + Beats and X-Pack.

At its core is Elasticsearch, the open-source, distributed, RESTful, JSON-based search engine. Easy to use, scalable and flexible, it earned hyper-popularity among users, and a company formed around it, you know, for search.