Final Part 3 Flashcards

1
Q

Why is troubleshooting often viewed as an innate skill that some people have and others don’t?

A

For those who troubleshoot often, it’s an ingrained process; explaining how to troubleshoot is difficult, much like explaining how to ride a bike.

However, troubleshooting is both learnable and teachable.

2
Q

What two factors explain why novices are often tripped up by troubleshooting?

A

Effective troubleshooting depends on two things novices often lack: an understanding of how to troubleshoot generically (i.e., without any particular system knowledge) and a solid knowledge of the system.

3
Q

Hypothetico-deductive method

A

Given a set of observations about a system and a theoretical basis for understanding system behavior, we iteratively hypothesize potential causes for the failure and try to test those hypotheses.
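A minimal Python sketch of this loop, assuming each hypothesis carries a test that returns True when the observations confirm it. The `Hypothesis` type, the `troubleshoot` function, and the sample observations are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Observations = dict[str, float]


@dataclass
class Hypothesis:
    """A candidate cause plus a test that checks it against observations."""
    cause: str
    test: Callable[[Observations], bool]


def troubleshoot(observations: Observations,
                 hypotheses: list[Hypothesis]) -> Optional[str]:
    """Test each hypothesis in turn; return the first confirmed cause."""
    for hypothesis in hypotheses:
        if hypothesis.test(observations):
            return hypothesis.cause
    return None  # nothing confirmed: gather more data, form new hypotheses


# Hypothetical observations and candidate causes.
observations = {"disk_used_pct": 99.0, "cpu_load": 0.4}
candidates = [
    Hypothesis("CPU overload", lambda o: o["cpu_load"] > 0.9),
    Hypothesis("Disk full", lambda o: o["disk_used_pct"] > 95.0),
]
print(troubleshoot(observations, candidates))  # Disk full
```

In practice a `None` result sends you back to collect more observations and propose new hypotheses, which is the iterative part of the method.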

4
Q

What are the steps in an ideal troubleshooting model?

A

We’d start with a problem report telling us that something is wrong with the system.

Then we can look at the system’s telemetry and logs to understand its current state.

This information, combined with our knowledge of how the system is built, how it should operate, and its failure modes, enables us to identify some possible causes.

5
Q

Postmortem

A

A written record of an incident, its impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions to prevent the incident from recurring.

6
Q

Why are postmortems needed?

A

Without some formalized process for learning from incidents, they can multiply in complexity or even cascade, overwhelming a system and its operators and ultimately impacting our users.

7
Q

Reasons to monitor a system include

A

Analyzing long-term trends

Comparing over time or experiment groups

Alerting

8
Q

The four golden signals of monitoring

A

Latency, traffic, errors, and saturation

If you can only measure four metrics of your user-facing system, focus on these four.

9
Q

Latency

A

The time it takes to service a request. Distinguish between the latency of successful requests and the latency of failed requests, and track error latency rather than just filtering out errors.
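A small Python sketch of keeping the two latencies apart. The request records and the `median_latency` helper are hypothetical, and the median is only one possible summary statistic.

```python
from statistics import median

# Hypothetical request records: (latency in milliseconds, succeeded?)
requests = [(120, True), (95, True), (30, False), (110, True), (2500, False)]


def median_latency(records, succeeded):
    """Median latency of the requests whose outcome matches `succeeded`."""
    return median(ms for ms, ok in records if ok == succeeded)


success_ms = median_latency(requests, succeeded=True)   # 110
error_ms = median_latency(requests, succeeded=False)    # 1265.0
print(success_ms, error_ms)
```

Mixing the fast failure (30 ms) and the slow failure (2500 ms) into the success numbers would distort both views, which is why they are tracked separately.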

10
Q

Traffic

A

A measure of how much demand is being placed on your system, measured in a high-level system-specific metric.
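For a web service, a common choice of metric is HTTP requests per second. A minimal Python sketch, where `requests_per_second` is a hypothetical helper:

```python
def requests_per_second(request_count: int, window_seconds: float) -> float:
    """Traffic expressed as average requests per second over a window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds


print(requests_per_second(600, 60))  # 10.0
```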

11
Q

Errors

A

The rate of requests that fail, either explicitly (e.g., HTTP 500s), implicitly, or by policy (for example, “If you committed to one-second response times, any request over one second is an error”).
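A minimal Python sketch that counts explicit failures (HTTP 5xx) and policy failures (responses slower than a commitment). The one-second `SLO_SECONDS` value, the helper names, and the sample records are assumptions; implicit failures, such as a success response with the wrong content, would need a separate content check.

```python
SLO_SECONDS = 1.0  # assumed commitment: respond within one second


def is_error(status: int, latency_s: float) -> bool:
    """Explicit failure (HTTP 5xx) or policy failure (slower than the SLO)."""
    return status >= 500 or latency_s > SLO_SECONDS


def error_rate(records) -> float:
    """Fraction of requests that count as errors."""
    if not records:
        return 0.0
    return sum(is_error(status, latency) for status, latency in records) / len(records)


# Hypothetical request records: (HTTP status code, latency in seconds)
records = [(200, 0.2), (500, 0.1), (200, 1.7), (200, 0.4)]
print(error_rate(records))  # 0.5
```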

12
Q

Saturation

A

How “full” your service is. A measure of the fraction of your system’s capacity in use, emphasizing the resources that are most constrained (e.g., in a memory-constrained system, show memory; in an I/O-constrained system, show I/O).
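A minimal Python sketch that reports the most constrained resource. The `saturation` function, the resource names, and the figures are hypothetical.

```python
def saturation(usage: dict[str, float],
               capacity: dict[str, float]) -> tuple[str, float]:
    """Return the most constrained resource and its utilization (0.0 to 1.0)."""
    utilization = {name: usage[name] / capacity[name] for name in usage}
    most_constrained = max(utilization, key=utilization.get)
    return most_constrained, utilization[most_constrained]


# Hypothetical figures for one host.
usage = {"memory_gb": 14.0, "disk_iops": 300.0}
capacity = {"memory_gb": 16.0, "disk_iops": 1000.0}
print(saturation(usage, capacity))  # ('memory_gb', 0.875)
```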

13
Q

If you measure all four golden signals and page a human when one signal is problematic

A

Your service will be at least decently covered by monitoring.
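A minimal Python sketch of such a paging rule, assuming one static threshold per signal. The threshold values and the `signals_to_page` helper are hypothetical; real thresholds depend on the service.

```python
# Hypothetical thresholds; real values depend on the service.
THRESHOLDS = {
    "latency_ms": 500.0,
    "traffic_rps": 10_000.0,
    "error_rate": 0.01,
    "saturation": 0.9,
}


def signals_to_page(current: dict[str, float]) -> list[str]:
    """Names of the golden signals whose current value exceeds its threshold."""
    return [name for name, limit in THRESHOLDS.items() if current[name] > limit]


current = {"latency_ms": 120.0, "traffic_rps": 800.0,
           "error_rate": 0.04, "saturation": 0.5}
print(signals_to_page(current))  # ['error_rate']
```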

14
Q

Why is boring a positive attribute of software?

A

You don’t want programs to be spontaneous and interesting; you want them to stick to the script and predictably accomplish their business goals.

15
Q

What is the difference between essential complexity and accidental complexity?

A

Essential complexity is the complexity inherent in a given situation that cannot be removed from a problem definition, whereas accidental complexity is more fluid and can be resolved with engineering effort.

16
Q

Jenkins

A

A continuous integration server written in Java. You can use it for testing and reporting changes in near real-time. It helps developers find and fix bugs in their code rapidly and automates the testing of their builds.

17
Q

Jenkins Features

A

– Free Open-Source Tool
– Integrates all your DevOps stages with the help of around 1000 plugins
– Lets you script a pipeline of one or more build jobs as a single workflow
– Easily start your Jenkins with its WAR file
– Provides multiple ways of communication: web-based GUI, CLI and REST API

18
Q

What is the purpose of configuration management (CM)?

A

Ensure the integrity of a product or system throughout its life-cycle by making the development or deployment process controllable and repeatable, therefore creating a higher quality product or system.

19
Q

The CM process allows orderly management of system information and system changes for purposes such as:

A
– Revise capability
– Improve performance, reliability, or maintainability
– Extend life
– Reduce cost
– Reduce risk and liability
– Correct defects
20
Q

Chef

A

A configuration management technology used to automate infrastructure provisioning (the prerequisite steps of managing access to data and resources and making systems available to users).

An automation tool that provides a way to define infrastructure as code.

21
Q

Infrastructure as code (IaC)

A

Managing infrastructure by writing code (automating infrastructure) rather than using manual processes.

Programmable infrastructure
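A minimal Python sketch of the idea behind IaC tools: the desired state is declared as data, and code computes and applies only the differences. This illustrates the concept and is not how Chef or Ansible are implemented; the `plan` and `converge` functions and the setting names are hypothetical.

```python
def plan(desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Settings that must change for `actual` to match `desired`."""
    return {key: value for key, value in desired.items()
            if actual.get(key) != value}


def converge(desired: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Apply the plan; running it again on the result changes nothing."""
    return {**actual, **plan(desired, actual)}


# Hypothetical desired state kept in version control, and a drifted host.
desired = {"nginx": "installed", "port": "443"}
actual = {"nginx": "absent", "timezone": "UTC"}

converged = converge(desired, actual)
print(plan(desired, actual))     # {'nginx': 'installed', 'port': '443'}
print(plan(desired, converged))  # {}
```

The empty second plan shows the repeatability that manual processes lack: applying the same definition twice leaves the system unchanged.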

22
Q

The types of automation done by Chef, irrespective of the size of infrastructure are:

A
  • Infrastructure configuration
  • Application deployment
  • Configurations that are managed across your network
23
Q

Chef Features

A

– Open-source configuration management tool
– Supports multiple platforms like AIX, RHEL/CentOS, FreeBSD
– Easy to integrate with cloud-based platforms
– Active, smart and fast-growing community support

24
Q

Companies using Chef

A

Mozilla, Expedia, Walt Disney, HP, Facebook, and Rackspace

25
Q

Advantages of Chef

A
  • Fully automated deployment
  • Within a minute you can configure thousands of nodes.
  • Integrate with a cloud-based platform like AWS.
  • Chef keeps the system under consistent check.
  • You can record the entire infrastructure in the form of a Chef repository.
  • Chef plays a vital role in the DevOps software lifecycle.
26
Q

Disadvantages of Chef

A
  • Documentation is lacking.
  • Chef requires coding knowledge to script the tool, which makes it complicated.
  • The master node can only be configured on Linux/Unix platforms.
27
Q

What are the benefits of Chef?

A

Expedites Software Distribution

Enhances Service Flexibility

Remodels Risk Management

Collates your Infrastructure

28
Q

Ansible

A

An open-source tool that provides one of the simplest ways to automate your apps and IT infrastructures such as network configuration, cloud deployments, and creation of development environments.

29
Q

Ansible Features

A

– Open source configuration management tool
– Supports push configuration
– Based on master-slave architecture
– Completely agentless and uses simple syntax written in YAML

30
Q

Companies that use Ansible

A

CapitalOne, NASA, and Viasat

31
Q

Companies that use Jenkins

A

Pentaho

OpenStack

AngularJS

Capgemini

Luxoft

LinkedIn

32
Q

What does Ansible aim to do?

A

It aims to provide large productivity gains to a wide variety of automation challenges.

33
Q

Why do we need Ansible?

A

As data centers grew and hosted applications became more complex, administrators realized that managing systems by hand could not scale as fast as the applications they supported. Manual management also slowed developers down: development teams were agile and released software frequently, while IT operations spent more and more time configuring systems.

34
Q

What problem does Ansible solve?

A

Managing many machines means constantly updating them, pushing changes, copying files to them, and so on. Done by hand, these tasks are complicated and time-consuming.

35
Q

Nagios

A

A powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes.

36
Q

What does Nagios do?

A

It continuously checks crucial applications, server resources, networks, and tasks.

It monitors memory usage, disk usage, microprocessor load, the number of currently running processes, and log files.

It can also check other services such as Post Office Protocol 3 (POP3), Simple Mail Transfer Protocol (SMTP), HTTP, and other standard network protocols.

37
Q

The active and important checks are triggered by ______, whereas the other secondary checks are triggered by ______ applications linked to the monitoring tool.

A

Nagios

External

38
Q

Nagios Architecture

A
  1. The scheduler is a component of the server part of Nagios. It sends a signal to execute the plugins at the remote host.
  2. The plugin gets the status from the remote host.
  3. The plugin sends the data to the process scheduler.
  4. The process scheduler updates the GUI, and notifications are sent to admins.
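The plugin step can be sketched in Python. Nagios plugins report status through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of text; the `check_disk` function and its thresholds below are hypothetical, and a real plugin would measure the disk itself and exit with the returned code.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_disk(used_fraction: float, warn: float = 0.8, crit: float = 0.9):
    """Map disk usage to a (status code, one-line message) pair."""
    pct = f"{used_fraction:.0%}"
    if used_fraction >= crit:
        return CRITICAL, f"DISK CRITICAL - {pct} used"
    if used_fraction >= warn:
        return WARNING, f"DISK WARNING - {pct} used"
    return OK, f"DISK OK - {pct} used"


code, message = check_disk(0.93)
print(code, message)  # 2 DISK CRITICAL - 93% used
```

The scheduler reads the code and message, then updates the GUI and sends notifications as described in step 4.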
39
Q

What three separate products does Nagios consist of?

A

Nagios XI

Nagios Log Server

Nagios Fusion

40
Q

What do continuous monitoring tools help resolve?

A

Any system errors (low memory, unreachable server etc.) before they have any negative impact on your business productivity.

41
Q

What are important reasons to use a monitoring tool?

A

– It detects any network or server problems

– It determines the root cause of any issues

– It maintains the security and availability of the service

– It monitors and troubleshoots server performance issues

– It allows us to plan for infrastructure upgrades before outdated systems cause failures

– It can respond to issues at the first sign of a problem

– It can be used to automatically fix problems when they are detected

– It ensures IT infrastructure outages have a minimal effect on your organization’s bottom line

– It can monitor your entire infrastructure and business processes

42
Q

When does continuous monitoring come into the picture?

A

Once the application is deployed on the production servers.

43
Q

What is continuous monitoring about?

A

The ability of an organization to detect, report, respond to, contain, and mitigate attacks that occur in its infrastructure.