Splunk 103 Flashcards

1
Q

What is Indexer Clustering?

A

Clustering is where multiple indexers are connected in order to maintain multiple identical copies of data. Clusters feature “automatic failover”, which simply means that if one indexer fails, the others pick up the slack and maintain continuity of its activities.

This means:

  • Data is protected from sudden loss
  • More copies are available for users who are actively searching
  • The above activities will continue even when an indexer goes down
2
Q

What determines the number of copies kept within a cluster?

A

Replication Factor

3
Q

What is the Replication Factor?

A

This determines how many copies are maintained within an indexer cluster.
Default RF is 3.
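
As a rough illustration of what the replication factor buys you, the sketch below computes how many peers can fail before the last copy of the data is gone. The function name `tolerable_peer_failures` is made up for this example and is not part of Splunk.

```python
def tolerable_peer_failures(replication_factor: int) -> int:
    """Return how many peers can fail while at least one copy survives.

    With RF copies spread over RF different peers, losing RF - 1 of them
    still leaves one copy of the data.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


# With the default RF of 3, two peers can go down without data loss.
print(tolerable_peer_failures(3))  # 2
```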

4
Q

What is the Search Factor?

A

This determines how many copies in the cluster are immediately searchable.
Default SF is 2.
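
The replication factor and search factor constrain each other and the size of the cluster: the search factor cannot exceed the replication factor, and the cluster needs at least as many peers as the replication factor. A minimal sketch of that sanity check follows; the helper `validate_factors` is hypothetical, not a Splunk API.

```python
from typing import List


def validate_factors(replication_factor: int, search_factor: int, peer_count: int) -> List[str]:
    """Return a list of problems with a planned RF/SF/peer-count combination."""
    problems = []
    if search_factor > replication_factor:
        problems.append("search factor cannot exceed replication factor")
    if peer_count < replication_factor:
        problems.append("need at least as many peers as the replication factor")
    return problems


# Defaults from the cards (RF 3, SF 2) on a three-peer cluster: no problems.
print(validate_factors(3, 2, 3))  # []
```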

5
Q

What is the minimum number of indexers that you must have in a cluster?

A

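3 (with the default replication factor of 3; a cluster needs at least as many peers as its replication factor)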

6
Q

What are components of a cluster?

A
Cluster Peer (Peer Node) 
Cluster Master (Master Node)
7
Q

What are Cluster Peers?

A

Peers are the indexers that are in the cluster. They receive and index incoming data, and replicate it to other peers. They respond to incoming searches by supplying search results.

8
Q

What is Cluster Master?

A

It manages cluster activities (such as adding peers, distributing configurations, and determining the number of copies to maintain).
It keeps track of the peers, their buckets and configs, and tells search heads where to request data.

9
Q

How does distributed search work in the cluster?

A

The search head “asks” the Master Node which indexers it should search for the data it needs, and then it accesses those indexers.
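
The flow can be pictured with a toy model. This is only a sketch of the idea, not how Splunk is implemented: the `MasterNode` class, the `run_search` function, and the bucket and peer names are all invented for illustration, and it assumes the master names exactly one peer to answer for each bucket so that replicated copies are not counted twice.

```python
from typing import Dict, List


class MasterNode:
    """Toy stand-in for the master: knows which peer answers for each bucket."""

    def __init__(self, primary_peer_by_bucket: Dict[str, str]) -> None:
        self.primary_peer_by_bucket = primary_peer_by_bucket

    def search_targets(self) -> Dict[str, str]:
        # The search head asks this first, before contacting any peer.
        return dict(self.primary_peer_by_bucket)


def run_search(master: MasterNode, peers: Dict[str, Dict[str, List[str]]], term: str) -> List[str]:
    """Play the search head: ask the master where to look, then query those peers."""
    results: List[str] = []
    for bucket_id, peer_name in master.search_targets().items():
        events = peers[peer_name][bucket_id]
        results.extend(event for event in events if term in event)
    return results


# peer2 also holds a replicated copy of bucket b1, but the master points to peer1 for it.
peers = {
    "peer1": {"b1": ["error disk full", "login ok"]},
    "peer2": {"b2": ["error timeout"], "b1": ["error disk full", "login ok"]},
}
master = MasterNode({"b1": "peer1", "b2": "peer2"})
print(run_search(master, peers, "error"))  # ['error disk full', 'error timeout']
```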

10
Q

What are benefits of clustering?

A

Data availability and fast recovery
Easier overall administration:
- Coordinated indexer configuration management
- Automatic distributed search setup
- Elastic indexer discovery
- Indexer health dashboard on Cluster Master
Scalability of Indexing
No additional license cost for data replication
Data fidelity
Data Resiliency
Disaster Recovery
Search Affinity

11
Q

What are cons of clustering?

A

Increased storage requirements
Increased processing load (depends on RF and SF)
Requires additional Splunk instances:
Minimum: RF + CM + SH = # of instances required
Recommended: More than RF, and multiple SHs
Indexers require the same OS and versions
Requires cluster specific deployment management
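
The minimum-instance rule above (RF + CM + SH) is simple arithmetic. A small sketch, assuming one cluster master and one search head as the card's formula implies; the function name `minimum_instances` is hypothetical.

```python
def minimum_instances(replication_factor: int, cluster_masters: int = 1, search_heads: int = 1) -> int:
    """Minimum Splunk instances for a cluster: RF peers + CM + SH."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor + cluster_masters + search_heads


# Default RF of 3: three peers, one cluster master, one search head.
print(minimum_instances(3))  # 5
```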

12
Q

What are configuration bundles?

A

A set of configuration files and apps common to all peers.

13
Q

Where do configuration bundles reside on the cluster master and the cluster peers?

A

Cluster Master:
$SPLUNK_HOME/etc/master-apps

Cluster Peer:
$SPLUNK_HOME/etc/slave-apps

14
Q

What are some of the configuration changes that require a restart?

A

Changes to indexes.conf, inputs.conf
Changes to a home path in indexes.conf
Deleting an existing app

15
Q

What are some of the configuration changes that do not need a restart?

A

Adding a new index or a new app with reloadable configs

Changes or additions to transforms.conf or props.conf

16
Q

Through which port do peers communicate with each other?

A

Through the replication port: 8080

17
Q

Through which port do the master node, search heads, and peers communicate?

A

Through management port: 8089

18
Q

Through which port does forwarder push data to indexers?

A

Through port 9997 (the receiving port)
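
The three port cards can be kept together as one lookup table. This is just a memory aid in code form: the numbers are the ones given on the cards, every one of them can be changed in a real deployment, and the names `CLUSTER_PORTS` and `port_for` are invented for this example.

```python
# Port numbers as given on the cards; all of them are configurable per deployment.
CLUSTER_PORTS = {
    "replication": 8080,  # peer-to-peer bucket replication
    "management": 8089,   # master node, search heads, and peers
    "receiving": 9997,    # forwarders sending data to indexers
}


def port_for(traffic: str) -> int:
    """Look up the port used for a kind of cluster traffic."""
    try:
        return CLUSTER_PORTS[traffic]
    except KeyError:
        raise ValueError(f"unknown traffic type: {traffic!r}") from None


print(port_for("receiving"))  # 9997
```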

19
Q

To participate in the indexer cluster, all nodes, including the search head, must use the same…

A

pass4SymmKey
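
To make the "same key everywhere" point concrete, the sketch below renders a `[clustering]` stanza of `server.conf` for each node type with one shared secret. It assumes the older setting names that match this deck's terminology (`mode = master`, `slave`, `searchhead`, and `master_uri`); newer Splunk versions rename some of these, so check the `server.conf` spec for your version. The helper `clustering_stanza`, the host name, and the secret are placeholders made up for this example.

```python
def clustering_stanza(mode: str, secret: str, **settings: object) -> str:
    """Render a [clustering] stanza as text; every node gets the same pass4SymmKey."""
    lines = ["[clustering]", f"mode = {mode}", f"pass4SymmKey = {secret}"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


SHARED_SECRET = "changeme-example"           # placeholder, not a real key
MASTER_URI = "https://cm.example.com:8089"   # hypothetical host, management port

stanzas = {
    "master": clustering_stanza("master", SHARED_SECRET, replication_factor=3, search_factor=2),
    "peer": clustering_stanza("slave", SHARED_SECRET, master_uri=MASTER_URI),
    "search head": clustering_stanza("searchhead", SHARED_SECRET, master_uri=MASTER_URI),
}

for node, text in stanzas.items():
    print(f"# {node}\n{text}\n")
```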

20
Q

What is the best practice for setting up cluster master/deployment server architectures?

A

DS -> CM -> INDEXING CLUSTER

21
Q

What is search affinity?

A

The ability to configure a multisite indexer cluster so that each search head gets its search results from peer nodes on its local site only, as long as the site is valid. Search affinity has the benefit of reducing network traffic while still providing access to the full set of data.

22
Q

What is a multisite indexer cluster?

A

An indexer cluster that spans multiple physical sites, such as data centers. Each site has its own set of peer nodes and search heads. Each site also obeys site-specific replication and search factor rules.

23
Q

Name the 5 pros of indexer clustering

A
  • Data Availability
  • Data Fidelity
  • Data Resiliency
  • Disaster Recovery
  • Search Affinity
24
Q

What is Data Availability in Splunk?

A

..

25
Q

What is Data Fidelity?

A

Data fidelity means that data transmitted from one sensor node to another retains its actual meaning and granularity.

26
Q

What is Data Resiliency?

A

Data Resiliency
The term “data resiliency” refers to data’s ability to “spring back” in situations where it is compromised. In the cloud, data is resilient because it can be stored in a number of different locations. No one location is better than another; availability simply improves the more places the data is stored, especially if a location goes down or the data becomes corrupted. Users have access to data as long as the location where it is stored is accessible and the data isn’t compromised. If one location goes down, users are directed to a second location. If all locations go down, the organization no longer has access to its data.

This is comparable to having keys to one’s house: the more keys you have, the less likely you are to get locked out. Hiding a key outside and keeping one on your key chain gives you higher resiliency. If you lose your keys or a key breaks, you can use the key hidden outside. If you lose all the keys to your house, you can’t get in.

Resiliency is the ability of a server, network, storage system, or an entire data center, to recover quickly and continue operating even when there has been an equipment failure, power outage or other disruption.