Section 5.1 Flashcards

1
Q

What does the Indexing Layer do?

A

Allows you to clean up data.

Allows you to refine data.

Allows you to store data.

2
Q

What is Index clustering?

A

When multiple indexers are connected so that copies of the indexers' buckets (data) are replicated across them.

3
Q

Where is data stored?

A

In indexes on the indexer that have buckets.

4
Q

What is automatic failover?

A

Failover relies on the replicated copies of the data. If one indexer fails, the others will pick up the slack and maintain continuity.

5
Q

High availability means…

A

Data is highly available for searching.

6
Q

Index Clustering in summary means

A

Data is protected from sudden loss

More copies are available for users who are actively searching

Indexer activities will continue in the event an indexer goes down

7
Q

Replication Factor determines

A

How many copies are maintained within an indexer cluster.

Default RF is 3

Maximum RF is determined by the number of indexers you have or nodes.

8
Q

Search Factor determines

A

How many of these copies are immediately searchable.

Default SF is 2

9
Q

In a clustering environment you need a minimum of ____ Indexers

A

3

10
Q

Most important fact about a Search Factor (SF)

A

The Search Factor can never be more than the Replication Factor.
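The two limits stated on these cards (SF can never exceed RF, and RF is capped by the number of indexers) can be checked with a short Python sketch. The function name validate_factors and its arguments are made up for illustration; this is not a Splunk API, only a way to see the rules side by side.

```python
def validate_factors(replication_factor, search_factor, peer_count):
    """Return a list of problems with an RF/SF choice; an empty list means it looks valid."""
    problems = []
    if replication_factor < 1:
        problems.append("RF must be at least 1")
    if search_factor < 1:
        problems.append("SF must be at least 1")
    if search_factor > replication_factor:
        # The rule from the card: SF can never be more than RF.
        problems.append("SF cannot be greater than RF")
    if replication_factor > peer_count:
        # The maximum RF is limited by how many indexers (peers) exist.
        problems.append("RF cannot be greater than the number of peers")
    return problems


# Defaults from the cards: RF 3, SF 2, on a three-indexer cluster.
print(validate_factors(3, 2, 3))
# An invalid choice: more searchable copies than copies.
print(validate_factors(2, 3, 3))
```

The first call prints an empty list; the second reports that SF is greater than RF.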

11
Q

Explain RF & SF

A

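The Replication Factor (RF) tells us how many copies of the data the cluster keeps. The Search Factor (SF) tells us how many of those copies are kept searchable, so they are highly available in case something happens to the first copy. With the defaults (RF 3, SF 2), two copies are searchable; if both of those go down, the third copy is usually stored at an offsite location.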

12
Q

When does the Cluster Master come in?

A

The Cluster Master comes into play when we start copying our data (when the environment becomes clustered).

13
Q

Cluster Master Manages what layer?

A

It manages the indexing layer.

14
Q

What is the Cluster Master?

A

A centralized configuration manager whose job is to manage the indexer cluster.

15
Q

Once the environment becomes clustered, the Deployment Server….

A

Only manages the forwarders.

16
Q

What does a Cluster Master do?

A

Manages cluster activities (adding peers, distributing configurations, determining the number of copies to maintain).

Keeps track of peers, their buckets, and configs.

Tells the search head where to request data.

17
Q

What are Peers (Cluster Peer)?

A

Peers are Indexers

18
Q

What do Peer Nodes do?

A

Peers receive and index incoming data (typically from forwarders)

Replicate data to other peers

Respond to incoming searches by supplying search results

19
Q

A clustered architecture is called ..

A

A distributed search

20
Q

Clustering is Smart because it provides….

A

Data Availability
Data Fidelity
Data Resiliency
Disaster Recovery
Search Affinity

21
Q

Multi-site clustering =

A

Storing copies of your data at a different site

22
Q

Data fidelity =

A

The act of not losing data; reliability

23
Q

Benefits of Clustering =

A

1.Data Availability & fast recovery
2.Easier overall administration
3.Scalability of indexing
4.No additional cost for data replication

24
Q

Cons of clustering =

A

1.Increased storage requirements
2.Increased processing load
3.Requires additional Splunk instances
4.Indexers require the same OS and versions

25
Q

When you enable a search head in cluster environment you must specify what?

A

Cluster settings (i.e. Master Node) and the port on which it receives data.

26
Q

Transforms.conf=

A

specify transformations and lookups that can then be applied to any event

27
Q

Which directory on the CM holds the apps it sends to its peers?

A

$SPLUNK_HOME/etc/master-apps

28
Q

Where do bundles reside for cluster peer?

A

$SPLUNK_HOME/etc/slave-apps

29
Q

$SPLUNK_HOME/etc/slave-apps =

A

where you will always find pushed configuration files (sent from CM to indexer)

30
Q

Config changes that require restart?

A

A. Changes to indexes.conf or inputs.conf
B. Home path changes in indexes.conf
C. Deleting an existing app

31
Q

Configuration changes that do not need a restart ?

A

Adding a new index or new app with reloadable configs

Changes or additions to transforms.conf or props.conf

32
Q

Tell me about your environment

A

In my environment we have a current quota of about 50TB, and we are currently ingesting about 49TB per day with 600 users. We have about 290 indexers, with close to 32,000 forwarders and about 12 search heads.

33
Q

Your environment has too many forwarders to manage one at a time. What Splunk instance would you install, and how would you configure it to manage all the forwarders?

A

Use a Deployment Server: put the forwarders in a server class and create deployment apps to configure all of them.

34
Q

In your deployment app you are configuring inputs.conf to bring in new data. You then search from the search head and cannot find the data. What happened?

A

-Didn’t send the deployment apps to the correct server class
-Mistake in the monitoring stanza
-Did not specify the right index
-Clients in the server class have not phoned home
-Monitoring is not turned on; turn it on (BEST ANSWER)
-Splunk does not have permission to read the source file

35
Q

In which directory of the deployment app must you place your inputs.conf file?

A

local directory

36
Q

indexer uses what port

A

9997

37
Q

fishbucket index importance

A

Allows you to see how far into a file indexing has progressed. This helps to avoid duplicates and comes in handy after a server shutdown or connection errors.
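The idea of remembering how far into a file reading has progressed can be sketched in Python. This is a simplified illustration, not Splunk's implementation: the Checkpoints class, its read_new method, and the file name app.log are all hypothetical, and the checkpoint here is just a byte offset kept in memory.

```python
import os
import tempfile


class Checkpoints:
    """Remembers, per file, how many bytes have already been read."""

    def __init__(self):
        self.offsets = {}  # path -> byte offset already processed

    def read_new(self, path):
        """Return only the lines appended since the last call for this path."""
        start = self.offsets.get(path, 0)
        with open(path, "rb") as handle:
            handle.seek(start)                  # skip what was already read
            data = handle.read()
            self.offsets[path] = handle.tell()  # save the new checkpoint
        return data.decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    tracker = Checkpoints()

    with open(log_path, "w") as f:
        f.write("event 1\nevent 2\n")
    first = tracker.read_new(log_path)

    # Simulate more data arriving after an interruption.
    with open(log_path, "a") as f:
        f.write("event 3\n")
    second = tracker.read_new(log_path)

print(first)   # the two original events
print(second)  # only the newly appended event, so nothing is duplicated
```

Because the offset is saved after every read, the second pass picks up where the first one stopped instead of re-reading the whole file.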

38
Q

advantages of indexer clustering

A

1.Data Availability & fast recovery
2.Easier overall administration
3.Scalability of indexing
4.No additional cost for data replication

A. Data Availability = how often your data is available to be utilized.

B. Data Fidelity = the act of not losing data.

C. Data Reliability = refers to the accuracy, consistency, and dependability of the data being ingested, indexed, and queried within the platform.

D. Data Resiliency = platform’s ability to maintain data availability, integrity, and accessibility even in the face of unexpected failures.

E. Disaster Recovery = set of processes and strategies put in place to ensure availability and continuity of Splunk services and data.

F. Search Affinity = search local sites; mechanism for intelligently routing and distributing search jobs across a distributed Splunk environment.

39
Q

explain data availability

A

how often your data is available to be utilized

40
Q

who manages all indexes in cluster environment? Explain

A

Cluster Master/Master Node

41
Q

how would you configure hot bucket to roll over by time

A

maxHotSpanSecs (set in indexes.conf)

42
Q

default port used for replication

A

8080 is the replication port. 8089 is the management port, used between a configuration manager and ANY client it is managing (DS to its clients, CM to its indexers). 9997 is the data (receiving) port.

43
Q

what is metadata and what does it contain?

A

Metadata is like a bar code: it tells you where an event is coming from. It contains the host (for example an IP address), the source (log path), and the sourcetype (format of the data).

44
Q

What is source

A

name of the file, stream, or other input from which the event originates

45
Q

give examples of sourcetypes you worked with

A

json and syslog or CSV

46
Q

what is the largest sourcetype you have worked with?

A

syslog is network data and large

47
Q

high availability

A

-High availability=when we are replicating data within our indexers
-Multiple copies available for searching
-Data gets into our indexers in round robin fashion

48
Q

distributed search?

A

key feature that allows you to search and analyze data across multiple Splunk instances or indexers in a distributed Splunk deployment. This is especially useful in large-scale environments where the volume of data to be searched and analyzed exceeds the capacity of a single Splunk instance.

49
Q

how replicated buckets are stored in indexers

A

1.Once the data comes to the indexers, it is distributed in round-robin fashion.
2.The data is written on the receiving indexer.
3.The replication process then moves from indexer to indexer, trying to find a healthy one to store a copy of that specific data.

50
Q

how does forwarder distribute data among indexers without replication (regular data)

A

round robin fashion
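A minimal Python sketch of the round-robin idea, assuming a plain list of events and indexer names. The function distribute_round_robin and the names idx1 to idx3 are hypothetical. A real forwarder switches targets at intervals rather than per event, so this only shows the rotation.

```python
from itertools import cycle


def distribute_round_robin(events, indexers):
    """Assign each event to the next indexer in turn, wrapping around at the end."""
    if not indexers:
        raise ValueError("at least one indexer is required")
    assignment = {name: [] for name in indexers}
    targets = cycle(indexers)  # idx1, idx2, idx3, idx1, ...
    for event in events:
        assignment[next(targets)].append(event)
    return assignment


result = distribute_round_robin(
    ["e1", "e2", "e3", "e4", "e5"],
    ["idx1", "idx2", "idx3"],
)
print(result)
```

With five events and three indexers, idx1 and idx2 each receive two events and idx3 receives one.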

51
Q

reloading vs restarting DS

A

When you update the clients of the DS, reload the deployment server.

When you make updates to the DS itself, restart the DS.

52
Q

when increasing ingestion in cluster environment

A

add more indexers to the cluster

53
Q

some things to consider when going into a clustered environment

A

cost of more splunk instances

ingestion of data

storage requirements

processing requirements

54
Q

You notice that your newly monitored data is not in the index that you have configured it to be in. Where is data possibly being stored and how would you troubleshoot it?

A

Go to inputs.conf and validate that the ‘index’ setting is correct.

If the index setting is missing, the data will be in the default index, main.

55
Q

Where is fresh, recently indexed data stored in Splunk?

A

Hot bucket

56
Q

Under what circumstances would the data in the hotbucket stop writing?

A

If the hot bucket is too full or if there is a restart.
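The conditions from this card, together with the time-based roll from the maxHotSpanSecs card, can be put into a small Python sketch. The function should_roll_hot_bucket, its parameters, and the example limits are hypothetical values chosen for illustration; Splunk makes this decision internally from indexes.conf.

```python
def should_roll_hot_bucket(size_mb, span_secs, max_size_mb, max_span_secs, restarting=False):
    """Return the reasons (possibly none) why a hot bucket would stop taking writes."""
    reasons = []
    if size_mb >= max_size_mb:
        reasons.append("size limit reached")        # the bucket is too full
    if span_secs >= max_span_secs:
        reasons.append("time span limit reached")   # plays the role of maxHotSpanSecs
    if restarting:
        reasons.append("indexer restart")
    return reasons


# Illustrative limits only: 750 MB and one day (86400 seconds).
print(should_roll_hot_bucket(800, 3600, 750, 86400))
print(should_roll_hot_bucket(10, 10, 750, 86400))
```

The first call reports the size limit; the second returns an empty list, meaning the bucket keeps writing.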

57
Q

In order to have a Splunk search head, what would you need to download?

A

Splunk Enterprise

58
Q

Maximum number of concurrent users per search head

A

12

59
Q

What is maxHotBuckets?

A

The maximum number of hot buckets that can exist in an index

60
Q

Which default port is for replication?

A

8080 port

61
Q

What is the thawing process

A

Frozen data has to be thawed before it can be searched again.

Move the archived bucket into the index's thaweddb directory, rename it if needed to a name that Splunk recognizes, and rebuild it.

62
Q

What must happen before indexer can be part of a cluster?

A

Indexer must become cluster member

63
Q

Cluster Master/Master Node

A

You only need ONE

64
Q

Internal Index?

A

Used for troubleshooting; stores all Splunk components’ internal logs and processing metrics.

Search it (the _internal index) for log entries that say ERROR or WARN.

65
Q

Monitoring stanza in Windows vs Linux

A

Windows = [monitor://C:\app\log\data\catalina.out]

Linux = [monitor:///another/random/path]
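A short Python sketch that renders monitor stanzas like the two above. The helper monitor_stanza is hypothetical, and the index and sourcetype values (main, syslog) are only example choices. It also shows why the Linux form ends up with three slashes: the monitor:// prefix is followed by a path that itself starts with a slash.

```python
def monitor_stanza(path, index, sourcetype, disabled=False):
    """Render one inputs.conf monitor stanza as text."""
    lines = [
        f"[monitor://{path}]",
        f"index = {index}",
        f"sourcetype = {sourcetype}",
        f"disabled = {'true' if disabled else 'false'}",
    ]
    return "\n".join(lines)


# Windows path from the card (raw string so the backslashes are kept as-is).
print(monitor_stanza(r"C:\app\log\data\catalina.out", "main", "syslog"))
print()
# Linux path from the card; the leading slash produces monitor:///...
print(monitor_stanza("/another/random/path", "main", "syslog"))
```

Passing disabled=True writes disabled = true, which stops the input from sending logs.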

66
Q

Two types of files indexes consist of

A

raw data (the full log data, compressed) and index files (tsidx)

67
Q

To disable the monitor to stop sending logs

A

Go to the monitor stanza in inputs.conf and set disabled to true or 1

68
Q

Explain summary index

A

Summary indexing allows you to run fast searches over a large data set by scheduling Splunk to summarize the data and then import the summarized results into the summary index over time.

69
Q

When increasing ingestion of data by 2 TB what will you have to do? In clustered environment-how would you accommodate it.

A

Adding indexers to the cluster to accommodate growth.

70
Q

What directory are apps deployed to in a clustered environment

A

slave-apps filepath

71
Q

When will you use the management port (8089)?

A

When the CM is communicating with its clients or slaves