Indexing Flashcards

1
Q

What is the system recommendation for the reference indexer?

A

12 cores, 2+ GHz, 12 GB RAM, 800 IOPS

2
Q

What is the system recommendation for the high-end indexer?

A

48 cores, 2+ GHz, 128 GB RAM, 1200 IOPS

3
Q

Give an example of how 800 IOPS can be reached.

A

By using eight x-GB, 15,000 RPM, serial-attached SCSI (SAS) HDs in a Redundant Array of Independent Disks (RAID) 1+0 fault tolerance scheme as the disk subsystem.

Each hard drive is capable of about 200 average IOPS. The combined array produces a little over 800 average IOPS.
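
The arithmetic behind this figure can be sketched as follows. The write penalty of 2 for RAID 1+0 is a common rule of thumb (every write lands on both members of a mirror) and is an assumption here, not something the card states; the function name is hypothetical.

```python
def raid10_iops(drives: int, iops_per_drive: int) -> dict:
    """Rule-of-thumb IOPS estimate for a RAID 1+0 array."""
    raw = drives * iops_per_drive  # every spindle can serve reads
    write_penalty = 2              # assumption: each write hits both mirror members
    return {"read": raw, "write": raw // write_penalty}


estimate = raid10_iops(drives=8, iops_per_drive=200)
print(estimate)
```

Indexing is write-heavy, so the write estimate is the one that lines up with the roughly 800 IOPS quoted above.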

4
Q

What is a reliable method to measure IOPS on a disk subsystem?

A

bonnie++ or FIO

5
Q

If you need to measure IOPS at a customer site with shared storage, what do you need to consider?

A

Run the test on all indexers at the same time to get reliable results.

6
Q

List the index artifacts and where they are located.

A

The indexing artifacts are stored under $SPLUNK_DB (by default $SPLUNK_HOME/var/lib/splunk), with one directory per index.

Data is stored in buckets. One index can contain several buckets.

There are different types of buckets (hot, warm, cold).

By default, frozen buckets are deleted. You can configure Splunk to archive them instead.

7
Q

Can hot and warm buckets be separated on disk?

A

No, they live in the same directory.

The path of hot and warm buckets is configured with homePath; its maximum size can be limited with homePath.maxDataSizeMB.

8
Q

Can hot/warm buckets be separated from cold? If so, what would be a common use case?

A

Yes, they can be separated.

A common use case is different underlying storage systems, e.g. hot/warm buckets on high-performance storage and cold buckets on slower storage.

9
Q

What is the default time until data in an index gets frozen?

A

~6 years
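
The "~6 years" figure comes from the default value of frozenTimePeriodInSeconds, which is 188697600 seconds. A quick conversion, as a minimal sketch (the constant names are only illustrative):

```python
DEFAULT_FROZEN_TIME_PERIOD_SECS = 188_697_600  # default frozenTimePeriodInSeconds
SECONDS_PER_DAY = 24 * 60 * 60

days = DEFAULT_FROZEN_TIME_PERIOD_SECS / SECONDS_PER_DAY
years = days / 365  # ignoring leap years

print(f"{days:.0f} days, about {years:.2f} years")
```

This works out to 2184 days, a little under six calendar years.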

10
Q

What is the rolling behavior for maxDataSize?

A

Hot to warm

11
Q

What is the rolling behavior of maxWarmDBCount?

A

Warm to cold

12
Q

How do you configure the maximum size for cold storage?

A

coldPath.maxDataSizeMB

13
Q

What is the default setting for maxTotalDataSizeMB?

A

500000 MB [~500GB]

14
Q

What is the rolling behavior of maxTotalDataSizeMB?

A

Cold to frozen [based on size]

15
Q

How do you configure the maximum size of an index?

A

maxTotalDataSizeMB

16
Q

What is the rolling behavior for frozenTimePeriodInSeconds?

A

Cold to frozen [based on time]
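
The two cold-to-frozen triggers (age via frozenTimePeriodInSeconds, size via maxTotalDataSizeMB) can be modelled roughly as below. This is a simplified sketch, not Splunk's actual implementation: the Bucket class, the function name and the sample buckets are all hypothetical, and only the two default values are taken from the cards.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    name: str
    size_mb: int
    latest_event_age_secs: int  # age of the newest event in the bucket


def buckets_to_freeze(buckets, max_total_mb=500_000, frozen_secs=188_697_600):
    """Return the names of buckets that would roll to frozen in this toy model."""
    # Time-based: a bucket freezes once its newest event is older than the limit.
    frozen = [b for b in buckets if b.latest_event_age_secs > frozen_secs]
    remaining = [b for b in buckets if b.latest_event_age_secs <= frozen_secs]

    # Size-based: freeze the oldest remaining buckets until the index fits.
    remaining.sort(key=lambda b: b.latest_event_age_secs, reverse=True)
    total = sum(b.size_mb for b in remaining)
    while remaining and total > max_total_mb:
        oldest = remaining.pop(0)
        total -= oldest.size_mb
        frozen.append(oldest)
    return [b.name for b in frozen]


sample = [
    Bucket("A", 300_000, 200_000_000),  # older than the time limit
    Bucket("B", 300_000, 100),
    Bucket("C", 300_000, 50),
]
to_freeze = buckets_to_freeze(sample)
print(to_freeze)
```

In this sample, bucket A is frozen because of its age, and bucket B is frozen because B and C together still exceed the size limit.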

17
Q

What is the default setting for maxHotBuckets?

A

3

18
Q

What 3 bucket controls are settable in the GUI?

A

1) maxTotalDataSizeMB
2) maxDataSize
3) timePeriodInSecBeforeTsidxReduction

19
Q

What is the default setting for maxDataSize?

A

auto (sets the size of hot buckets to 750MB)

You should use “auto_high_volume” for high-volume indexes (such as a firewall index); otherwise, use “auto”. A “high-volume index” is typically one that receives more than 10 GB of data per day.
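
The 10 GB per day rule of thumb can be written down as a small helper. This is only a sketch; the function name, index names and daily volumes are made up for illustration.

```python
HIGH_VOLUME_THRESHOLD_GB = 10  # rule-of-thumb threshold from the card above


def suggest_max_data_size(daily_volume_gb: float) -> str:
    """Return the maxDataSize keyword suggested by the 10 GB/day rule of thumb."""
    if daily_volume_gb > HIGH_VOLUME_THRESHOLD_GB:
        return "auto_high_volume"
    return "auto"


# Hypothetical indexes with their approximate daily ingest in GB.
for index_name, gb_per_day in [("firewall", 40), ("web", 3)]:
    print(index_name, suggest_max_data_size(gb_per_day))
```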

20
Q

How do you configure the maximum number of warm buckets?

A

maxWarmDBCount

21
Q

What is maxTotalDataSizeMB applied to?

A

To both homePath and coldPath.

22
Q

Why can using volumes be a good approach?

A

Using volumes helps to prevent errors in index size calculation.

A volume lets you assign several indexes to one storage location that has a maximum size limit. This keeps the combined index size under control and prevents the indexes from growing until the disk is full.

Hot/warm and cold buckets can be on different volumes.

23
Q

Which bucket type needs to have a volume definition to work?

A

tstatsHomePath (Accelerated Data Models)

24
Q

What is the difference between a pipeline and a processor?

A
  • Pipeline: a thread. Splunk creates a thread for each pipeline, and multiple pipelines run in parallel.
  • Processor: a processing step that runs inside a pipeline.

25
Q

How does the word ‘queue’ fit into the picture of pipelines and processors?

A

Each pipeline has a separate queue where the data ‘waits’ to be processed, similar to a mail or printer queue. It is a memory space between pipelines that stores data.
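
The relationship between pipelines (threads), processors (steps inside a pipeline) and queues (buffers between pipelines) can be illustrated with a toy model. This is a sketch of the general pattern only; the processors, queue names and sample events are invented and do not correspond to Splunk's real processors.

```python
import queue
import threading

SENTINEL = None  # marks the end of the data stream


def run_pipeline(processors, inbox, outbox):
    """One pipeline = one thread that applies its processors in order."""
    while True:
        item = inbox.get()        # data waits in the queue until the pipeline is free
        if item is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal downstream
            break
        for processor in processors:
            item = processor(item)
        outbox.put(item)


def add_tag(event):
    """Toy processor that marks an event as handled by the second pipeline."""
    return f"[typed] {event}"


parsing_q, typing_q, done_q = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=run_pipeline, args=([str.strip, str.upper], parsing_q, typing_q)),
    threading.Thread(target=run_pipeline, args=([add_tag], typing_q, done_q)),
]
for t in threads:
    t.start()

for raw in ["  first event ", " second event  "]:
    parsing_q.put(raw)
parsing_q.put(SENTINEL)

for t in threads:
    t.join()

results = []
while True:
    item = done_q.get()
    if item is SENTINEL:
        break
    results.append(item)
print(results)
```

If the second pipeline is slow, events simply accumulate in typing_q, which is the behavior the queue analogy describes.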

26
Q

In which pipeline is the UTF-8 processor located?

A

Parsing Pipeline

27
Q

In which pipeline is the annotator processor located?

A

Typing Pipeline

28
Q

In which pipeline is the indexandforward processor located?

A

Indexing Pipeline

29
Q

What happens when an event enters the linebreaker processor?

A

It splits the data stream into events based on the linebreaker configuration.

30
Q

Which processor anonymizes sensitive data?

A

regexreplacement processor in the typing pipeline

31
Q

Which pipelines use props.conf?

A

All of them use props.conf, except the indexing pipeline.

32
Q

Which pipeline uses outputs.conf?

A

Indexing pipeline

33
Q

Which pipeline and which processor perform timestamp extraction?

A

Merging pipeline, aggregator processor

34
Q

If the ‘great eight’ have been configured, which pipelines will have a significantly lower load?

A

Parsing pipeline and merging pipeline

35
Q

On average, how large is the compressed raw data (journal.gz) in a bucket, as a percentage of the original data?

A

15%

36
Q

On average, how large are the lexicon (TSIDX) files in a bucket, as a percentage of the original data?

A

35%

37
Q

What is the average compression for all indexed data?

A

50%, which consists of 15% journal.gz and 35% TSIDX
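
These ratios can be turned into a rough disk-sizing estimate. The sketch below assumes the 15% and 35% averages hold for the data in question, which varies in practice; the function name, daily volume and retention period are hypothetical.

```python
JOURNAL_RATIO = 0.15  # compressed raw data (journal.gz), relative to original size
TSIDX_RATIO = 0.35    # index (TSIDX) files, relative to original size


def estimate_disk_usage_gb(raw_gb_per_day: float, retention_days: int) -> dict:
    """Estimate on-disk size from the original data volume and retention."""
    raw_total = raw_gb_per_day * retention_days
    journal = raw_total * JOURNAL_RATIO
    tsidx = raw_total * TSIDX_RATIO
    return {"journal_gb": journal, "tsidx_gb": tsidx, "total_gb": journal + tsidx}


usage = estimate_disk_usage_gb(raw_gb_per_day=100, retention_days=90)
print({name: round(value) for name, value in usage.items()})
```

For 100 GB per day kept for 90 days, the estimate is about half of the 9000 GB of original data.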

38
Q

What kind of data lives under the tstatsHomePath?

A

Accelerated Data Models

39
Q

What kind of data lives under the summaryHomePath?

A

Accelerated Reports

40
Q

Which pipeline forwards data to the nullQueue?

A

Typing pipeline

41
Q

What are thawed buckets?

A

Data restored from an archive. If you archive frozen data, you can later return it to the index by thawing it.