Indexing Flashcards by Frank Nebel

What is the system recommendation for the reference indexer?

12 cores, 2+ GHz, 12 GB RAM, 800 IOPS

What is the system recommendation for the high-end indexer?

48 cores, 2+ GHz, 128 GB RAM, 1200 IOPS

Give an example of how 800 IOPS can be reached.

By using eight x-GB, 15,000 RPM, serial-attached SCSI (SAS) HDs in a Redundant Array of Independent Disks (RAID) 1+0 fault tolerance scheme as the disk subsystem.

Each hard drive is capable of about 200 average IOPS. The combined array produces a little over 800 average IOPS.
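
As a rough check of the arithmetic above, the following Python sketch estimates the write IOPS of such an array. It assumes the commonly cited RAID 1+0 write penalty of 2 (every logical write lands on two mirrored disks); the function name and the penalty parameter are illustrative and do not come from the card.

```python
def raid10_write_iops(disks: int, iops_per_disk: float, write_penalty: int = 2) -> float:
    """Estimate sustained write IOPS of a RAID 1+0 array.

    Assumption: each logical write costs `write_penalty` physical writes
    (2 for mirrored pairs), so usable write IOPS is raw IOPS / penalty.
    """
    raw_iops = disks * iops_per_disk
    return raw_iops / write_penalty


print(raid10_write_iops(8, 200))  # 800.0
```

Real arrays deviate from this estimate depending on controller cache and read/write mix, which is why the card says "a little over 800".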

What is a reliable method to measure IOPS on a disk subsystem?

bonnie++ or FIO

If you need to measure IOPS at a customer site with shared storage, what do you need to consider?

Run the test on all indexers at the same time to get reliable results.

List the index artifacts and where they are located.

The index artifacts are stored under $SPLUNK_DB, which defaults to $SPLUNK_HOME/var/lib/splunk.

Data is stored in buckets. One index can contain several buckets.

There are different types of buckets (hot, warm, cold).

By default, frozen buckets are deleted. You can configure the index to archive them instead.

Can hot and warm buckets be separated on disk?

No, they live under the same directory.

The path of hot and warm buckets is configured with homePath; their combined maximum size can be limited with homePath.maxDataSizeMB.

Can hot/warm buckets be separated from cold? If so, what would be a common use case?

Yes, they can be separated.

A common use case is different underlying storage systems, e.g. hot/warm buckets on high-performance storage and cold buckets on slower storage.

What is the default time until data in an index gets frozen?

~6 years
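
The "~6 years" figure can be sanity-checked with a short conversion. This sketch assumes the stock indexes.conf default of frozenTimePeriodInSeconds = 188697600; that value is not stated on the card, so verify it against your Splunk version.

```python
SECONDS_PER_DAY = 86_400

# Assumed default of frozenTimePeriodInSeconds in indexes.conf.
DEFAULT_FROZEN_SECONDS = 188_697_600

days = DEFAULT_FROZEN_SECONDS / SECONDS_PER_DAY
years = days / 365

print(days)             # 2184.0
print(round(years, 2))  # 5.98
```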

What is the rolling behavior of maxDataSize?

Hot to warm

What is the rolling behavior of maxWarmDBCount?

Warm to cold

How do you configure maximum size for cold storage?

coldPath.maxDataSizeMB

What is the default setting for maxTotalDataSizeMB?

500000 MB [~500GB]

What is the rolling behavior of maxTotalDataSizeMB?

Cold to frozen [based on size]

How do you configure maximum size of an index?

maxTotalDataSizeMB

What is the rolling behavior for frozenTimePeriodInSeconds?

Cold to frozen [based on time]
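
The rolling settings covered in the preceding cards can be summarized as a small lookup table. This is a study aid rather than Splunk code; the dictionary and helper names are made up for illustration, and the transitions are only those stated on the cards.

```python
# Setting name -> (from stage, to stage, trigger), as stated on the cards.
ROLL_TRIGGERS = {
    "maxDataSize": ("hot", "warm", "bucket size"),
    "maxWarmDBCount": ("warm", "cold", "number of warm buckets"),
    "maxTotalDataSizeMB": ("cold", "frozen", "total index size"),
    "frozenTimePeriodInSeconds": ("cold", "frozen", "age of data"),
}


def settings_for_transition(src: str, dst: str) -> list[str]:
    """Return the settings that can roll a bucket from `src` to `dst`."""
    return sorted(
        name for name, (a, b, _) in ROLL_TRIGGERS.items() if (a, b) == (src, dst)
    )


print(settings_for_transition("cold", "frozen"))
# ['frozenTimePeriodInSeconds', 'maxTotalDataSizeMB']
```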

What is the default setting for maxHotBuckets?

What 3 bucket controls are settable in the GUI?

1) maxTotalDataSizeMB
2) maxDataSize
3) timePeriodInSecsBeforeTsidxReduction

What is the default setting for maxDataSize?

auto (sets the size of hot buckets to 750MB)

You should use “auto_high_volume” for high-volume indexes (such as a firewall index); otherwise, use “auto”. A “high-volume index” is typically one that receives more than 10 GB of data per day.
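
The rule of thumb on this card can be written as a small helper. This is only a sketch: the function name is hypothetical, and the 10 GB/day threshold is the guideline stated above, not a hard limit.

```python
def suggest_max_data_size(daily_ingest_gb: float,
                          high_volume_threshold_gb: float = 10.0) -> str:
    """Suggest a maxDataSize value for an index from its daily ingest volume."""
    if daily_ingest_gb > high_volume_threshold_gb:
        return "auto_high_volume"
    return "auto"


print(suggest_max_data_size(2.5))   # auto
print(suggest_max_data_size(40.0))  # auto_high_volume
```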

How do you configure the maximum number of warm buckets?

maxWarmDBCount

What is maxTotalDataSizeMB applied to?

To both homePath and coldPath combined.

Why can using volumes be a good approach?

Using volumes helps to prevent mistakes in index size calculation.

Several indexes can be assigned to one volume, and the volume has a maximum size limit. This prevents the indexes from growing until the disk is full and keeps the total index size under control.

Hot/warm and cold buckets can be on different volumes.

Which bucket type needs to have a volume definition to work?

tstatsHomePath (Accelerated Data Models)

What is the difference between a pipeline and a processor?

Pipeline: A thread. Splunk creates a thread for each pipeline. Multiple pipelines run in parallel.
Processor: A processing step that runs inside a pipeline.

How does the word 'queue' fit into the picture of pipelines and processors?

Each pipeline has a separate queue where the data 'waits' to be processed, similar to a mail or printer queue. It is a memory space between pipelines that stores data.

In which pipeline is the UTF-8 processor located?

Parsing Pipeline

In which pipeline is the annotator processor located?

Typing Pipeline

In which pipeline is the indexandforward processor located?

Indexing Pipeline

What happens when an event enters the linebreaker processor?

Splits data stream into events based on the linebreaker configuration

Which processor anonymizes sensitive data?

regexreplacement processor in the typing pipeline

Which pipelines use props.conf?

All of them use props.conf, except the indexing pipeline.

Which pipeline uses outputs.conf?

Indexing pipeline

Which pipeline and which processor perform the timestamp extraction?

Merging pipeline, aggregator processor

If the 'great eight' have been configured, which pipelines will have a significantly lower load?

Parsing pipeline and merging pipeline

What is the average percentage of compressed data itself (journal.gz) residing in buckets?

15%

What is the average size in percent of the lexicon (TSIDX) files which reside in a bucket?

35%

What is the average compression for all indexed data?

50% of the original raw data size, consisting of 15% for journal.gz and 35% for TSIDX files
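
These average ratios are often used for a first sizing estimate. The sketch below assumes the 15% and 35% figures from the cards and ignores replication, acceleration summaries, and any other overhead; the function and parameter names are illustrative.

```python
def estimate_index_disk_gb(daily_raw_gb: float, retention_days: int,
                           journal_ratio: float = 0.15,
                           tsidx_ratio: float = 0.35) -> float:
    """Estimate on-disk index size from raw ingest using average ratios."""
    per_day_gb = daily_raw_gb * (journal_ratio + tsidx_ratio)
    return per_day_gb * retention_days


# 100 GB/day of raw data kept for 90 days.
print(round(estimate_index_disk_gb(100, 90)))  # 4500
```

Actual ratios vary with the data type, so measured values from a test index are a better basis than these averages.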

What kind of data lives under the tstatsHomePath?

Accelerated Data Models

What kind of data lives under the summaryHomePath?

Accelerated Reports

Which pipeline forwards data to the nullQueue?

Typing pipeline

What are thawed buckets?

Data restored from an archive. If you archive frozen data, you can later return it to the index by thawing it.

Indexing Flashcards

(41 cards)