Advanced Searching and Reporting Flashcards

1
Q

What kinds of searches are prime candidates for optimization?

A

Searches that run often or that query large amounts of data

2
Q

What is stored in a journal.gz file and a .tsidx file on the buckets within indexers?

A

Compressed raw event data is stored in journal.gz

References to the raw events in the journal are stored in the .tsidx file

3
Q

What are the components of the .tsidx file?

A

Lexicon: the unique terms from the event data
Posting list: provides references into the values array
Values array: holds a posting value and a seek address that points into journal.gz

4
Q

What is a bloom filter?

A

A bit array, created for each bucket and for each search string, that is used to predict whether a lexicon term is likely to be found in the bucket

5
Q

Are false positives and false negatives possible with a bloom filter?

A

False positives are possible

False negatives are not possible
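
To see why a bloom filter can give false positives but never false negatives, here is a minimal Python sketch. It is an illustrative model only, not Splunk's implementation: the bit-array size, the number of hash positions, and the use of SHA-256 are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: a fixed-size bit array plus k hash positions per term."""

    def __init__(self, size=64, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, term):
        # Derive k bit positions by salting the term with the hash number.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{term}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, term):
        for pos in self._positions(term):
            self.bits[pos] = True

    def might_contain(self, term):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(term))


# Hypothetical bucket whose lexicon holds three terms.
bucket_filter = BloomFilter()
for term in ["error", "web", "login"]:
    bucket_filter.add(term)
```

Every term that was added always answers True, so a bucket that holds the term is never skipped. A term that was never added can still answer True if other terms happen to have set all of its bit positions, which is the false positive case.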

6
Q

What is the series of events for retrieving event data with a bloom filter?

A
  1. A bloom filter is created from the search string
  2. Find the buckets in the index that fall within the time range
  3. Compare the search bloom filter to each bucket's bloom filter
  4. If there is a match, find the search terms in the .tsidx file
  5. Use the .tsidx file to get the events from journal.gz
  6. Do search-time field extractions for the final filter

7
Q

What does the job inspector command.search.index inform you of?

A

The time to get location info in .tsidx

8
Q

What does the job inspector command.search.rawdata inform you of?

A

Time to extract event data from journal.gz

9
Q

What does the job inspector command.search.kv inform you of?

A

Time to perform search time field extractions

10
Q

What do you use to calculate performance with the job inspector?

A

scanCount/time to get events per second including the time to read all events from disk
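
The calculation is a single division. A small Python sketch, where the function name and the job numbers are hypothetical and only the scanCount / time formula comes from the card:

```python
def events_per_second(scan_count, run_seconds):
    """Events scanned per second of search run time."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return scan_count / run_seconds


# Hypothetical job: 1,200,000 events scanned in 48 seconds.
rate = events_per_second(1_200_000, 48)
print(rate)
```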

11
Q

In a distributed environment will the search execute faster if commands are on the SH or the indexer?

A

Execute faster on the indexer

12
Q

Where are transforming commands executed?

A

Operate on the entire results set on the Search head

13
Q

Does order of events matter when running a transforming command?

A

No

14
Q

What are the two types of streaming command?

A

Distributable - could be run on indexer

Centralized - always run on the search head

15
Q

Is the event order important for streaming commands?

A

Distributable - No

Centralized - Yes

16
Q

When is a distributable command run on the search head vs the indexer?

A

Search head if any preceding commands are executed on search head
Indexer if all preceding commands execute on indexer

17
Q

Do streaming commands need the entire event result set prior to executing?

A

Distributable - no

Centralized - yes

18
Q

Do streaming commands operate on the entire results set of event data?

A

No, they operate on each event returned by a search

19
Q

How does having more disk reads affect search execution?

A

More disk reads lead to a longer search execution time

20
Q

How does Splunk decide which events to read after determining which buckets match the bloom filters?

A

Tokens (or terms) from the search string are compared to the tokens in events; a match results in the event being read from disk

21
Q

How are event tokens derived?

A

Derived by breaking up searches and event data using segmenters

22
Q

What are segmenters?

A

Major or minor breakers that separate searches and events into smaller pieces

23
Q

What are major breakers?

A

Character set used to divide words, phrases, and terms into large tokens: space, newline, carriage return, tab, [] () {} ! ? ; , ' " &

24
Q

What are minor breakers?

A

Used to divide large tokens into smaller tokens: / : = @ . - $ # % \ _

25
Q

Where are tokens created from event data stored?

A

.tsidx files

26
Q

Where can you see how the base search was tokenized?

A

Use the job inspector and look for the token after ‘base lispy’

27
Q

In prefix notation where does the operator appear?

A

Before the operands. Example: the search index=web 21.12 produces the lispy [ AND 12 21 index::web ]

28
Q

What is a directive?

A

An instruction for how part of a search should be processed

29
Q

What does the case sensitive ‘TERM’ directive do?

A

Forces Splunk to look only for the complete value: the term is segmented on major breakers only and minor breakers are skipped. The term must be bounded by major breakers in the event data

30
Q

Can key value pairs be passed into the TERM() directive?

A

Yes, because key=value contains only a minor breaker (the = sign)

31
Q

Does negation in a search yield negation in a lispy?

A

Works for negating single terms but not for terms that include minor breakers UNLESS you use TERM()
Ex that will work: NOT TERM(example.1)
Ex does not work: NOT example.1

32
Q

Do wildcards in a search work in a lispy?

A

Only when they are at the middle or end of a string and have no major or minor breakers

33
Q

How do index time fields appear in lispy?

A

Appear as field::value instead of field=value like they do in a search

34
Q

As more fields are extracted at index time what happens to the size of .tsidx files and resource usage at indexer?

A

Both are increased

35
Q

What type of data can have indexed field extractions applied to a specific sourcetype?

A

Sourcetypes with certain types of structured data (JSON, CSV, W3C)

36
Q

Do comparisons against fields extracted at search time result in filtering events returned from the disk?

A

No, all events from a sourcetype will still be read from the disk (except for an equals operator).

37
Q

What would the lispy be for the search: index=web sourcetype=a_c status>400 ?

A

Lispy: [ AND index::web sourcetype::a_c ]

Comparisons are not included when filtering what is read from disk

38
Q

Does the TERM() directive work with aliases?

A

No

39
Q

When are lookups completed while using a lispy?

A

In the search itself, lookup can be done before the lispy is created.
Lookups done for transformations or other pipe commands can be done post lispy

40
Q

What does a subsearch do?

A

Takes the results from an inner search and combines them with the outer search using a boolean AND. A boolean OR is inserted between each inner search result
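
A small Python sketch of that expansion: each inner result becomes a parenthesized group, the groups are joined with OR, and the whole expression is added to the outer search, where it is ANDed implicitly with the other terms. The function name, field name, and values are hypothetical, and the exact string Splunk generates may differ.

```python
def subsearch_to_expression(results):
    """Turn inner-search rows into AND groups joined by OR."""
    groups = []
    for row in results:
        pairs = [f'{field}="{value}"' for field, value in row.items()]
        groups.append("( " + " AND ".join(pairs) + " )")
    return "( " + " OR ".join(groups) + " )"


# Hypothetical inner-search results.
inner_results = [{"clientip": "10.0.0.1"}, {"clientip": "10.0.0.2"}]
expression = subsearch_to_expression(inner_results)

# Terms placed side by side in the outer search are ANDed implicitly.
outer_search = f"index=web {expression}"
```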

41
Q

What command do subsearches typically begin with?

A

[ search search_criteria… ]

42
Q

Is an inner search or outer search completed first?

A

Inner subsearch completed first

43
Q

How do you send only specific fields of a subsearch to the outer search?

A

Using the fields or return command

44
Q

What is returned by default when using the return command in a subsearch?

A

First value of each specified field is returned with the field name and the field value

45
Q

How do you adjust the defaults for the |return command in a subsearch?

A

Specify a count to return more than the first value, and put $ before the field name to return only the values without the field name:
|return 5 $ip_address => returns the first 5 values of ip_address

46
Q

Can you return subsearch results as an alias?

A

Yes, use |return alias_name=field

47
Q

What is the time and event count limit on a subsearch?

A

60 seconds and 10,000 events

48
Q

Over what time range will a subsearch execute if the root search is run in real-time?

A

It runs over all time by default, unless earliest and latest are set in the subsearch

49
Q

Should you use stats and/or eval over using subsearch?

A

Yes, whenever possible especially if searches are executed often

50
Q

What does the append command do?

A

Using only historical data, appends results of subsearch to the current results

51
Q

Does the |append command overlay primary and subsearch results?

A

No, they will appear one after another in a graph (when used with both stats and timechart as main search)

52
Q

How do you get appended results to overlay primary search results?

A

Use | stats first(*) as *

Or | timechart first(*) as *

53
Q

What command will overlay search and subsearch results in one step?

A

|appendcols

54
Q

When should you take caution using the appendcols command?

A

When used with stats because if there is a null value in the data, the empty values get pushed up and values may not be aligned to fields appropriately

55
Q

For larger amounts of data, should you use append and appendcols?

A

No, they should be avoided as more efficient searches should be used

56
Q

If a search can be done with stats or join/union which is usually more efficient?

A

stats command is usually more efficient

57
Q

What does the join command do?

A

Combines results from two searches with a default of inner join - results only include events from first search that match the second search

58
Q

What does the left, or outer join argument of the join command define?

A

Includes all results from the first search, plus the data from any events in the second search that match them
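
The difference between the two join types can be modelled in a few lines of Python. This is a sketch with hypothetical data and function names. Keeping only the first right-hand match per key is an assumption of the sketch.

```python
def join(left_rows, right_rows, field, how="inner"):
    """Join two lists of dicts on one field; how is 'inner' or 'left'."""
    lookup = {}
    for row in right_rows:
        # Keep only the first right-hand row per key (assumption of this sketch).
        lookup.setdefault(row[field], row)
    joined = []
    for row in left_rows:
        match = lookup.get(row[field])
        if match is not None:
            joined.append({**row, **match})
        elif how == "left":
            # A left (outer) join keeps first-search rows that have no match.
            joined.append(dict(row))
    return joined


orders = [{"user": "ann", "total": 30}, {"user": "bob", "total": 12}]
profiles = [{"user": "ann", "city": "Oslo"}]

inner = join(orders, profiles, "user")
left = join(orders, profiles, "user", how="left")
```

Here `inner` holds only ann's row, enriched with the city, while `left` also keeps bob's row unchanged.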

59
Q

How do you define which fields to use for the join command?

A

Define after the |join command:

| join fields_to_use

60
Q

What search combines two or more result sets into a single search?

A

union

61
Q

What can be used to specify a result search when using the union command?

A

Use a regular search, a subsearch, or a data model (defined with the datamodel command)

62
Q

Does the union command execute on the search head or indexer?

A

As a distributable streaming command it executes in parallel on indexers if all searches are distributable otherwise on search head

63
Q

What is the union syntax?

A

union datamodel:name1.dataset datamodel:name2.dataset …
Search1 |union [search2]
Search1 |union datamodel:search2 …

64
Q

What command puts numerical values into discrete sets?

A

bin

| bin binoptions field (binoptions is optional)

65
Q

How are bin sizes set?

A

Through the span=size option

| bin fieldname span=size
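
The idea behind a span is that each value is rounded down to the start of its bucket. A minimal Python sketch, where the function name is hypothetical and the "low-high" label format is an assumption of the sketch:

```python
def bin_value(value, span):
    """Return the 'low-high' bucket label for a numeric value."""
    low = (value // span) * span
    return f"{low}-{low + span}"


labels = [bin_value(v, 10) for v in [3, 10, 27, 99]]
```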

66
Q

What happens if span creates more buckets than the maximum specified by bins?

A

bins is ignored

67
Q

What command reformats chartable, tabular output as a stats-like output?

A

untable

| untable xfield yfield datafield
xfield: the x-axis field; yfield: the field that holds the data labels; datafield: the field that holds the charted data

68
Q

What command reformats stats-like output as chartable, tabular output?

A

xyseries xfield yfield datafield
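
The two reshapes are inverses of each other, which a short Python model makes visible. This is a sketch of the idea only. The sample table and the field names are hypothetical.

```python
def untable(rows, xfield, yfield, datafield):
    """Wide rows -> one (x, label, value) row per charted cell."""
    out = []
    for row in rows:
        for name, value in row.items():
            if name != xfield:
                out.append({xfield: row[xfield], yfield: name, datafield: value})
    return out


def xyseries(rows, xfield, yfield, datafield):
    """Long rows -> one wide row per x value, with a column per label."""
    wide = {}
    for row in rows:
        target = wide.setdefault(row[xfield], {xfield: row[xfield]})
        target[row[yfield]] = row[datafield]
    return list(wide.values())


chart = [
    {"host": "web1", "GET": 5, "POST": 2},
    {"host": "web2", "GET": 7, "POST": 1},
]
long_rows = untable(chart, "host", "method", "count")
round_trip = xyseries(long_rows, "host", "method", "count")
```

`long_rows` has one row per host and method pair, and `round_trip` rebuilds the original table.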

69
Q

What does the foreach command do?

A

foreach runs a templated subsearch once for each matching field, replacing the <<FIELD>> token with the field name: |foreach www* [eval <<FIELD>> = round(<<FIELD>>/2)]
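
A Python model of the same idea as the card's example: every field whose name matches www* is replaced by half its value, rounded. The event and the helper name are hypothetical, and this only imitates the effect rather than reproducing how Splunk evaluates the template.

```python
from fnmatch import fnmatchcase


def foreach(event, pattern, func):
    """Apply func to every field whose name matches the wildcard pattern."""
    return {
        name: func(value) if fnmatchcase(name, pattern) else value
        for name, value in event.items()
    }


event = {"www1": 10, "www2": 8, "db1": 4}
halved = foreach(event, "www*", lambda v: round(v / 2))
```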

70
Q

What are multivalue eval functions used for?

A

Used to analyze and format multivalue data

71
Q

What command do you use to convert a single value into a multivalue field?

A

|makemv command

72
Q

What do JSON array contents become when auto extracted by Splunk?

A

Contents become multivalue fields

73
Q

In JSON data, what do the {} and [] indicate?

A

{} is an object, a grouping of field value pairs

[] is an array of objects

74
Q

How are fields nested within a JSON event represented when extracted into Splunk?

A

Event_name{}.field1

Event_name{}.field2 …

75
Q

Which commands can you use with multivalue functions?

A

The eval, where, and fieldformat commands

76
Q

What does the mvsort() function do?

A

Takes a multivalue field and returns the values sorted lexicographically

77
Q

How are numbers sorted in lexicographical order?

A

Numbers are sorted before letters and are sorted based on the first digit not the number as a whole: 100 200 70 9 is in order

78
Q

Are uppercase or lowercase letters first in lexicographical order?

A

Uppercase is first
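
Python's default string sort compares values character by character, which for plain ASCII text gives the same ordering these cards describe. Treating it as a stand-in for mvsort() is an assumption of this sketch, and the sample values are made up.

```python
values = ["b", "70", "A", "9", "a", "100", "B", "200"]

# Strings are compared character by character, not as numbers.
ordered = sorted(values)
print(ordered)
```

Digits come before uppercase letters, uppercase before lowercase, and "100" sorts before "70" because "1" is compared with "7" first.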

79
Q

Can functions and commands process fields that contain a {}?

A

No, so mv fields extracted from JSON will have to be renamed

80
Q

What does the mvfilter function do?

A

Filters (refines) a single mvfield based on a boolean expression

81
Q

How do you remove null values returned from mvfilter function?

A

Use mvfilter(isnotnull(x))

82
Q

What function concatenates individual values from a mvfield and uses a delimiter as a separator?

A

mvjoin(fieldname, "delimiter")

83
Q

What function takes 2 mvfields and concatenates the first values of each, the second values of each etc. with a delimiter to separate?

A

mvzip(mvfield1, mvfield2, "delimiter")

84
Q

What does the mvcount function do?

A

Returns count of values in the specified mvfield returning null if no field values or field does not exist: mvcount(fieldname)

85
Q

What does the split function do?

A

Takes a single value field and a delimiter to split by and creates a new mvfield: split(fieldname, "delim")

86
Q

What does the mvindex function do?

A

Takes an mvfield and an integer and returns the value at that index in the mv array. Indexing starts at 0, NOT 1: mvindex(fieldname, indexnum)
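
The multivalue functions on the last few cards can be modelled with Python lists. These are hypothetical stand-ins that share the function names for readability, not Splunk's implementations. The sample values are made up.

```python
def mvjoin(values, delim):
    """Concatenate all values into one string, separated by delim."""
    return delim.join(values)


def mvzip(left, right, delim=","):
    """Pair up first with first, second with second, and so on."""
    return [f"{a}{delim}{b}" for a, b in zip(left, right)]


def mvcount(values):
    """Number of values, or None when there are no values."""
    return len(values) if values else None


def split(value, delim):
    """Turn a single delimited string into a list of values."""
    return value.split(delim)


def mvindex(values, index):
    """Value at a zero-based index, or None when out of range."""
    return values[index] if -len(values) <= index < len(values) else None


users = split("ann;bob;cy", ";")
ports = ["22", "80", "443"]
pairs = mvzip(users, ports, ":")
```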

87
Q

What command converts an existing single value field to an mvfield based on a delimiter or a regex (referred to as a tokenizer)?

A

makemv (delim=string | tokenizer=regex) fieldname

88
Q

What does the mvexpand command do?

A

Takes mvfield and creates separate event for each value in the mvfield:
|mvexpand fieldname
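
A Python model of the expansion: one output event per value, with every other field copied unchanged. The function and the sample event are hypothetical.

```python
def mvexpand(events, field):
    """Return one copy of each event per value of the multivalue field."""
    expanded = []
    for event in events:
        values = event.get(field)
        if not isinstance(values, list):
            # Single-value or missing field: pass the event through as is.
            expanded.append(dict(event))
            continue
        for value in values:
            expanded.append({**event, field: value})
    return expanded


events = [{"host": "web1", "port": ["80", "443"]}]
expanded = mvexpand(events, "port")
```

The input list is left untouched, so the new events exist only in the result.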

89
Q

Does mvexpand command create new events on disk/in index?

A

No, only created in memory for purposes of search at hand

90
Q

What is referred to as any group of conceptually related events?

A

A transaction via the | transaction command

91
Q

What does the |transaction command enable?

A

Enables you to specify criteria used to determine how to group events via ranges of time, # of events, text contained in events

92
Q

Is stats or transaction faster?

A

Stats, as transaction is resource intensive and should be used only when stats is insufficient

93
Q

In what order do events need to be ingested for transactions to work?

A

Reverse chronological order

94
Q

In order to use |transaction, how do you correct events that are not coming in reverse chronological order?

A

Use | sort -_time

Use right before |transaction

95
Q

How do you find events that occur before or after a specific event?

A

Use |transaction fieldname (endswith=() | startswith=() )

96
Q

What function is used to normalize field names by taking a number of arguments and returning first one that is not null and storing it as new field?

A

coalesce(field1,field2,…)
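
A Python sketch of the normalization: the first non-null argument wins. The field names clientip and src_ip are hypothetical examples of two names for the same thing.

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


events = [{"clientip": "10.0.0.1"}, {"src_ip": "10.0.0.2"}, {}]
normalized = [coalesce(e.get("clientip"), e.get("src_ip")) for e in events]
```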

97
Q

What is the keepevicted=1 argument used for when dealing with transactions?

A

It is a setting used to retain any transactions where one or both of beginning/ending criteria are not satisfied (transaction did not complete successfully)

98
Q

What field is used to determine if a transaction is complete or incomplete?

A

closed_txn = 1 if transaction is a success or closed_txn = 0 if transaction is not a success

99
Q

What has to be met for a transaction to be closed?

A

One or more of these criteria are met: maxevents, maxpause, maxspan, startswith, endswith

100
Q

Where does |transaction execute?

A

Executes on the SH as it is a centralized streaming command

101
Q

Does |transaction require access to all the _raw data?

A

Yes, because search is forced to send all _raw data back from indexers to search head as transactions require all event data

102
Q

Can the timepicker be overridden during a search?

A

Yes through the earliest= & latest= time modifiers

103
Q

When snapping to a time, does the time round up or down?

A

Always rounds down (backward to a previous time)

104
Q

How do you define a search for the past 24 hours using earliest and latest modifiers?

A

Mainsearch earliest=-24h@h latest=@h
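
Snapping can be modelled as truncating a timestamp to the start of its unit. In this Python sketch the "now" value and the helper name are hypothetical, and only the hour and day units are handled.

```python
from datetime import datetime, timedelta


def snap(moment, unit):
    """Round a datetime down to the start of its hour ('h') or day ('d')."""
    if unit == "h":
        return moment.replace(minute=0, second=0, microsecond=0)
    if unit == "d":
        return moment.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported snap unit: {unit}")


now = datetime(2024, 1, 15, 13, 47, 10)  # hypothetical current time

earliest = snap(now - timedelta(hours=24), "h")  # models earliest=-24h@h
latest = snap(now, "h")                          # models latest=@h
```

Both boundaries land on the top of an hour, so the window covers exactly 24 complete hours and excludes the partial current hour.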

105
Q

What are default time fields?

A

The date_* fields: date and time values taken directly from the timestamp in the raw event. They provide extra information for searching, but they do not reflect time zone conversions or changes to the time value

106
Q

How would you exclude events from the current day?

A

latest=@d

107
Q

How would you include data beginning at the start of the day, 2 days ago?

A

earliest=-2d@d

108
Q

What does date_hour>=2 AND date_hour<5 represent?

A

Find events between 2am and 5am

109
Q

What function can be used to create a new field with adjusted time zones?

A

|eval new_time_field = strftime(_time, "%H")

This gets the hour from the event and converts it to your local time based on your time zone setting