REGEX Flashcards

1
Q

Why does Regex matter?

A

-Clarifies how Splunk brings data into its indexes
-Shows you how to read logs and pick out individual events
-Gives insight into why and how regex can be applied
-Helps with onboarding data

2
Q

Logs vs Events

A

A. LOG = a file that contains specific types of events, recording everything that happened over a particular period of time

B. EVENT = a single record within a log; it describes one action, which makes it distinct from the other “things” that happen

3
Q

Process of log rolling (rotation) in Linux

A

-When a log rolls over, a new file is created; as the old file ages it is renamed with a number suffix, such as syslog.1

-The most current logs are in the files without a number; a suffix such as “.gz” (or “.z”) marks a zipped file, which saves space

-Some logs roll over by date; others roll over by volume

-Splunk has usually already ingested most rolled-over files and indexed them as events; they are essentially backup files that can be cleaned up or deleted if you need more space on the Linux host

4
Q

How does Splunk know what logs to watch and ingest into its indexes?

A

A monitor stanza ([monitor://<path>]) in inputs.conf

5
Q

Reasons for lags in Splunk ingesting logs into indexes

A

-Sources are in distant locations or in different time zones

-The system might not be running efficiently

-Servers might have outdated hardware

-The Splunk deployment might not be scaled to handle the log volume

-The forwarder might not be working properly

6
Q

Ingestion rate of log files

A

Log files can be ingested in batches (for example, once per day), but live streaming is usually the preferred method

7
Q

True or False: Each event in a web access log records a client IP address, and different IP addresses generally indicate different users

A

True

8
Q

What is the GET method?

A

GET = an HTTP request method used to retrieve a resource; in web logs it shows a visitor viewing different components of the cart/website

9
Q

What is the purpose of JSESSIONID?

A

A session identifier given to visitors when they log in; it identifies a user of that site only for the duration of their visit

10
Q

What does Regex do?

A
  1. Filtering-eliminates unwanted data from searches
  2. Matching-advanced pattern matching to find the results you need
  3. Field Extractions-labeling bits of data so they can be used for further calculations and analysis
11
Q

Greedy Regex vs Lazy Regex

A

Greedy Regex = your regex statement uses an operator that keeps “gobbling” up characters for as long as it can, giving them back only when the rest of the pattern requires it

Lazy Regex = your regex matches as few characters as possible and stops at the first point where the rest of the pattern can match
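
A minimal sketch of the difference, using Python's re module as a stand-in (Splunk uses PCRE, but greedy and lazy quantifiers behave the same way in both). The sample text is made up for illustration.

```python
import re

# Made-up sample text with two quoted values.
line = 'user="alice" action="login"'

# Greedy: .* takes as much as it can, so the match runs to the LAST quote.
greedy = re.search(r'"(.*)"', line).group(1)

# Lazy: .*? takes as little as it can, so the match stops at the NEXT quote.
lazy = re.search(r'"(.*?)"', line).group(1)

print(greedy)  # alice" action="login
print(lazy)    # alice
```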

12
Q

Index vs Search Time Extractions

A

At Index Time: the indexers are told to also extract field/value pairs from the data AS they commit the data to disk.

***Burdens the indexers and can affect performance

At Search Time: the search head performs extractions as it brings your data back from the indexers.

***Usually the better option; these field extractions act as knowledge objects that reside on the SH.

13
Q

True or False? Metadata is attached to every single event so that you know where the event came from.

A

True

14
Q

Describe PROD, DEV, UAT, & SANDBOX

A

PROD-Production
Where all changes are finalized and go live; end users or customers use this environment

DEV-Development
Most commonly used for development and testing. Sometimes isolated from prod, sometimes connected; this environment is unpublished and is ideally an exact copy of prod

UAT-User Acceptance Testing
Once the testing phase is over, the users who will be using the application must approve your work on their end

SANDBOX-Testing
An isolated environment where you can safely write code and run tests without any communication with the production environment

15
Q

Standard flow of data through environments

A

1. Data first goes into dev; this is where you onboard your data
2. Work done in dev is copied to UAT (UAT is the middle environment, where the team that needs your work reviews what you’ve done)
3. Work approved in UAT is then sent to prod (do not make ANY changes without permission)

16
Q

List the different ways to onboard data.

A

-UF (Universal Forwarder)
-Syslog
-HEC (HTTP Event Collector)
-API Collection
-Scripted Inputs
-One Shot Upload

17
Q

List primary .conf files for onboarding

A

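-inputs.conf: specifies what data Splunk collects and how
-outputs.conf: specifies where a forwarder sends its data
-authentication.conf: specifies how Splunk users are authenticated
-authorize.conf: specifies the roles and permissions that Splunk users have
-serverclass.conf: defines the server classes that a deployment server uses to map apps to deployment clients
-props.conf: specifies how data is parsed and which extractions and transforms apply to it
-transforms.conf: defines the transformations that props.conf refers to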

18
Q

Outline onboarding data in Splunk

A

1-Gather requirements from the data owner
2-Decide between a custom and a pre-made TA
3-Decide which onboarding method you will use to bring the data in: API/UF/syslog/DB Connect/scripted inputs
4-Determine where the data will reside in Splunk: which index you will use and which sourcetype is linked to that data
5-Obtain a sample of the data or log from the data owner, use props.conf settings to test the sample, then bring the data in through all the appropriate config files
6-Decide who needs access to this data

19
Q

Process for Building Custom TA/APP

A

Step 1: Create a new directory for your TA.

The directory should be located in the $SPLUNK_HOME/etc/apps directory.

Step 2: Create an app.conf file in the default subdirectory of the new directory.

The app.conf file contains the basic configuration settings for your TA.

Step 3: Add the following configuration settings to the app.conf file:

label (under [ui]): The display name of your TA.
version (under [launcher]): The version of your TA.
author (under [launcher]): The author of your TA.
description (under [launcher]): A description of your TA.

Step 4: Add any additional configuration files that your TA needs.

For example, inputs.conf, props.conf, and transforms.conf go in the default directory; if your TA includes a dashboard, its XML file goes under default/data/ui/views.

Step 5: Create any scripts that your TA needs.

For example, if your TA includes a scripted input, place the script in the bin directory of the TA.

Step 6: Package your TA.

This creates a single archive file (typically .tgz or .spl) that contains all of the configuration files and scripts for your TA.

Step 7: Test your TA.

Make sure that your TA works properly by deploying it to a test environment and testing all of its features.

Step 8: Deploy your TA to Splunk.

You can do this using Splunk Web or the Splunk CLI.

20
Q

Global Stanza vs Local Stanza

A

Global stanza = settings (under [default] or at the top of the file) that set the standard for, and are applied to, all stanzas beneath it

Local stanza = settings in a specific stanza; they override the global settings for that stanza

21
Q

More definitions of attributes

A

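-The UF has no web.conf, because it has no web interface
-props.conf = mainly for indexers (and heavy forwarders), where parsing happens
-inputs.conf is needed to tell the forwarder what data to collect and where to find it
-server.conf = used for clustered environments; non-clustered instances already have it in default
-inputs.conf opens the receiving port (9997 by convention) on the indexer; on the forwarder it defines the data being collected
-server.conf in system/local lists the server name, among other settings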

22
Q

Where do authentication.conf and authorize.conf belong?

A

On the search head (SH)

23
Q

What is a field/value pair in Splunk?

A

A single unit of data consisting of a field name and a field value.

Field names identify the type of data being stored

Field values are the actual data

24
Q

What do control characters (anchors) do?

A

They tell regex where to start and stop looking; they match a position rather than a character

25
Q

What control characters do you know? What do they do?

A

^ - start of a line

$ - end of a line

\b - word boundary (use it on both sides to match whole words only)

\B - not a word boundary (for example, a position between two word characters)
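
The four anchors can be tried out with a short sketch in Python's re module, used here as a stand-in for Splunk's PCRE engine. The sample text is made up.

```python
import re

# Made-up sample text.
text = "error: disk error in errorlog"

# ^ anchors the pattern to the start of the text (each line with re.MULTILINE).
starts_with_error = re.search(r"^error", text) is not None

# $ anchors the pattern to the end of the text.
ends_with_log = re.search(r"log$", text) is not None

# \b on both sides: only the standalone word "error" matches, not "errorlog".
whole_words = re.findall(r"\berror\b", text)

# \B after "error": a word character must follow, so only "errorlog" qualifies.
inside_word = re.findall(r"error\B", text)

print(starts_with_error, ends_with_log)    # True True
print(len(whole_words), len(inside_word))  # 2 1
```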

26
Q

What are character classes?

A

Character classes distinguish kinds of characters, for example letters versus digits

27
Q

What are the character classes that you know? What do they do?

A

\s - white space

\S - not a white space

\d - digit

\D - non-digit

\w - word character (letters, digits, and underscore)

\W - non-word character
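
A small sketch of these classes at work, assuming Python's re module as a stand-in for Splunk's PCRE engine; the event text is invented.

```python
import re

# Invented event text.
event = "user_42 logged in at 10:15"

digit_runs = re.findall(r"\d+", event)   # runs of digits
word_runs = re.findall(r"\w+", event)    # runs of letters, digits, underscores
digits_only = re.sub(r"\D", "", event)   # drop every non-digit character
tokens = re.split(r"\s+", event)         # split on runs of whitespace

print(digit_runs)   # ['42', '10', '15']
print(word_runs)    # ['user_42', 'logged', 'in', 'at', '10', '15']
print(digits_only)  # 421015
print(tokens)       # ['user_42', 'logged', 'in', 'at', '10:15']
```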

28
Q

What is the escape character in regex?

A

The backslash character (\).

It is used to escape special characters in regex so that they are treated literally
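
A quick sketch of why escaping matters, using Python's re module as a stand-in. The lookalike string is a made-up value chosen to show what an unescaped dot lets through.

```python
import re

lookalike = "10a0b0c1"   # not an IP address, but has characters where the dots go

unescaped = r"10.0.0.1"    # each . matches ANY single character
escaped = r"10\.0\.0\.1"   # each \. matches only a literal dot

matches_unescaped = re.fullmatch(unescaped, lookalike) is not None
matches_escaped = re.fullmatch(escaped, lookalike) is not None

print(matches_unescaped)  # True
print(matches_escaped)    # False
print(re.escape("10.0.0.1") == escaped)  # True: re.escape adds the backslashes
```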

29
Q

Which operators do you know? What do they do?

A

* - zero or more

+ - one or more

? - zero or one
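
A short sketch comparing the three operators on the same made-up samples, with Python's re module as a stand-in. The helper name matching is hypothetical.

```python
import re

samples = ["ac", "abc", "abbbc"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

star = matching(r"ab*c")      # * : zero or more b
plus = matching(r"ab+c")      # + : one or more b
optional = matching(r"ab?c")  # ? : zero or one b

print(star)      # ['ac', 'abc', 'abbbc']
print(plus)      # ['abc', 'abbbc']
print(optional)  # ['ac', 'abc']
```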

30
Q

What are special characters in regex?

A

For example: . * + ? ^ $ | \ [ ] { } ( )

31
Q

What is the symbol for “OR” condition?

A

|

32
Q

What is the special character for matching any character?

A

.

33
Q

What is a protection character in Splunk?

A

An informal term for a special character that keeps Splunk (or regex) from interpreting certain other characters in its usual way

Protection characters come up in a few places, including:

Escaping a regex metacharacter so that it is matched literally
Keeping a delimiter inside a value from being parsed as a delimiter
Referring to field names that contain special characters

Examples: \ (escapes a character), " (encloses a value), ' (encloses a field name in eval or where)

34
Q

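How do you protect a password field?

A

Splunk has no character that hides a field when wrapped around it. Mask the value at index time instead, for example with a SEDCMD setting in props.conf:

SEDCMD-mask_password = s/password=\S+/password=########/g
Splunk then writes the masked text to the index in place of the original value.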

35
Q

How do you use protection characters to prevent Splunk from parsing certain delimiters?

A

Enclose the value in quote characters. For example, to keep a comma inside a CSV field from being parsed as the delimiter, enclose the field in the " character:

"value, one","value2","value3"
Splunk then treats the comma inside the quotes as a literal character and does not parse it as a delimiter.

36
Q

Examples of how to use protection characters in Splunk:

A

Mask the password value at index time (props.conf stanza, then transforms.conf stanza)
[linux_secure]
TRANSFORMS-mask_password = mask_password

[mask_password]
REGEX = (.*password=)\S+(.*)
FORMAT = $1########$2
DEST_KEY = _raw

Keep quoted commas in a CSV file from being parsed as delimiters (props.conf)
[csv]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,
FIELD_QUOTE = "

Refer to a field whose name contains a colon (wrap the name in single quotes in where or eval)
sourcetype=linux_secure | where 'host:ip'="10.0.0.1"

Search for all events where the password field contains the word “password”
index=my_index sourcetype=linux_secure password="*password*"

37
Q

What are the inclusion characters?

A

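Characters used to group search terms so that only events matching the grouped conditions are included in the results

Examples:
( ) parentheses: group Boolean expressions
{ } curly braces: appear in field names extracted from JSON arrays
[ ] square brackets: enclose a subsearch

This search returns only events where host is example.com and message is error:

(host="example.com" AND message="error")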

38
Q

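What are the exclusion characters in Splunk?

A

Used to exclude events from the search results.

Ex:
!= not-equal operator (exclamation point plus equals sign)
NOT keyword

Either of these searches excludes all events where the host field is equal to example.com:

host!="example.com"
NOT host="example.com"

The != form returns only events that have the field; the NOT form also returns events that lack it.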

39
Q

What is repetition (in regex)?

A

A way to match multiple occurrences of a character or pattern

Two main types of repetition in regex: greedy and non-greedy

40
Q

Greedy vs non-greedy repetition

A

Greedy repetition matches as many occurrences of the character or pattern as possible. For example, the regular expression a* matches zero or more occurrences of the letter a and takes the longest run of consecutive a’s available.

Non-greedy (lazy) repetition matches as few occurrences as possible. For example, the regular expression a+? matches one or more occurrences of the letter a but stops as soon as the rest of the pattern can match; on its own it matches a single a.

41
Q

Examples of greedy & non-greedy repetition

A

a{3}: exactly three consecutive occurrences of the letter a
[0-9]+: one or more consecutive digits
(abc){2,3}: two or three consecutive occurrences of the pattern abc
[a-z]{3,}: three or more consecutive lowercase letters
All four are greedy by default; appending ? (for example [0-9]+?) makes a quantifier non-greedy
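
A sketch of the four quantifiers on a made-up string, using Python's re module as a stand-in. The \b word boundaries are an addition for this example, so that each match has to cover a whole token.

```python
import re

# Made-up sample text.
text = "aa aaa aaaa 7 2024 abcabc"

three_a = re.findall(r"\ba{3}\b", text)                # exactly three a's, as a whole word
digit_runs = re.findall(r"[0-9]+", text)               # one or more digits
abc_repeat = re.search(r"(?:abc){2,3}", text).group()  # abc repeated two or three times
long_words = re.findall(r"\b[a-z]{3,}\b", text)        # three or more lowercase letters

print(three_a)     # ['aaa']
print(digit_runs)  # ['7', '2024']
print(abc_repeat)  # abcabc
print(long_words)  # ['aaa', 'aaaa', 'abcabc']
```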

42
Q

How would you use pattern repetitions in regex?

A

Matching a specific number of occurrences of a character or pattern. For example, the regular expression [a-z]{3} matches exactly three consecutive lowercase letters.

Matching a minimum or maximum number of occurrences of a character or pattern. For example, the regular expression [0-9]{3,5} matches a run of three to five consecutive digits.

Matching a pattern that occurs one or more times. For example, the regular expression (abc)+ matches one or more consecutive occurrences of the pattern abc.

43
Q

What are logical groupings?

A

A way to group patterns together so that they can be treated as a single unit.

There are two main types of logical grouping in regular expressions:

Non-capturing groups, written (?:...): These group parts of a regular expression together but do not capture the matched text.
Capturing groups, written (...): These group parts of a regular expression together and capture the matched text.

44
Q

What are the two main tools for logical grouping in regex?

A

Parentheses and alternation.

Parentheses group patterns together so that they can be treated as a single unit. For example, the regular expression (abc) matches the pattern abc and treats it as a single unit.

Alternation matches one of two or more patterns. For example, the regular expression (abc|def) matches either the pattern abc or the pattern def.
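
A brief sketch of both tools, using Python's re module as a stand-in; the log text is invented.

```python
import re

# Parentheses make "ab" one unit, so + repeats the whole pair.
pair_repeats = re.fullmatch(r"(ab)+", "ababab") is not None
# Without the group, + applies to "b" alone, so "ababab" no longer matches.
only_b_repeats = re.fullmatch(r"ab+", "ababab") is not None

logs = "level=error level=info level=warn"
# Alternation inside a capturing group: findall returns the captured text.
captured = re.findall(r"level=(error|warn)", logs)
# The same alternation in a non-capturing group: findall returns the whole match.
uncaptured = re.findall(r"level=(?:error|warn)", logs)

print(pair_repeats, only_b_repeats)  # True False
print(captured)    # ['error', 'warn']
print(uncaptured)  # ['level=error', 'level=warn']
```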

45
Q

How would you set up logical groupings?

A

Logical groupings can be used to build complex regular expressions that match a wide variety of patterns. For example, the following regular expression matches most common email addresses:

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
The regular expression breaks down into the following parts:

[a-zA-Z0-9._%+-]+ matches the local part of the email address.
@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} matches the @ sign and the domain part of the email address; the \. is an escaped, literal dot.

The two parts are separated by the literal @ character. Wrapping each part in parentheses, as in ([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,}), turns them into capturing groups.
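
A sketch of the email pattern with each part wrapped in a capturing group and the dot before the top-level domain escaped, using Python's re module as a stand-in. The address is a made-up example.

```python
import re

# Local part and domain are each wrapped in a capturing group; \. is a literal dot.
EMAIL = re.compile(r"([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})")

match = EMAIL.search("contact: jane.doe@example.com, thanks")
local_part, domain = match.group(1), match.group(2)

print(local_part)  # jane.doe
print(domain)      # example.com
```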

46
Q

How would you set up logical groupings? (Pt.2)

A

Logical groupings can also be used to exclude certain patterns from a match. For example, the following regular expression matches 10-digit phone numbers that do not contain the digit “1”:

\b(?!\d*1)\d{10}\b
The regular expression breaks down into the following parts:

(?!\d*1) is a negative lookahead assertion that rejects the match if the digit “1” appears anywhere in the run of digits ahead.
\d{10} matches a 10-digit phone number, and the \b on each side keeps the match from starting or ending inside a longer number.
The negative lookahead is placed at the start so that it checks every digit; placed at the end, as in \d{10}(?!1), it would only require that the character following the number is not “1”.
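
A sketch of how much the position of the lookahead matters, using Python's re module as a stand-in. The phone numbers are made up, and the leading-lookahead pattern is one possible way to reject any number containing a 1, not the only one.

```python
import re

numbers = ["2025550999", "2125550999", "2025550991"]

# Lookahead at the END: only checks the character that follows the ten digits.
trailing = re.compile(r"\d{10}(?!1)")
# Lookahead at the START: rejects the number if a 1 appears anywhere in its digits.
leading = re.compile(r"\b(?!\d*1)\d{10}\b")

kept_by_trailing = [n for n in numbers if trailing.fullmatch(n)]
kept_by_leading = [n for n in numbers if leading.fullmatch(n)]

print(kept_by_trailing)  # all three: nothing follows the digits, so (?!1) passes
print(kept_by_leading)   # ['2025550999']
```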

47
Q

What are named capture groups?

A

A way to extract specific parts of a string using regular expressions

They let you give a name to a group in a regular expression.

For example, the following regular expression creates a named capture group called username:

(?<username>\w+)
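
A sketch of a named capture group in Python's re module, used as a stand-in. Note the spelling difference: Python writes (?P<name>...), while Splunk's PCRE engine also accepts the (?<name>...) form. The event text is invented.

```python
import re

# Python spells a named group (?P<name>...); PCRE also accepts (?<name>...).
pattern = re.compile(r"user=(?P<username>\w+)")

match = pattern.search("action=login user=alice_01 status=ok")

print(match.group("username"))  # alice_01
print(match.groupdict())        # {'username': 'alice_01'}
```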

48
Q

What is negative lookahead and how does it work?

A

A way to match a pattern only when it is not followed by another pattern. It is denoted by the syntax (?!pattern)

For example, the following regular expression matches the word “error” only where it is not immediately followed by the word “warning”:

error(?!warning)

49
Q

What is positive lookahead and how does it work?

A

A way to match a pattern only when it is followed by another pattern. It is denoted by the syntax (?=pattern)

50
Q

What is an example of positive lookahead in regex?

A

The following regular expression matches the word “error” only where it is immediately followed by the word “warning”; the word “warning” itself is not part of the match:

error(?=warning)
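
A sketch contrasting positive and negative lookahead on the same made-up text, using Python's re module as a stand-in.

```python
import re

# Made-up text: "error" at index 0 is followed by "warning"; "error" at index 13 is not.
text = "errorwarning error_only"

# Positive lookahead: "error" only where "warning" follows immediately.
positive = [m.start() for m in re.finditer(r"error(?=warning)", text)]
# Negative lookahead: "error" only where "warning" does NOT follow.
negative = [m.start() for m in re.finditer(r"error(?!warning)", text)]
# The lookahead consumes nothing, so the matched text is just "error".
matched = re.search(r"error(?=warning)", text).group()

print(positive)  # [0]
print(negative)  # [13]
print(matched)   # error
```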

51
Q

What is rex?

A

rex is a Splunk Search Processing Language (SPL) command that extracts fields from raw data based on a pattern that you specify using regular expressions with named capture groups.

rex can be used for a variety of tasks, such as:

Extracting fields from log files
Parsing values out of semi-structured text
Replacing or masking text with sed-style expressions (mode=sed)

Here is an example of how to use rex in Splunk:

Extract the IP address from a web server log line:
sourcetype=web_server
| rex field=_raw "(?<ip>\d+\.\d+\.\d+\.\d+)"
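
The same IP extraction can be tried outside Splunk. This is only a rough Python analogue of what the rex command does with a named group, not Splunk's implementation; the log line is invented.

```python
import re

# Invented web server log line.
raw = '203.0.113.7 - - [10/Oct/2023:13:55:36] "GET /cart.do HTTP/1.1" 200'

# Python spells the named group (?P<ip>...); in rex it is written (?<ip>...).
match = re.search(r"(?P<ip>\d+\.\d+\.\d+\.\d+)", raw)
fields = match.groupdict() if match else {}

print(fields)  # {'ip': '203.0.113.7'}
```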