Cribl Admin CCOE Flashcards

1
Q

Which of the following is a valid JavaScript method?

A

.startsWith
.endsWith
.match
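For context, a quick sketch of these methods in a filter-style JavaScript expression (the field names here are hypothetical, not from the course):

sourcetype.startsWith('cisco')      // true if sourcetype begins with 'cisco'
host.endsWith('.example.com')       // true if host ends with '.example.com'
_raw.match(/error\s\d+/)            // returns a match array, or null if no match

Note that JavaScript method names are case-sensitive, so .startswith (all lowercase) would throw an error.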

2
Q

Which of the following logical operators is used as an “and” operator?

A

&&

3
Q

Value Expressions can be used in the following locations

A

Capture Screen and Routes Filtering Screen
Routes Filtering Screen and Pipeline Filtering
Pipeline Filtering and Capture Screen
None of the above is the correct answer (those screens use Filter Expressions; Value Expressions assign values in Functions)

4
Q

Value Expressions are used to evaluate true or false.

A

False

5
Q

Which of the following logical operators is used as a “not” operator?

A

!

6
Q

Git

What command shows you the files that have changed, been added, or are tracked?

A

git status

7
Q

What order must you use to add a new file to a remote repository?

A

add, commit, push
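A minimal example of that sequence (assuming the remote is already configured as origin and the branch is named main; the file name is hypothetical):

git add inputs.yml
git commit -m "Add inputs.yml"
git push origin main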

8
Q

Which command allows you to see a history of commits?

A

git log

9
Q

Which command allows you to add a file to the repository?

A

git add

10
Q

Worker Process

A

A process within a Single Instance, or within Worker Nodes, that handles data inputs, processing, and outputs. Worker Processes operate in parallel. Each Worker Process maintains and manages its own outputs.

11
Q

Worker Node

A

An instance running as a managed worker, whose configuration is fully managed by the Leader Node

12
Q

Worker Group

A

A collection of Worker Nodes that share the same configuration

13
Q

Leader Node

A

an instance running in Leader mode, used to centrally author configurations, and monitor a distributed deployment

14
Q

Mapping Ruleset

A

an ordered list of Filters, used to map Workers to Worker Groups

15
Q

Which of the following is not a Worker responsibility?

A

Back up to Git (local only)

16
Q

Which of the following is not an advantage of a Distributed deployment over a single instance?

A

Advanced data processing capabilities
(Actual advantages include higher reliability and unlimited scalability)

17
Q

Load Balancing among the Worker Processes is done the following way:

A

The first connection will go to a random Worker Process, and the remaining connections will go in increasing order to the subsequent Worker Processes.

18
Q

All Cribl Stream deployments are based on a shared-nothing architecture pattern, where instances/Nodes and their Worker Processes operate separately

A

True!

19
Q

The Single Stream instance is valid for dev, QA or testing environments

A

True

20
Q

In Distributed Mode, the Worker Node…

A

is Stateless
Can continue running even without communication to the Leader with limitations
Can be accessed from inside the Leader
The main path between Sources and Destinations

21
Q

Which of the following is true regarding Worker and Leader communication?

A

Worker initiates the communication between Leader and Workers

22
Q

Worker processes within a Node are distributed using a round robin process based on connections

A

True

23
Q

Which of the following are valid Stream deployment options?

A

Single Instance (software loaded on single host)
Distributed Deployment (Leader and Workers)
Stream deployed in Cribl’s cloud (SaaS)
Stream deployed in customers own cloud instance

24
Q

Worker Group to Worker Group communication is best done by using…

A

Stream TCP
and
Stream HTTP

25
Cribl.Cloud advantages
Simplified administration
Simplified distributed architecture
Git preconfigured
Automatic restarts and upgrades
Simplified access management and security
Transparent licensing
26
Cribl.Cloud does not provide TLS encryption on any Sources
False
27
Cribl.Cloud allows for Stream to Stream communication from Cloud Worker Groups to on-prem Worker Groups
True
28
Cribl.Cloud allows for restricted access to certain IP addresses
True
29
When using Stream in Cribl.Cloud, how do you get data into the cloud?
Using common data sources that are pre-configured (TCP, Splunk, Elastic, etc.)
Using ports 20000-20010 that are available to receive data
30
Cribl.Cloud has preconfigured ports you can use to bring in data
True
31
Which of the following is not valid for a Cribl.Cloud deployment?
Single Stream instance
Distributed Stream instance with Leader on-prem & Workers in the Cribl.Cloud
32
Which of the following are benefits when using Cribl.Cloud?
Simplified administration
Git preconfigured
Automatic upgrades
33
Cribl.Cloud cannot integrate with an on-prem Cribl Worker Group
False
34
Cribl.Cloud allowed ports include
20000-20010
35
Cribl.Cloud does not provide any predefined sources
False
36
What affects performance/sizing?
Event Breaker Rulesets
Number of Routes
Number of Pipelines
Number of Clones
Health of Destinations
Persistent Queueing
37
Estimating Deployment Requirements
Allocate 1 physical core for each 400GB/day of IN & OUT throughput. Example: 100GB in -> 100GB out to 3 destinations = 400GB total; 400GB / 400GB = 1 physical core.
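The same rule written out as a small JavaScript calculation (the 100GB / 3-destination numbers are just the example from the card):

const gbIn = 100;
const gbOut = 100 * 3;                      // 100GB out to 3 destinations
const totalGbPerDay = gbIn + gbOut;         // 400GB/day of IN & OUT throughput
const physicalCores = totalGbPerDay / 400;  // 1 physical core per 400GB/day => 1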
38
Which of the following will impact your choice for amount of RAM?
Persistent Queueing requirements
39
Cribl Worker Process default memory is
2GB RAM
40
How many Worker Nodes, each with 16 vCPUs, are needed to ingest 10TB and send out 20TB?
11 Worker Nodes
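One way to arrive at 11, under a couple of assumptions that are not spelled out on the card (1 physical core ≈ 2 vCPUs, so each vCPU handles ~200GB/day, and ~2 vCPUs per node are reserved for the OS/overhead):

const totalGbPerDay = 10000 + 20000;                   // 10TB in + 20TB out
const vCpusNeeded = totalGbPerDay / 200;               // ~150 vCPUs at 200GB/day per vCPU (assumption)
const usablePerNode = 16 - 2;                          // assumption: reserve 2 of 16 vCPUs per node
const nodes = Math.ceil(vCpusNeeded / usablePerNode);  // ceil(150 / 14) = 11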
41
Cribl recommends you use the following specifications?
16vCPU per Worker Node
42
How can a Stream deployment be scaled to support high data and processing loads?
Scale up with higher system performance (CPU, RAM, Disk) on a single platform
Scale out with additional platforms
Add more Worker Groups
43
With a very large # of sources (UFs), it is possible to exhaust the available TCP ports on a single platform
True
44
Leaders require higher system requirements than workers
False
45
Persistent Queueing (Source & Destination) might impact performance
True
46
Cribl scales best using...
Many medium size Worker nodes
47
Remote Repository Recovery - Overview
1. System Down
2. Install Git on Backup Node
3. Recover configuration from remote repository
4. Restart Leader Node
5. Back Operational :)
48
Setting up and Connecting to Git Hub
1. Set up GitHub
2. Create an empty repository
3. Generate keys to connect Stream to GitHub (public key > GitHub / private key > Stream)
4. Configure the Stream UI to connect to the remote Git repository
5. Once connected, each time a change is committed locally, sync it with the remote repository
49
When using this command to generate SSH public and private keys: ssh-keygen -t ed25519 -C "your_email@example.com", which file contains the public key?
id_ed25519.pub
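For context, the full flow with ssh-keygen's default output path (a property of ssh-keygen itself, not something stated on the card):

ssh-keygen -t ed25519 -C "your_email@example.com"
# by default writes ~/.ssh/id_ed25519      (private key - stays with Stream)
# and writes       ~/.ssh/id_ed25519.pub   (public key - pasted into GitHub)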
50
A remote repository on GitHub is a mandatory requirement when installing Cribl Stream
False
51
A remote Git instance is
Optional for all Stream Deployments
52
What are the methods to backup Cribl Leader Node?
Rsync
Tar / untar
Copy configuration files to S3, rehydrate configuration files from S3
53
Git and GitHub provide backup and rollback of Cribl Stream configurations
True
54
Cribl Stream fault tolerance requires the use of a remote Git repository
True
55
What is a true statement about GitHub accounts?
Requires manual configuration outside of Cribl Stream configuration
56
Stream disaster recovery requires a dedicated standby backup Leader
False
57
Which Git commands are part of the recovery steps?
git init
git fetch origin
58
What is the purpose of using Git?
To provide a backup of configuration files
To provide a history of changes within Stream
59
./cribl help -a
Displays a list of all the available commands
60
Common Cribl Stream commands
./cribl start
./cribl stop
./cribl restart
./cribl status (shows Stream status)
./cribl diag (manages diagnostic bundles)
61
Cribl Stream CLI
The CLI gives you the ability to run commands without needing access to the GUI
Helps in creating automated scripts if needed
Gives you the ability to run diagnostics and send them to Cribl Support
62
What command is used to configure Cribl Stream to start at boot time?
boot-start
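As a later card in this deck notes, the typical invocation under systemd looks like:

./cribl boot-start enable -m systemd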
63
What format are the diag files in?
.tar.gz
64
What does the 'cribl diag create' command do?
Creates a gzip file with configuration information and system state
65
What command is used to configure Cribl Stream as a leader?
./cribl mode-master
66
Once you run 'cribl boot-start enable -m systemd', you will need to use what command to start/stop Stream?
systemctl start cribl
67
The configuration files created with the diag command are in .js format?
False
68
You cannot export packs using the command line
False
69
What types of files are in the diagnostic file?
Files in the local directory
Log files
State of the system
Details about the system running Stream
70
You can use the 'mode' command to configure a Cribl Stream instance into a Cribl Edge Node?
True
71
72
You cannot install Packs using the CLI
False
73
# Troubleshooting Source Issues What is the status of the source?
Sources will have a red status on Leader until they are deployed to a worker group. Status can still be red if there are binding issues
74
# Troubleshooting Source Issues If you do a live capture on the Source, are there any events?
Make sure JavaScript filter set for the live capture is correct. If no data is returned, the problem is likely with the network or further upstream
75
# Troubleshooting Source Issues Is the Source operational/reachable?
Ping the server. Use the nc or telnet command to test the connection to the source.
76
# Troubleshooting Source Issues Is the Destination triggering backpressure?
Check by going to the Destination in Monitoring>Destinations and clicking on Status. If the Source is connected via a Route to a Destination that is triggering backpressure, set to Block to stop sending data.
77
# Troubleshooting Source Issues Check Source config
Typos? Proper authentication?
78
# Stream Sources Summary
Stream can accept data pushed to it, or pull data via API calls
Open protocols, as well as select proprietary products, are supported
Pulling data falls into two categories:
* Scheduled pulls for recurring data (think tailing a file)
* Collector jobs intended for ad hoc runs, as in a Replay scenario
Push Sources push to us, such as Splunk or TCP
Internal sources are internal to us, such as Datagens or internal logs/metrics
Low-code interface eases management
Capture sample data at any stage to validate and test
79
# Stream Syslog Sources Stream Syslog Sources Summary
Stream can process a syslog stream directly
Moving to Cribl Stream from existing syslog-ng or rsyslog servers fully replaces those solutions with one that is fully supported and easily managed
Optimize syslog events
Syslog data is best collected closest to the source
Use a load balancer to distribute load across multiple Worker Nodes
Reduce management complexity while ensuring reliable and secure delivery of syslog data to chosen systems
80
Configuring Elastic Beats
Beats are open-source data shippers that act as agents. Most popular with Cribl customers:
Filebeat - filebeat.yml
Winlogbeat - winlogbeat.yml
81
Change control is built into the system via Git
True
82
Users are independent Cribl Stream objects that you can configure even without RBAC enabled
True
83
URL of the Elastic server that will proxy non-bulk requests
Proxy URL
84
While Splunk Search collector is a powerful way to discover new data in realtime, you should update the Request Timeout Parameter to stop the search after a certain period of time to avoid...
Having the collector stuck in a forever running state
85
Senders with load balancers built in include:
Elastic Beats
Splunk Forwarder
86
When considering Filebeat, to ensure data is received at Stream, change the filebeat .yml to
'setup.ilm.enabled: false'
87
If Stream receives an event from Elastic Beats, we can deliver the event to
Any destination
88
Roles are a set of permissions
False
89
Cribl Stream ships with a Syslog Source in_syslog, which is preconfigured to listen for
Both UDP and TCP traffic on Port 9514
90
All syslog senders have built-in load balancing
False
91
Review of Collectors
Stream Collectors are a special group of inputs that are designed to ingest data intermittently rather than continuously. Collectors can be scheduled or run ad hoc. (The supported data types are listed on the next card.)
92
Cribl Stream Collectors supports the following data types:
Azure Blob
Google Cloud Storage
REST
S3
Splunk Search
Health Check
Database
File System
Script
93
# Collectors in Single Deployments When a Worker node receives the job:
-Prepares the infrastructure to execute a collection job -Discovers the data to be fetched -Fetches the data that match the run filter -Passes the results either through the Routes or into a specific Pipeline
94
# Collectors in Distributed Deployments In a distributed deployment, collectors are configured per Worker Group (within the Leader)
-The Worker Node executes the tasks in their entirety
-The Leader Node oversees the task distribution and tries to maintain a fair balance across jobs
-Cribl Stream uses "Least-In-Flight Scheduling"
-Because the Leader manages Collectors' state, if the Leader instance fails, the Collection jobs will fail as well.
95
Worker Processes
A Worker Node can have multiple worker processes running to collect data. Since the data is spread across multiple worker processes, an alternative like Redis is required to perform stateful suppression and stateful aggregation
96
Discovery Phase
Discovers what data is available based on the collection settings
97
Collection Phase
Collects the data based on the settings of the discovery phase
98
Workers will continue to process in flight jobs if the Leader goes down.
True
99
If skippable is set to yes, jobs can be delayed up to their next run time if the system is hitting concurrency limits.
True
100
Worker Nodes have
Multiple processes that process data independently
101
Worker Nodes keep track of state when processing data?
False
102
What happens after the Worker Node asks the Leader what to run?
The Leader Node sends work to Workers based on previous distributions of work.
103
Workers will stop processing collector jobs that are currently running if the Leader goes down
False
104
Filesystem collectors and Script collectors can only run in an on-prem Stream environment
True
105
What are the ways you can run a collection job?
Scheduled or AdHoc
106
The following collectors are available in Cribl Cloud
S3 Collector and REST Collector
107
You can run a scheduled collection job in preview mode
False
108
Streaming Destinations
Accept events in real time
109
Non-streaming Destinations
accept events in groups or batches
110
Configuring Destinations
For each Destination type, you can create multiple definitions, depending on your requirements. Backpressure behavior options include Block, Drop, and Queue.
111
# Value of Destinations Support for many destinations
Not all data is of equal value. High volume low value data can be sent to less expensive destinations
112
# Value of Destinations Send data from the same source to multiple destinations
1. Simplify data analytics tools migration 2. Store everything you may need in the future, analyze only what you need now
113
# Value of Destinations No extra agents required
Data collected once can be sent to multiple destinations without extra operations cost to run new agents
114
# Value of Destinations Integrations with common destinations
1. Quick time to value 2. Operations cost reduction
115
# Value of Destinations Live data capture shows what's sent to destinations
Reduce troubleshooting effort
116
# Value of Destinations Persistent Queue
1. Minimize data loss 2. Eliminate/minimize the need to introduce separate buffering/queueing tools
117
Multiple Splunk Streaming Destinations
Splunk Single Instance - Stream data to a single Splunk instance
Splunk Load Balanced - Load balance the data it streams to multiple Splunk receivers (indexers)
Splunk HEC - Can stream data to a Splunk HEC (HTTP Event Collector) receiver through an event endpoint
118
# Splunk Destinations Tips Enabling Multi-Metrics
Multi-metrics is data sent in JSON format which allows for each JSON object to contain measurements for multiple metrics. Takes up less space and improves search performance
119
# Splunk Destinations Tips Adjusting timeouts and Max connections
Adjust timeout settings for slow connections. Increase request concurrency based on HEC receivers
120
# Splunk Destinations Tips _raw Fields and Index-Time Fields in Splunk
-Everything that is in _raw is viewable as event content
-Outside of _raw is metadata, which can be searched with tstats or by using :: instead of =
-Fields outside of _raw are viewed when the event is expanded
-If events do not have a _raw field, they'll be serialized to JSON prior to sending to Splunk
121
# Splunk Destinations Summary
-Cribl Stream can send data to Splunk using a variety of different options -Data can be sent securely over TLS -Enabling multi-metrics can save space and perform better
122
Elastic Destinations
Bulk API - Performs multiple indexing or delete operations in a single API call
123
# Elastic Destinations Data Structure Best Practice
Put all fields outside of _raw; use JSON
124
Elastic Data Stream
1. Create a policy > an index template
2. Each data stream's index template must include a name or wildcard pattern, the data stream's timestamp field, and the mappings and settings applied to each
3. Source for data stream
4. Destination for data stream
5. Support for ILM
125
# Elastic Destinations Key Use Cases
-Route data from multiple existing data sources or agents -Migrate data from older versions -Optimize data streams and send data in the right form to Elastic
126
Splunk > Elasticsearch
Step 1: Configure Splunk Forwarder Step 2: Configure Splunk Source in Stream Step 3: Configure Elasticsearch Destination Step 4: Configure Pipeline (regex extract function, lookup function, GeoIP function) Step 5: Results
127
Destination: Amazon S3
Stream does NOT have to run on AWS to deliver data to S3
128
# Destination S3 Partitioning Expression
Defines how files are partitioned and organized - Default is date-based
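As an illustration, a date-based Partitioning Expression is typically a JavaScript template literal along these lines (verify the exact default in your Stream version; appending sourcetype is an example addition, not part of the default):

`${C.Time.strftime(_time ? _time : Date.now() / 1000, '%Y/%m/%d')}/${sourcetype}`   // yields prefixes like 2024/05/17/access_combined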
129
# Destination S3 File Name Prefix Expression
The output filename prefix - defaults to CriblOut. Use only with low-cardinality partitions, and understand the impact on open files & AWS API calls
130
# Destination S3 Cardinality
= Max unique values = the number of staging sub-directories or S3 bucket prefixes
131
Cardinality too high?
When writing to S3 - too many open files and directories on Worker Nodes
When reading from S3 - less chance of hitting S3 read API limits
132
# Destination S3 Cardinality too Low?
When writing to S3 - bigger files written to fewer directories in S3
When reading from S3 - less filtering ability during replays, more data downloaded so larger data access charges, larger chance of hitting S3 read API limits
133
Cardinality General Guidance
Plan for cardinality of no more than 2000 per partition expression
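To make the 2000 guideline concrete (field names and counts here are hypothetical, not from the course):

`${C.Time.strftime(_time, '%Y/%m/%d')}/${sourcetype}`           // ~30 days x ~20 sourcetypes = ~600 partitions, under the limit
`${C.Time.strftime(_time, '%Y/%m/%d')}/${sourcetype}/${host}`   // adding a few thousand hosts multiplies far past 2000, driving up open files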
134
Stream to Stream
Sending data from Stream Worker to Stream Worker, not Worker to Leader
135
Internal Cribl Sources
Receive data from Worker Groups or Edge Nodes
A common pattern: a customer-managed (on-prem) Worker sends data to a Worker in Cribl.Cloud
Internal Cribl Sources treat internal fields differently than other Sources
136
Internal Cribl Destinations
Enables Edge nodes, and/or Cribl Stream instances, to send data to one or multiple Cribl Stream instances Internal fields loopback to Sources
137
Stream Best Practices
-For maximum compression, it is best to change the data to JSON format -Internal Cribl Destinations must be on a Worker Node that is connected to the same leader as the internal Cribl Source -For minimum data transfer, process data on source workers instead of destination workers -For heavy processing, process data on destination workers
138
When setting up an S3 destination the file name prefix expression:
Can negatively impact both read and write API count
Can dramatically increase the number of open files
Generally avoid unless you've done your due diligence and have low-cardinality partition expressions
All of the above
139
It is not recommended to enable Round-Robin DNS to balance distribution of events between Elasticsearch cluster nodes
False
140
What are two benefits of a worker group to worker group architecture?
Compressing data and reducing bandwidth
Reducing Cloud provider egress costs
141
For heavy processing, a recommendation best practice is to process data on
Destination workers
142
When tuning settings for an S3 destination, a good way to avoid any "too many open files" errors is to decrease the number of max open files.
False
143
Which of the following allows you to configure rules that route data to multiple configured Destinations?
Output Router (Parquet is an output data format, not a routing mechanism)
144
Which is an ideal scenario for worker group to worker group architecture?
Capturing data from overseas sources that is destined for local destinations
Reducing the number of TCP connections to a destination
Capturing data from a cloud provider and shipping it to an on-prem destination to avoid egress costs
All of the above
145
With Exabeam, it is important to figure out what syslog format/content needs to be in place
true
146
What are the two main considerations for S3 Destinations?
Cardinality of partition and file name expressions Max open files on system
147
Stream S3 destination setting raw means
Less processing, smaller events, no metadata
148
Routes
-Allow you to use filters to send data through different pipelines. -Filtering capabilities via JavaScript expression and more control -Data Cloning allows events to go to subsequent route(s) -Data Cloning can be disabled with a switch toggle
149
# Routes Dynamic Output Destinations
-Enable expression > Toggle Yes
-Enter a JavaScript expression that Stream will evaluate as the name of the Destination
150
# Routes Final Toggle
Allows you to stop processing the data depending on the outcome. If an event matches the filter, and toggle is set to Yes, those events will not continue down to the next Route. Events that do not match that filter will continue down the Route
151
# Routes Final Flag and Cloning
-Follow "Most Specific First" when using cloning -Follow "Most General First" when not using cloning -At the end of the route, you will see the "endRoute" bumper reminder
152
# Routes Unreachable Routes
Route unreachable warning indicator: "This route might be unreachable (blocked by a prior route), and might not receive data." Occurs when all three conditions match:
-Previous Route is enabled
-Previous Route is final
-Previous Route's filter expression evaluates to true
153
# Routes Best Practices
Filter early and filter fast! You want to quickly filter out any data you do not want to process
154
# Routes Best Practices continued
-Certain JavaScript string operators run faster than others
-Each of these functions operates similarly, but slightly differently:
-indexOf, includes, and startsWith use strings as their function parameter
-match, search, and test use regular expressions
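A rough illustration of the two families (the field name is hypothetical; both lines test whether a value begins with "pan"):

sourcetype.indexOf('pan') === 0        // string-parameter family: indexOf, includes, startsWith
/^pan/.test(sourcetype)                // regex-parameter family: match, search, test

String-parameter methods are generally the cheaper choice when a literal substring check is all you need.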
155
# Routes Best Practices: Most Specific/Most General
Most General: If cloning is not needed at all (all Final toggles stay at default), then it makes sense to start with the broadest expression at the top, so as to consume as many events as early as possible
Most Specific: If cloning is needed on a narrow set of events, then it might make sense to do that upfront, and follow it with a Route that consumes those clones immediately after
Object Storage (S3 buckets): Since most data going to object storage is being cloned, it is best to put routes going to object storage at the top
Filter on common fields: filter on fields like inputId and metadata fields, rather than _raw.includes
156
You created a QuickConnect against a source and now you want to create a route against a subset of that source's events - to a different destination. What are the steps you need to take?
Navigate to the Source. Go to 'Connected Destinations'. Click on 'Routes' to revert to using them instead of QuickConnect. Create 2 routes: one to replace the old QuickConnect that was deleted, and a new route with a filter to map to the events of interest.
157
Both QuickConnect and Routes can be used against the same source.
False
158
What's the general rule for having a performant system?
Filter early and filter fast!
159
Which is true?
-Routes have drag and drop capabilities to connect a source to a destination; QuickConnect doesn't (FALSE)
-QuickConnect has advanced capabilities for assigning pre-processing pipelines to a source and post-processing pipelines to a destination (FALSE)
-QuickConnect does not allow mapping a Pack between sources and destinations (FALSE)
-Routes map to a filter; QuickConnect maps a source to a destination (TRUE!!!!)
160
Which is the most performant JavaScript function?
indexOf
161
Which is a good use case for QuickConnect?
-Stream Syslog Source receiving events from hundreds of device types and applications (NOOOOOOOO) -Stream Splunk Source receiving events from Windows and Linux hosts with Splunk Universal Forwarders (NOOOOOO) -REST API Collector polling Google APIs with JWT authentication (NOOOOOO) -Palo Alto devices sending to a dedicated Stream Syslog Source mapping to a different port than other syslog events (YESSSSS)
162
163
Filter Expressions
Filter Expressions are used to decide what events to act upon in a Route or Function. Uses JavaScript language
164
Value Expressions
typically used in Functions to assign a value. Uses JavaScript language
165
There are 3 types of expressions
-Assigning a Value
-Evaluating to a Value
-Evaluating to true/false
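A quick contrast of the two main kinds, using hypothetical fields:

Math.round(bytes / 1024)                          // value expression: assigns/returns a value (e.g. in an Eval Function)
source.endsWith('.log') && level === 'error'      // filter expression: evaluates to true/false (e.g. on a Route)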
166
Filter Expressions Usage
Filter Expressions can be used in multiple places:
-Capture
-Routing
-Functions within Pipelines
-Monitoring Page
167
# Special Use Expressions Rename Function - Renaming Expression
name.toLowerCase(): any uppercase characters in the field name get changed to lowercase
name.replace("geoip_src_country", "country"): this is useful when JSON objects have been flattened (as in this case)
168
Filter Expression Methods
Expression methods help you determine true or false. Commonly used methods:
.startsWith: returns true if a string starts with the specified string
.endsWith: returns true if a string ends with the specified string
.includes: returns true if a string contains the specified string
.match: returns an array containing the results if the string matches a regular expression
.indexOf: returns the position of the first occurrence of the substring
169
Cribl Expressions Methods
Cribl Expressions are native methods that can be invoked from any filter expression. All methods start with C. Examples: C.Crypto or C.Decode
170
What operators are available to be used in Filter Expressions?
&& || ()
171
The Filter Expression Editor allows you to
Test your expression against sample data
Test your expression against data you have collected
Test your expression against data to see if it returns true or false
Ensure your expression is written correctly
172
Filter Expressions are only used in Routes
False
173
Select all the Fitler Expression operators you can use
">" "<" "==" "!=="
174
Filter Expressions can be used in the following places
Functions within Pipelines Routes Monitoring Page Capture Page
175
You can combine two Filter Expressions
True
176
What is the difference between using "==" or "==="
"==" checks that the value is equal but "===" checks that the value and type are equal
177
You can use .startsWith and .beginWith in filter expressions
False
178
Pipelines
Pipelines are a set of functions that perform transformations, reduction, enrichment, etc.
179
Benefits of pipelines
-Can improve SIEMs or analytics platforms by ingesting better data -Reduce costs by reducing the amount of data going into a SIEM -Simplifies getting data in (GDI)
180
Pipelines are similar to
Elastic LogStash
Splunk props/transforms
Vector Programming
181
Types of Pipelines
Pre-Processing - Normalize events from a Source
Processing - Primary pipeline for processing events
Post-Processing - Normalize events to a Destination
182
# Type of Pipelines Pre-Processing
This type is applied at the Source
Used when you want to normalize and correct all the data coming in
Examples:
-Syslog Pack pre-processing all syslog events coming from different vendors; specific product packs/pipelines can then be mapped to a route
-Microservices pack pre-shapes all k8s, docker, container processed logs
-Specific application pipelines/packs can then be mapped to routes
183
# Types of Pipelines Processing Pipelines
The most common use of pipelines; you can associate a pipeline to Routes using filters
184
# Types of Pipelines Post-Processing
Maps to Destinations Universally post-shape data before it is routed Examples: -Convert all fields to JSON key value pairs prior to sending to Elastic -Convert all logs to metrics prior to sending to Prometheus -Ensure all Splunk destined events have the required index-time fields (index, source, sourcetype, host)
185
# Pipelines Best Practices!
-Name your pipeline and the route that attaches to it similarly
-Create different pipelines for different data sets. Creating one big pipeline can use substantially more resources, become unmanageable, and look confusing and complicated
-Filter early and filter fast!
-Do not reuse pipelines. Do not use the same pipeline for both pre-processing and post-processing; it can make it hard to identify a problem and where it stems from
-Capture sample events to test. Allows you to visualize the operation of the functions within a pipeline
-Test! Use a data set to test and validate your pipeline
-Use statistics. Use Basic Statistics to see how well your pipelines are working
-Pipeline Profiling - determine the performance of a pipeline BEFORE it is in production
186
You should create different pipelines for different data sets
True
187
Pipelines contain Functions, Routes and Destinations
False
188
Stream Functions Overview
-Functions act on received events and transform the received data to a desired output
-Stream ships with several functions that allow you to perform transformations, logs-to-metrics, reduction, enrichment, etc.
-Some expressions use JavaScript
-For some functions, knowing regex will be required
189
5 Key Functions
Eval, Sampling, Parser, Aggregations, Lookup
190
# Types of Functions Eval
Evaluate fields - adds or removes fields from events
Keep and Remove Fields - Keep fields take precedence over Remove fields
191
# Types of functions Parser
It extracts fields out of events, or can be used to manipulate or serialize events
192
# Types of Functions Parser
Types:
CSV - splits a field containing comma-separated values into fields
Delimited Values - similar to CSV, but using any delimiter
Key=Value Pairs - walks through the field looking for key=value pairs and creates fields from them
JSON Object - parses a full JSON object into fields
Extended Log Format - parses a field containing an Apache Extended Log Format event into fields
Common Log Format - parses a field containing an Apache Common Log Format event into fields
193
# Types of Functions Lookup
Enriches your events from other data sources. Performs lookups against fixed databases such as CSV, CSV.GZ
There are three match modes: Exact, CIDR, Regex
Three match types for CIDR and Regex: First Match, Most Specific, All
GeoIP: performs lookups against fixed databases like MMDB (MaxMind)
DNS Lookup: performs DNS queries and returns the results
Redis: supports the entire Redis command set
194
# Types of Functions Lookups - Things to look out for
-Exact match will be case sensitive -Results will be added as fields in the event -Order your lookup from most specific to least -Create efficient regex -For DNS enrichment, use local caching DNS
195
# Types of Functions Aggregations
allows you to apply statistical aggregation functions to the data to generate metrics for that data
196
# Aggregation Functions Avg()
Returns the average value of the specified parameter. For example, if the parameter is a field that contains the number of bytes in, say, a firewall transaction, avg() will return the average number seen in the time window
197
# Aggregation functions median()
Will similarly return the median (the "middle" number of the sorted values of the parameter within the time window)
198
# Aggregation functions min() and max()
each returns the minimum or maximum value, respectively, of the parameter within the time window
199
# Aggregation functions perc()
returns the specified percentile of the values of the specified parameter
200
# Aggregation functions per_second()
Returns the rate at which the different values of the parameter occur in the time window
201
Aggregate function tips
Stream is a shared-nothing architecture; since data is spread across Worker Processes, each process aggregates independently (see the Worker Processes card about using Redis for stateful aggregation).
202
# Types of Functions Sampling
Samples events at a specified rate (e.g. keeping 1 of every N matching events) as they pass through Stream, reducing volume
203
# Types of functions Mask
Mask/Replace/Redact patterns and events. helpful for masking personal information
204
# Types of functions Regex Extract
Extract using regex named groups
205
# Types of Functions General purpose
Eval, parser, drop, aggregations, rename
206
# Types of functions Enrichment
Lookup, DNS lookup, GeoIP, Redis
207
# Types of functions Statistical
Dynamic sampling, publish metrics, rollup Metrics
208
# Types of functions Advanced
Chain, Clone, Code, Event Breaker, JSON Unroll, Tee, Trim Timestamp, Unroll, XML Unroll
209
# Types of functions Formatters
CEF serializer, flatten, serialize
210
Functions Best Practices
-Use typeahead to get a list of functions you can use in JavaScript
-You can use tooltips to get help on most fields in the UI by clicking the question mark
-Add comments and descriptions to your functions in order to explain what is happening
-Function groups allow you to group a set of functions together
-Use the three dots to access additional functions in a pipeline
211
The Parser command can extract fields from the following data types
CSV
Delimited values
JSON Object
(not SQL - it cannot parse this)
212
Which function allows you to create metrics out of any data set?
Aggregations
213
The Mask function allows you to replace data in events
True
214
Which functions allows you to de-duplicate events as they pass through Stream?
suppress
215
You can add or remove fields using JavaScript expressions with the Parser function
True
216
Which function allows you to easily extract fields out of events?
Parser
217
Which function can add or remove fields from events?
Eval
218
Which functions allows you to enrich your events from other data sources?
Lookup
219
The sampling function allows you to get samples of data for testing purposes?
False
220
Lookups cannot be used to enrich data
False
221
Stream Packs Overview
Packs let you Pack up and share Stream configurations and workflows across Worker Groups, or across organizations
222
What is in a Pack?
Packs contain everything between a Source and a Destination
223
What is not in a Pack?
Sources, Source Event Breakers, Collectors, Destinations, Knowledge Objects
224
Packs
Make them useful for the community. Include sample files and lookups to ensure the community can test your Pack. Make them reusable. Make sure you include details on how to configure any relevant Sources and Destinations
225
Good Pack Standards
-Start names with cc- for community Packs. Use all lowercase letters. Use dashes to separate words
226
Pack - Best Practices
-There is no concept of a Local directory inside the Data directory -Changes to Pack will create a local copy of that change -Local always wins over default -Making changes to routes will create a local version of route.yml
227
Packs - Best Practices: Deleting Defaults
-Never delete anything in the default folder -If you delete items in default, they will reappear when you reload configs or restart the leader -Workaround: Untar the pack in the CLI, carefully delete things and update the appropriate references in the files, tar up the contents of the Pack from within the pack folder
228
Packs - Best Practices: Updating Knowledge Objects
-Never modify Knowledge objects that ship with the pack -If you modify any knowledge object that ships with the Pack, it will be overwritten. This includes lookups, etc. -Workaround - create a new knowledge object, any new knowledge object will not be overwritten
229
Packs - Best Pratices: Cannot see updates
-Pack was updated but you cannot see any new updates or new features
-Since local has a higher preference, you will not see any of the new updates that are in default
-Workaround: delete and install the new Pack; import the updated Pack; import the Pack with a new ID each time you install a Pack update; merge local changes from the older Pack into the newer Pack
230
Packs - Best Practices: Deleting Pack Routes
-Do not delete routes in a pack -you deleted all the routes in a pack and reinstalled the pack but the routes do not return -Workaround: delete the pack, restart the leader, reinstall the pack again
231
Packs - Best Practice: Tips and Tricks
-review the README to understand Pack updates -Import the Pack with a separate/unique ID to see the new updates -Exporting a Pack with the merge option selected will overwrite defaults and will merge any local changes -The Cribl Knowledge Pack is a great way to learn more advanced functions in Stream
232
In a distributed deployment, Packs are distributed to the worker group level
True
233
Packs can be imported using which of the following ways
-import a file -import from a URL -import from Git -import from https://packs.cribl.io
234
All Packs that are created will automatically be shared with the community
False
235
Packs can....(select all that apply)
-Enable plug and play deployments for specific use cases -Improve time to value by reducing hurdles and providing Cribl Stream users with out of the box pipelines -Target users in medium/large deployments sharing configurations and content across multiple worker groups
236
Without packs, an administrator must do all pipeline configuration manually
True
237
Users are allowed to create packs and can share them with the community, if applicable
True
238
You can find existing Cribl Packs by searching https://packs.cribl.io
True
239
What are Packs?
Pre-built configuration blocks designed to simplify the deployment and use of Cribl Stream
240
Which is the best answer for how packs are created?
-Cribl creates packs and makes them available for Cribl Stream users -Partners and Users can create packs and make them available for Cribl Stream users -Downloaded packs can be edited for specific needs and then shared -ALL OF THE ABOVE IS CORRECT
241
When exporting a Pack, what are the three export mode options?
Merge safe, Merge, and Default only
242
Stream Replay Overview
-Route data to cheap storage, Replay it back later -Search and Replay only the data you need -Send the Replayed data to any destination
243
Object Store vs Alternatives
Recommendation: Use Object Store
Cost: Object Store is 70-95% cheaper than alternatives
Metadata and searchability: Searching Object Store is a top choice for high volumes of data. Searching file storage is more appropriate for lower volumes of data
Volume: For high volumes of data, object or block storage are best
Retrievability: Data is relatively retrievable from all three types of storage, though file and object storage are typically easier to access
Handling of metadata: typically best served by object storage
244
# Replay Worker Group
Recommendation: Use a dedicated Worker Group
No impact on production Worker Nodes: use a dedicated Worker Group to process large amounts of historical data and avoid impact on other workloads
Egress: place the Worker Group in the same Cloud provider as the Object Store (S3) and Destination
Dynamic Scaling: if possible, use Dynamic Scaling, for example in Kubernetes
245
# Replay Partitions
Recommendation: The Partitioning Expression on the Destination should be the same as the Partitioning Expression on the Collector
246
# Replay Enable User Friendly Replays
Recommendation: Enable user friendly replays
247
# Replay Search
Recommendation: Use Partitioning Expression in Search. Do not use content from within the events
248
# Replay Destination
Recommendation: Use a field to mark the data you want to Replay. Send Replayed data to any destination
249
Replay Summary
Replay means jumping into critical logs, metrics and traces as far back in time as you want, and saying "let's see that again."
Keep more data for longer retention periods and pay a lot less
Replay data to any analytics tools for unexpected investigations
Improve the quality and speed of your analytics environment by saving older data somewhere else
Using Object Store (S3) is the most effective storage
250
Cribl recommends using a dedicated worker group to process your replay data
True
251
An AWS S3 Key Prefix is the same thing as a Cribl S3 Key Prefix
False
252
Which is a use case for routing data to an Object Storage? Select all that apply
-Reducing Analytics tool or SIEM spend
-Making data available for other solutions
-Replaying historical data for a threat hunting exercise
-Replaying debug logs for a troubleshooting event
Correct answer: all of the above!
253
Replay data should be sent to a dedicated index if the destination is Elastic or Splunk
True
254
Cribl recommends using production worker groups to process your replay data
False
255
To make it easier to identify events that have been replayed...
Use a unique Index name
256
When considering replay, which of the following are best served by using an object storage. Select all that apply
Retrievability
Handling of metadata
Cost
(NOT Permissions)
257
Which of the following can be used when Cribl Replays data from an Object Storage?
Partitioning Expression filtering
File Name Expression filtering
258
For Replay to work, you must put all data in JSON format
False
259
# Cribl Edge Why at the Edge?
-The edge is where we see the most data being generated -Use data directly from the edge without having to move it
260
Installing Cribl Edge Nodes
-Able to install on Docker, Kubernetes, Linux, and Windows Servers -To install, go to Manage > Edge Nodes > Add/Update Edge Node -Provides customizable scripts for each operating system
261
# Cribl Edge Kubernetes Sources
-Kubernetes Logs (collects container logs and system logs from containers on a Kubernetes Node)
-Kubernetes Events (collects cluster-level events from a Kubernetes Cluster)
-Kubernetes Metrics (collects events periodically based on the status and configuration of the Kubernetes cluster)
262
# Cribl Edge Linux Sources
-System Metrics (collects metrics data including messages from CPU, Memory, Network, and Disk) -Journal Files (centralized location for all messages logged by different components in a systemd-enabled system)
263
# Cribl Edge Windows Sources
-Windows Event Logs (collects standard event logs, including Application, Security, and System logs) -Windows Metrics (collects metrics data from Windows hosts)
264
# Cribl Edge Cribl HTTP and Cribl TCP destinations
-Enable Edge Nodes to send data to peer Nodes connected to the same Leader -Cribl HTTP (best suited for: Distributed deployments with multiple workers. Use of load balancers. Valuable in hybrid cloud deployments.) -Cribl TCP (best suited for: medium size deployments. All on prem. Valuable in certain circumstances)
265
# Cribl Edge Cribl HTTP and Cribl TCP continued
-HTTP/TCP Destination must be on Edge Node connected to the same Leader as HTTP/TCP Source -Must specify same Leader Address on Edge Nodes that host Destination and Source -To configure Leader Address via UI > log into Edge Node's UI -Destinations Cribl endpoint must point to peer Address and Port of Source -When configuring hybrid workers, Edge Nodes that host Destination / Source must specify exact same Leader Address
266
# Cribl Edge Setting Up Edge to Stream
1) Configure a Cribl Source on Stream to receive data from the Edge Node
2) Configure a Destination on Edge to send data to Stream
3) Configure a Route to send your data to Stream
267
# Cribl Edge Summary
-Deploy to a variety of machines using provided scripts (ability to deploy to a wide variety of systems including Linux servers, Windows servers, Docker containers and Kubernetes)
-Capture sources from a wide variety of systems (built-in sources allow for quick and easy configuration to gather the data you need)
-Combine with Cribl Stream (when using Edge with Stream, you unlock the power of Stream by using Workers to process the data)
268
What is AppScope?
-Open source, runtime-agnostic instrumentation utility for any Linux command or application
-Offers APM-like, black-box instrumentation of an unmodified Linux executable and application
-Interposes itself between applications and shared libraries and system calls
-Observe applications from the inside, viewing resource consumption, filesystem traffic and network traffic including clear-text payloads
269
# AppScope Data Routing
-AppScope gives you multiple ways to route collected data. The basic operations are: -in a single operation, you can route both events and metrics to Cribl Edge, default configuration -You can also route both events and metrics to Cribl Stream, local instance or in the Cribl.Cloud -Support routing events and metrics to a file, a local Unix socket or any network destination, in addition to Cribl Edge and/or Stream
270
# AppScope Installing AppScope
-Go to Cribl.io, download from the top menu, download your preference. -Installing: Load and execute via CLI, done and ready to start working
271
# AppScope Configuring AppScope
scope.yml is the sole library configuration file for AppScope. Environment variables override configuration settings
272
# AppScope Using AppScope: Scoping your first command
Start scoping - the most basic command: scope /bin/echo
Other commands:
scope metrics
scope events
scope events 0 (gives info on that event)
scope events -j | jq (events in JSON format)
273
# AppScope Tracking Scope History
scope hist (defaults to the last 20)
To scope a specific session, use the ID. Example: scope hist --id 2
274
# AppScope Scoping Applications
'scope perl'
'scope events'
'scope events --id 1 - fs.open' (file system events)
-a says to output all events
-j outputs events as JSON
-jq filters down to just the file names
sort and uniq help us find only the unique filenames opened
275
# AppScope Log Data
bat log.py
scope python3 log.py
276
# AppScope Network Metrics
scope sh -c 'echo "some bytes" | nc -w1 localhost 10001'
scope metrics -m net.tx -m net.duration --cols
277
# AppScope Network Events
scope events -t net
278
# AppScope Network Flows
scope flows
scope flows ir1JM1 (flowID)
279
# AppScope HTTP Events
scope curl -so /dev/null http://localhost/
scope events
280
# AppScope AppScope Graphics
scope metrics --id 1 -g proc.cpu_perc
scope metrics --id 1 -g -m proc.fd
281
# AppScope AppScope Summary
Detailed Telemetry: automatically collects application performance data. Automatically collects log data written by the application
Easy Management: use the CLI when you want to explore in real time, in an ad hoc way. Use the AppScope library (libscope) for longer-running, planned procedures
Platform Agnostic: offers ubiquitous, unified instrumentation of any unmodified Linux executable. Supports single-user or distributed deployments
282
Cribl Edge allows you to run executables and collect the output of the command
True
283
Cribl Edge cannot auto-discover log files on the system
False
284
By using Cribl Stream Leader, you can tell Cribl Edge what files you want to monitor using the GUI
True
285
Cribl Edge does not allow you to see machine metrics such as CPU, Memory, or IO
False
286
AppScope provides a CLI based dashboard to see the status of AppScope
True
287
Cribl Edge allows you to replace your data ingestion agent with a vendor agnostic agent
True
288
AppScope interposes itself between applications and shared libraries and system calls
True
289
AppScope is an open-source, runtime-agnostic instrumentation utility for any Linux command or application
True
290
Which AppScope command allows you to see a history of scoped commands?
scope hist
291
You cannot send AppScope data to Cribl Stream
False
292
# TLS Keys How to Start Using Public Key (RSA) Cryptography
Step 1: A private key (a large prime #) is (always) created first, using a tool like openssl
Step 2: Using the private key, a public key (another large prime #) is created and embedded in a Certificate Signing Request. This requires specifying a minimum set of info: subject's name (CN=), org name, OU, city, state, country, and possibly a subject alternative name (SAN)
Step 3: The CSR is signed, either by its own private key or a CA's key
Step 4: You now have a certificate with a private key
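A minimal openssl sketch of those steps (hostname and filenames are hypothetical; the self-signed one-liner on a later card is shown separately):

openssl genrsa -out myhost.key 2048                                   # Step 1: private key
openssl req -new -key myhost.key -out myhost.csr \
  -subj "/CN=myhost.example.com/O=Example Org"                        # Step 2: CSR carrying the public key + subject info
openssl x509 -req -in myhost.csr -signkey myhost.key \
  -days 365 -out myhost.crt                                           # Steps 3-4: sign (here with its own key) to get a certificate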
293
Things to keep in mind when working with Certs and Keys
-A cert cannot exist without being signed
-The public key (in a signed certificate) can encrypt/verify data
-The private key can decrypt/sign data
-Caveat: the entity possessing the private key may not be the rightful owner
294
Certificate Authorities
-CAs are used to sign Cert Signing Requests
-Public vs Private - depends on needs such as vetting levels, cost, cert visibility
-The first/top-level CA is the root > assertion of trust
-The second CA is a subordinate/intermediate - optional, but best practice
295
Self-Signed Certificates
-Self-signed certificates are not simply ones you sign yourself
-A self-signed cert is simply one signed by the same entity whose identity it certifies
-Every root CA cert is self-signed
-Every self-signed cert is also a root, but not necessarily a CA
-Still provides confidentiality, but authenticity and data integrity are suspect
-CA-signed (public or private) certificates mitigate these issues
-One step further is having the CA root cert deemed a trusted root by applications
296
Levels of trust
Increasing trust as you go down the list:
-Unsigned certs (no such thing)
-Self-signed certs
-Private CA-signed certs
-Public CA-signed certs
-CA-signed certs whereby the CA is deemed trusted
297
Certificate Chains
-Chains exist when a non-self-signed certificate is involved
-Many public CAs use chains to protect their root certs
-Frequently used within organizations handling their own signing
-Validating chains starts at the bottom and moves up the chain to the root: the issuer of each cert matches the subject of the next cert (except for the root); each cert is signed by the private key corresponding to the next cert up the chain (except the root); the last cert (top of the chain) is the trust anchor
298
What is a Cipher Suite?
-Client and server applications are configured with a set of ciphers -Consist of multiple categories of algorithms -Many combinations exist as discrete suites -SSL/TLS versions have cipher suites associated with them -When a TLS version is released, new ciphers may be provided -Old ciphers can be deemed insecure > deprecated
299
Components of a cipher suite
1. Protocol: TLS in this example
2. Key Exchange: during the handshake, the keys will be exchanged via ephemeral Elliptic Curve Diffie-Hellman (ECDHE)
3. Authentication: ECDSA is the authentication algorithm
4. Bulk Encryption: AES_128_GCM (symmetric), specifically AES in Galois/Counter Mode using a 128-bit key size
5. Hash: SHA-256
300
Working with Certificates Summary
-Asymmetric encryption will be important any time you are looking to encrypt data from sources/destinations to most modern applications, including Stream -PKI involves a public key used to encrypt data and a private key used to decrypt the public key encrypted data -Certificates can be self-signed or signed by a Certificate Authority, self signed can be used for internal to internal encryption
301
What type of encryption utilizes a public/private key pair?
Asymmetric
302
A self signed certificate has a higher level of trust than a public CA signed certificate
False
303
(Select all that apply) TLS utilizes
Symmetric encryption and Asymmetric encryption
304
What does CA stand for in PKI?
Certificate Authority
305
KMS (Key Management Service) Overview
-Cribl Stream encrypts secrets stored on disk -The keys used for encryption (cribl.secret) are managed by KMS -The keys are unique to each Worker Group + Leader -Encryption key can be managed by Cribl Stream or by an external KMS -Secrets encrypted by the Key: Sensitive information stored in configs and data encryption keys stored as configs
306
Benefits of Using External KMS
-Centralized key management for your organization -Change and access audit -High availability key management options -Minimizing key exposure
307
KMS Options
-Stream Internal is the default KMS. Changing your KMS is not available with the Stream free license
-To get to KMS settings: Settings > Security > Secrets
-A system/Leader key, plus additional keys for each Worker Group
-If HashiCorp Vault or AWS KMS is used, the Leader and Worker Nodes must have network access to the external KMS
308
Setting HashiCorp Vault as the KMS
-Keys are set up separately at the Leader and each Worker Group levels to contain secrets access to the Worker Groups and the Leader -After KMS configuration is performed in Cribl Stream, the specified Secret Path will be created in the Vault
309
KMS Best Practices
-Backup your cribl.secret files before switching to external KMS -Switching from external to internal KMS while the external KMS is not accessible may render your Cribl Stream environment unusable -If an external KMS is used, Leader AND Worker Nodes must have access to the external KMS to operate -Test your KMS configuration in a non-production Cribl Stream environment
310
Once you configure a Worker Group with a KMS system, it will sync with the other Worker Groups
False
311
Where is KMS configured in a distributed Stream environment?
Separately, in the Leader Node and Worker Groups settings
312
Secrets are not encrypted when stored to disk
False
313
(Select all that apply) What external KMS system, does Stream integrate with?
HashiCorp Vault AWS KMS
314
Worker Groups and Leader have a unique set of keys that are used for encryption
True
315
External KMS is required for Stream to function
False
316
Workers configured with external KMS will function if the KMS cannot be reached from the Workers
False
317
Once you configure the KMS system on the Leader, the Leader will push out the configuration to the Workers
False
318
You can use a KMS provider with the free version of Cribl Stream
False
319
If using an external KMS, the Leader and Workers must have access to the external KMS to operate
True
320
# Stream Cert Validations Configuring settings as a TLS server
-Authenticate Client (mutual auth) - if true, the server requests a client cert. Off by default
-Validate Client - clients whose certs aren't authorized (i.e., not signed by the configured/built-in CAs) have their connection denied. Off by default
-Mutual auth enables optional CN validation via regex
321
# Stream Cert Validations Validation checks (by NodeJS) when client/server validation is enabled
-Leaf cert expiration and validation of the CA chain, then
-CN / SAN checks per RFCs
-Only one is checked, regardless of whether it matches; SAN is checked first, if values exist
-IPs are only accepted if they are in both the SAN and Subject attributes
322
Stream Cert Validations
-Stream as a client can validate the remote server certification using Validate server certs toggle -Some destinations (like AWS) allow rejecting unauthorized (example is self-signed certs) -If GUI does not provide a Reject Unauthorized toggle, then a global one can be used (Requires a restart and must be included in systemd unit file)
323
Creating your own certs
generating a self-signed certificate with openssl
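The card names the technique but not the command; one common openssl one-liner for a self-signed cert (the filenames and CN here are hypothetical):

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout myhost.key -out myhost.crt -subj "/CN=myhost.example.com"

The resulting myhost.crt and myhost.key can then be added to Stream's certificate settings.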
324
Configuring Stream Cert & Chain
-For self-signed, simply add the cert to the Certificate field -Preferably, use the CA Certificate field for importing one or more CA certs. Pros: avoids using NODE_EXTRA_CA_CERTS. Cons: not obvious trusted CA certs are associated with this host cert -Sub/root CA certs can be added to the Certificate field
325
Best Practices: Certs in Worker Groups
-Worker nodes should appear identical to external systems -Worker nodes should internally reflect their individuality for better security -API and cluster settings on a node can use the same cert reflecting the worker's name -Subject (CN is hostname) and SAN should be defined -Use the SAN to include all possible names -Manage certs via UI == each worker gets the full cert set
326
Best Practices: Certs in Worker Groups
-Separate (from API/cluster) certs can and should be used (managed via the GUI) for src/dst configs to reflect the Worker Group's FQDN -Two options: a single cert for all Workers, or a different cert on each Worker -The former is more scalable due to wildcards, but validation fails if connecting with IPs -Depending on details (for example, key size), some systems may not accept the configured cert -For both options, a trusted root CA (vs. an internal CA) is preferred and possibly required
327
Single Worker to Leader Traffic (alt. option)
-TLS can also be configured for Worker to Leader comms using the instance yaml file, environment variables, or via the CLI -The yaml config is done via the $CRIBL_HOME/local/_system/instance.yml file, under the distributed section
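An illustrative excerpt of that distributed section, assuming a typical Worker pointing at a Leader (field names and values are a sketch; confirm the exact schema for your version at docs.cribl.io):
# $CRIBL_HOME/local/_system/instance.yml (Worker side)
distributed:
  mode: worker
  master:
    host: leader.example.com
    port: 4200
    tls:
      disabled: false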
328
# Certs Troubleshooting
Logs: -$CRIBL_HOME/log/cribl.log - Certs/TLS errors will be logged here. If Workers are not showing up on the Leader, check the Worker logs for cert errors. -$CRIBL_HOME/local/_system/instance.yml - Contains TLS settings; helpful if the Workers are not connecting. Tools: -openssl s_client -connect host:9000 - Shows details of the certificate being presented on the port; useful to verify the certificate details
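For example, to pull the presented certificate's subject, issuer, and expiry dates in one pass (host and port are illustrative):
openssl s_client -connect leader.example.com:9000 -servername leader.example.com </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates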
329
Certs/TLS Summary
-TLS can be a complicated feature to enable; proper planning and a basic understanding of TLS client/server architecture help -There are multiple places TLS can be used: Worker to Leader, Source to Worker, Worker to Destination, Leader GUI -Have a means to track certificate issuance and expiration -Use the Stream logs to assist in troubleshooting TLS problems
330
The Leader TLS can be disabled/enabled via the CLI
True
331
TLS does not work in containerized environments
False
332
(Select all that apply) With GUI or API access, which components are the server in the client/server model?
-Worker -Leader -NOT Client or Browser
333
A Leader to Worker TLS connection supports Mutual authentication
True
334
Node.JS uses the system certificate store to validate certificates
False
335
# Cribl Stream Projects Configuration Steps
-Configure Cribl Member -Create a Cribl Member user with the correct access to Stream and other products -Provide the new Cribl Member access to their Worker Group -Configure Stream Project -Create a Subscription -Create a Data Project using the Subscription above -Add available destinations to the project -Assign Users -Give a Cribl member permissions to the Stream Project
336
# Cribl Stream Projects Cribl Members
-Provides control over who has access and visibility within Cribl Projects -Complements current authentication methods but will eventually replace them -Settings > Global Settings > Access Management > Members
337
# Cribl Stream Projects Configuring - Worker Group Access
Worker Group > Group Settings > Access Management > Members
338
# Cribl Projects Roles
Admin: Full access. Editor: Can modify resources within the group. Read Only: Read-only access to resources within the group. User: No access unless shared.
339
# Cribl Stream Projects Configuring - Subscriptions
Worker Group > Projects > Subscription
339
# Cribl Stream Projects Configuring - Data Projects
Worker Group > Projects > Data Projects
339
# Cribl Stream Projects Summary
-A Cribl Admin can provide teams/users with specific data without modifying data for other users -Cribl Members provide granular access to Cribl products including Stream, Edge, and Search -Stream Projects enable users to have control over their data by providing granular access to data flowing through Cribl Stream
340
Using Projects.....
the team can share complex Cribl Stream data through the subscription
341
What is a Metric?
-Metrics are a numeric representation of data measured over intervals of time -Metrics can be an incredibly useful and important part of your observability strategy -Many logging systems extract and calculate metrics -Cribl Stream can extract metrics that are not always available
342
Logs to Metrics
-Logs can take up a lot of space and come from multiple systems -Metrics tend to be leaner and faster -Solution: Calculate metrics to send to analytics system, and archive the rest
343
Cribl Stream and Metrics
-Cribl Stream pipelines contain functions to aggregate or transform logs to metrics -Extract data from a log line, convert that data to metrics -Three different functions -Aggregate -Publish metrics -Rollup metrics
344
Cribl can only pass on data as metrics if the data is ingested into Cribl as a metric
False
345
Logs tend to be way leaner in terms of storage requirements compared to metrics
False
346
Metrics are a numeric representation of data measured over intervals of time
True
347
Cribl Stream provides three different pipeline functions (aggregate, publish metrics, and rollup metrics) to use to convert your logs to metrics
True
348
Cribl is not able to enrich metrics before they are sent to their destinations
False
349
Metrics offer better analysis experience and faster performance compared to logs
True
350
What exactly is a Trace?
-Traces represent the end-to-end request flow through a distributed system -The data structure of traces looks almost like an event log -Traces are made up of spans; spans are events that are part of a trace
351
Traces and App Monitoring
-In App Monitoring, traces represent what applications spend time on -Used by app developers to measure and identify least performant calls in code -Trace generation and analysis is often done by APM tools
352
Trace Spans
Each span begins with: traceid, name, id
353
Cribl Stream and Traces
Cribl Stream can receive and route trace data without having to stitch traces, and can remove irrelevant data and create metric data. Cribl Stream can process raw OpenTelemetry data without app-level changes, and can also store raw data indefinitely (for example, in AWS S3)
354
Cribl Stream needs to stitch traces in order to receive and route data
False
355
When Cribl Stream transforms raw Otel data, it is done at the app-level
False
356
Traces are made up of spans
True
357
Each span begins with an index
false
358
TraceID is shared across all spans in the trace
True
359
Traces are used by app developers to measure and identify least performant calls in code.
True
360
Leader Node Logs
-API/main process logs in the $CRIBL_HOME/log/ directory -Config Helper process logs in the $CRIBL_HOME/log/group/GROUPNAME directory
361
Worker Node Logs
-API process logs in the $CRIBL_HOME/log/ directory -Worker Process logs in the $CRIBL_HOME/log/worker/WP#/ directory
362
Who is watching the watcher?
Pro: Easy to use Cribl Stream to send its own logs. Con: If something isn't working, logs might not get sent.
363
Leader Node Logs
-Leader itself doesn't process data, so it can't forward its own logs -You can use any file collection option, such as Elastic Filebeat, Splunk Universal Forwarder, Cribl Edge, etc. -Logs can be collected from the leader via /system/logs API endpoint
365
# Logging Summary
-Logs can be viewed on disk, in the Leader UI, or via forwarding -You have control over logging level and redaction -Forwarding can be convenient but has trade-offs
366
Leader Node logs are located in
$CRIBL_HOME/log
367
Notifications.log contains alerts
False
368
There is a cribl.log for the Leader Node and for the Worker Node
True
369
(Select all that apply) What log files are in Cribl Stream?
-cribl.log -access.log -audit.log -notifications.log
370
Worker Node logs are created in
$CRIBL_HOME/log
371
Access.log contains API calls
True
372
Cribl.log will contain information on bundle deployments
True
373
Worker Process logs are located in
$CRIBL_HOME/log/worker/[wp#]/
374
Worker Nodes will log when they attempt to connect to the Leader Node
True
375
There are logs for the Leader Node and the Worker Node
True
376
# Upgrading Upgrade Sequence
-Single-Instance (upgrade the instance) -Distributed Deployment: Upgrade the leader, then the Workers, Commit and Deploy
377
Preparing for an Upgrade
-Default files will be overwritten (check for modifications and custom functions) -Download package and checksum files if not using CDN
378
Manual Upgrade
Step 1: Stop Stream Step 2: Back up $CRIBL_HOME (optional) Step 3: Uncompress new version over the old one Step 4: Start Stream Step 5: Validate your Stream environment
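A rough shell sketch of those steps, assuming a tarball install under /opt/cribl and a placeholder package name:
/opt/cribl/bin/cribl stop                          # Step 1: stop Stream
tar czf /tmp/cribl-backup.tgz -C /opt cribl        # Step 2 (optional): back up $CRIBL_HOME
tar xzf cribl-<new-version>-linux-x64.tgz -C /opt  # Step 3: uncompress the new version over the old one
/opt/cribl/bin/cribl start                         # Step 4: start Stream
# Step 5: validate via the UI, cribl.log, and a test capture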
379
Distributed Deployment Upgrade
Step 1: Commit and Deploy (git push to the remote repo is optional) Step 2: Upgrade the Leader (stop Stream, back up $CRIBL_HOME, uncompress the new version over the old one, start Stream) Step 3: Upgrade the Worker Nodes (wait for all the Workers to report to the Leader, stop, uncompress the new version over the old one, start Stream) Step 4: Commit the new software version changes (ensure that all Workers have reported with the new version; Commit & Deploy after verifying all Workers are upgraded)
380
Upgrading Leader Node through the UI
Stream Settings > System > Upgrade
381
Cribl Cloud Upgrade: Cribl-Managed Cloud or Hybrid Deployment
-The Cloud Leader and Workers will be automatically upgraded -The "Disable automatic upgrades" setting only applies to customer-managed Workers
382
Upgrading Summary
-An upgrade is an install of a new version over the old -You have the option of manual, UI, or automatic upgrade -UI upgrade of Workers can be done separately for each Worker Group -You can control how each Worker Group is upgraded -Cribl-managed Cloud Leaders and Workers upgrade automatically
383
Worker Nodes will stop processing data while the leader is being upgraded
False
384
During an upgrade, changes to default files will be
Overwritten
385
If you are using a Cribl managed leader in a hybrid environment, all workers will be upgraded automatically
False
386
Worker nodes will report to the leader if they are running a different version
True
387
A possible upgrade sequence is:
Stop > Uncompress > Start
388
For manual upgrade, you can decide to upgrade only a portion of your worker nodes at a given time
True
389
If "disable automatic upgrades" is set to Yes, your cloud leader will not be upgraded
False
390
(Select all that apply) Your options for Package source are
-Cribl CDN -Local path on the server -HTTP URL
391
When performing an upgrade, on-prem workers must be upgraded first
False
392
(Select all that apply) UI upgrade allows for worker nodes to be
-Upgraded after the leader -Automatically upgraded -Upgraded by worker group
393
# Git Without Local Git
-Single instance deployment can run without Git -No change tracking or rollbacks -Mandatory on the leader node for distributed deployments
394
Local Git
-Track configuration changes -Compare configuration versions -Selective commits -Restore previous configuration version
395
# Git Things to keep in mind
-Make your repository private -Use .gitignore to exclude what gets pushed to Git
396
Git Summary
Git -Single-instance: optional -Distributed: mandatory -Diff/Commit/Undo/Rollback Setting up and using a Git remote repository -Make your repository private -Exclude large files
397
Example Workflow with GitOps
1: Make changes in the Development system UI 2: Commit and push changes to remote repository (dev branch) 3: When ready to push changes into Production, create Pull request to move changes from the dev branch to the production branch 4: Merge Pull Request 5: Send notification to Stream to "sync" changes
398
Setting up the Git Repo
-Follow instructions located at docs.cribl.io -Set up remote git repo as normal on dev -Push initial config from dev -Create dev and prod branches
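A rough sketch of that setup from the dev Leader (remote URL and branch names are illustrative):
cd $CRIBL_HOME
git remote add origin git@git.example.com:acme/cribl-config.git
git push -u origin master             # push the initial config from dev
git checkout -b dev && git push -u origin dev
git checkout -b prod && git push -u origin prod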
399
Git Remote Respository Authentication
-Use secure protocols such as HTTPS or SSH -HTTPS uses username/password authentication -SSH uses public/private keys -Ensure your user accounts are scoped for least-privilege access only
400
Keys and Known hosts
-When using SSH, the private key is stored as $CRIBL_HOME/local/cribl/auth/ssh/git.key -SSH uses a known_hosts file located at /home/cribl/.ssh/known_hosts -Import server public keys using the following command (as the cribl user), where <git server hostname> is your Git host: ssh-keyscan -H <git server hostname> >> ~/.ssh/known_hosts
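For example, as the cribl user, with an illustrative Git server hostname:
ssh-keygen -t ed25519 -f ~/.ssh/cribl_git -N "" -C "cribl-leader"   # generate a key pair; add the public key on the Git server
ssh-keyscan -H git.example.com >> ~/.ssh/known_hosts                # import the server's host keys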
401
Git SSL Certificate Validation
-Git will validate SSL certificates when using HTTPS transport -You should leave this validation enabled -Self-signed or internal PKI will result in validation failure -Import non-public CA signed certs for SSL validation
402
Scheduled Commit and Push
-Stream allows for automatic commits and pushes to the remote repository on a scheduled basis -At a minimum, you should set up an automatic push -You can find this configuration under Leader > Git Settings > Scheduled actions
403
Excluding Files from the Git Repo
-Git can be problematic with large files -Disable tracking of large lookups by adding files to the .gitignore file in $CRIBL_HOME -Excluding SSL certificates managed by Stream may cause issues on workers -Only add exclusions below the CUSTOM SECTION header
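A sketch of a $CRIBL_HOME/.gitignore addition (the lookup path is hypothetical; keep any entries below the existing CUSTOM SECTION header):
# --- CUSTOM SECTION (existing header; add exclusions below it) ---
groups/*/data/lookups/huge_lookup.csv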
404
Backing Up Everything
-Stream's remote Git push is not a replacement for a comprehensive server backup strategy -Items outside of $CRIBL_HOME are not tracked inside the Git repository -Sync files to an S3 bucket for example
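For example, with the AWS CLI (bucket name and paths are illustrative):
aws s3 sync /opt/cribl s3://acme-cribl-backups/leader01/opt-cribl/
aws s3 sync /etc/systemd/system s3://acme-cribl-backups/leader01/systemd/ --exclude "*" --include "cribl*"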
405
Git Summary
-Use secure protocols for transport -Protect authentication keys and use least-privilege access -Add certificates for SSL validation (if required) -Set up a scheduled push to the remote repository -Exclude large lookup files -Git is not a comprehensive backup strategy for the Leader node
406
Stream Administrators should enable automatic GIT push on a scheduled basis
True
407
Secure protocols should be used when setting up the remote repository
True
408
Server validation is an important security measure and should be enabled
True
409
Stream Administrators should store large lookups inside their GIT repository
False
410
When using GIT SSH authentication, where does the known_hosts file reside?
$CRIBL_HOME/.ssh/known_hosts
411
Git backs up all server files
False
412
Git SSH keys or tokens should be able to access other repositories besides Stream
False
413
Top Support Challenges
1. Binding to a privileged port 2. Too many open files 3. Out of memory 4. Cloning workers 5. Resetting lost passwords 6. Pipeline profiling
414
# Support Challenges Binding to a Privileged Port
-Stream should be running as a non-root user -If Cribl Stream is required to listen on ports 1-1024, it will need privileged access. You can enable this on systemd by adding this configuration key to your override.conf file: AmbientCapabilities=CAP_NET_BIND_SERVICE
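A sketch of the systemd override, assuming the unit is named cribl:
sudo systemctl edit cribl            # creates/opens override.conf for the unit
# add under [Service]:
#   AmbientCapabilities=CAP_NET_BIND_SERVICE
sudo systemctl daemon-reload && sudo systemctl restart cribl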
416
# Support Challenges Too many open files
EMFILE: too many open files -When creating partitions, avoid high-cardinality fields in your expression Raise the number of files -For the following destinations, configure Max File options to avoid errors: Filesystem/NFS, Azure Blob, Google Cloud, Amazon S3 Increase the ulimit for Max Open Files (NOFILE) -Edit the systemd unit file to contain a line similar to: LimitNOFILE=20248
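A sketch of raising the limit via the same override.conf (unit name assumed to be cribl; the value matches the slide):
sudo systemctl edit cribl
# add under [Service]:
#   LimitNOFILE=20248
sudo systemctl daemon-reload && sudo systemctl restart cribl
cat /proc/$(pgrep -o -f cribl)/limits | grep -i "open files"   # verify the running process picked it up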
417
# Support Challenges Out of Memory
Out of Memory (OOM) errors are shown in the cribl_stderr.log file. Common causes: Lookups and Aggregations
418
# Support Challenges Cloning Workers
Worker GUID -When you first install and run the software, Cribl Stream generates a GUID which it stores in a .dat file located in $CRIBL_HOME/local/cribl/auth -When deploying Cribl Stream as part of a host image or VM, be sure to remove this .dat file so that you do not end up with duplicate GUIDs. Cribl Stream will regenerate the file on the next run
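For example, before sealing the host image or VM template:
rm -f $CRIBL_HOME/local/cribl/auth/*.dat    # Cribl Stream regenerates the GUID on the next run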
420
# Support Challenges Resetting Lost Password
The cribl.secret file is located at $CRIBL_HOME/local/cribl/auth/cribl.secret
422
# Support Challenges Pipeline Profiling
Pipeline profiling helps with troubleshooting pipeline-related issues (see the summary below)
423
# Support Challenges Summary
Privileged Port Binding -Grant privileges to bind to lower ports Too many open Files -High-cardinality path naming Out of Memory -Aggregations overloading memory Cloning Workers -Removal of the .dat file containing the GUID Lost Passwords -Plaintext password replacement in users.json Pipeline Profiling -Helps with troubleshooting pipeline-related issues
424
(Select all that apply) What methods can be used to bind to privileged ports?
-Run as root (one of these two is wrong) -iptables (one of these two is wrong) -systemctl settings (this is correct)
425
The default memory allocation for each worker is set to what value?
2GB
426
Stream User Passwords are stored in what file on disk?
$CRIBL_HOME/local/auth/users.json
427
(Select all that apply) What things can cause a spike in the number of open files on a Stream Worker?
-High Cardinality Naming -High number of incoming connections -Large amount of persistent queuing
428
(Select all that apply) What features consume memory in Cribl Stream?
Lookups and Aggregations
429
Where does the Cribl Stream GUID live on a worker?
$CRIBL_HOME/local/cribl/auth/*.dat
430
(Select all that apply) What settings control the max number of open file processes on a Linux system?
-/proc/sys/fs/file-max -systemd/system/cribl.service -/etc/sysctl.conf -/etc/security/limits.conf
431
You can clone workers but make sure to remove the .dat file located in $CRIBL_HOME/local/cribl/auth
True
432
(Select all that apply) What ports are considered privileged ports?
anything lower than 1024
434
Which of the following are responsibilities of a Worker?
-Run collector jobs -Receive data from Sources -Send data to Destinations NOT: Back up to Git (local only)
435
Deploying a high-performance single Stream instance is just as effective as using multiple Worker Groups.
False
436
What are Cribl.Cloud allowed ports?
20000-20010
437
A best practice when designing and planning is 200GB per vCPU @ 3GHz
True
438
Which will impact system CPU requirements? (Select 3)
-Persistent Queuing -Volume of incoming data? Number of destinations? (I think this answer is wrong; the correct answer might be: type of data processing required)
439
Which two choices are valid for Cribl Stream in Cribl.Cloud?
-Distributed Stream instance with Leader in Cribl.Cloud and Workers on prem -Distributed Stream instance with Leader and workers in Cribl.Cloud
440
How many Worker Nodes, each with 32vCPU, is needed to ingest 25TB and send out 15TB?
7 Worker Nodes
441
Cribl Single Instance deployments supports which two of the following?
-Integration with Okta for Authentication -GitHub Integration
442
Which two protocols can be used for Worker Group to Worker Group communication? (Select 2)
Stream TCP and Stream HTTP
443
Which of the following are advantages of a distributed deployment over a single instance? (Select 2)
Higher reliability unlimited scalability
444
Filter Expressions can be used in Functions to determine if that Function should be executed
True
445
What is data being sent from Worker Group to Worker Group called?
Stream to Stream
446
What are two use cases for routing data to Object Storage?
-Reducing Analytics tool or SIEM spend -Replaying historical data for threat hunting exercise
447
The Leader Node is required to send scheduled jobs to the Worker Nodes
True
448
If you are using Elastic ingest Pipelines, specify an extra parameter whose name is Pipeline and whose value is:
the name of your pipeline
449
What does Stream's Elasticsearch destination support? (Select 2)
Splunk Logstash WRONG
450
When configuring Splunk HEC, what setting should be turned on if the user wants acks returned to the endpoint that is sending data?
Splunk HEC TLS (WRONG ANSWER)
451
What is the most popular Elastic Beats used by Cribl Customers?
Filebeats and Winlogbeats
452
What port does Splunk typically set its HEC collector on?
8088
453
The user can install Packs using the CLI
True
454
When backpressure behavior is set to drop events, backpressure causes outgoing events to get blocked
False
455
When sending data to Elasticsearch Destination, Cribl recommends that _raw should be empty.
True
456
What are sources called that data is collected from intermittently, either ad hoc or on a preset schedule?
Collectors
457
What does capturing data within the Pipeline editor ensure?
Data is captured prior to sending to a destination
458
Why is JSON a preferred option for Nested Field Serialization in a Splunk Destination?
Easier to report in Splunk (WRONG ANSWER)
459
Which two are ideal use cases for an Output Router? (Select 2)
-Sending a full-fidelity copy of an event to S3 and a transformed copy of the event to Splunk -Sending a filter of events to a Splunk instance, and a filter of other events to an Elastic Instance ONE OF THESE IS WRONG
460
How can you monitor the health of your Cribl Instances? (select 2)
-Set up a notification when destinations are unhealthy -Poll the REST API to see if any pipelines are dropping events (WRONG ANSWER)
461
Which statement describes the discovery process for the S3 and file collectors?
Leader sends a request to the first available Worker node, Worker node sends a list of files back to the leader
462
If no data is reaching the destination, which two things should a user do first within Cribl or on the Cribl systems? (Select 2)
-Netcat or wget from a worker to destination -run a capture and select 'before destination' within Cribl
463
Any changes made to a Knowledge Object will be preserved when updating the Pack.
False
464
When writing data out to S3, which statement is true?
All files will remain open until timeout or max file size is reached
465
When tuning settings for an S3 destination to avoid any 'too many open files' errors, decrease the number of max open files
False
466
Which type of Encryption utilizes a public/private key pair?
Asymmetric
467
A self signed certificate has a higher level of trust than a public CA signed certificate
False
468
Audit.log contains changes to files
True
469
What log files are in Cribl Stream? (Select all that apply)
cribl.log notifications.log