Booz Terms Flashcards Preview

Flashcards in Booz Terms Deck (45)
1
Q

Active Learning

A

Intelligent sample selection to improve model performance. Samples are chosen to provide the greatest information to the learning model.

2
Q

Agent Based Simulation

A

Simulates the actions and interactions of autonomous agents.

3
Q

ANOVA

A

Hypothesis test for differences among the means of more than two groups.
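
As a rough illustration, the one-way ANOVA F statistic compares variation between group means with variation inside the groups. The sketch below assumes simple numeric groups; the function name and sample data are made up for this card.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical data
f_stat = one_way_anova_f(groups)
```

A large F value suggests that at least one group mean differs from the others; turning it into a p-value requires the F distribution, which is not shown here.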

4
Q

Association Rule Mining (Apriori)

A

Data mining technique to identify the common co-occurrences of items.
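
A minimal sketch of the counting step behind association rules, assuming a tiny hypothetical list of shopping baskets. It computes pair support and one rule confidence only; it is not a full Apriori implementation, which would also prune candidate itemsets by a minimum support threshold.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, each a set of purchased items
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_support(transactions):
    """Fraction of transactions that contain each pair of items."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}

support = pair_support(transactions)

# Confidence of the rule bread -> milk: support(bread, milk) / support(bread)
bread_support = sum("bread" in t for t in transactions) / len(transactions)
confidence_bread_milk = support[("bread", "milk")] / bread_support
```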

5
Q

Bayesian Network

A

Models conditional probabilities amongst elements, visualized as a Directed Acyclic Graph.

6
Q

Collaborative Filtering

A

Also known as ‘Recommendation.’ Suggests or eliminates items from a set by comparing a history of actions that users performed against items. Finds similar items based on who used them, or similar users based on the items they use.
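
One simple way to find "similar items based on who used them" is to compare the sets of users for each item. The sketch below uses Jaccard similarity, which is only one of several possible measures; the usage data and function names are hypothetical.

```python
# Hypothetical usage history: user -> set of items the user interacted with
usage = {
    "alice": {"A", "B"},
    "bob": {"A", "B", "C"},
    "carol": {"C"},
}

def users_of(item):
    """Set of users who used the given item."""
    return {user for user, items in usage.items() if item in items}

def item_similarity(item_i, item_j):
    """Jaccard similarity between the user sets of two items."""
    users_i, users_j = users_of(item_i), users_of(item_j)
    union = users_i | users_j
    return len(users_i & users_j) / len(union) if union else 0.0

sim_ab = item_similarity("A", "B")  # same users, so maximal similarity
sim_ac = item_similarity("A", "C")  # only one shared user
```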

7
Q

Coordinate Transformation

A

Provides a different perspective on data.

8
Q

Deep Learning

A

Method that learns features that lead to higher-level concept learning. Usually uses very deep neural network architectures.

9
Q

Design of Experiments

A

Applies controlled experiments to quantify effects on system output caused by changes to inputs.

10
Q

Differential Equations

A

Used to express relationships between functions and their derivatives, for example, change over time.
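
For example, exponential decay dy/dt = -k*y relates a function to its own rate of change over time. A minimal sketch of solving it numerically with the forward Euler method, assuming a fixed step size (the function name and parameters are illustrative):

```python
import math

def euler_decay(y0, k, t_end, steps):
    """Approximate y(t_end) for dy/dt = -k * y using forward Euler steps."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * (-k * y)  # move along the current slope for one small step
    return y

approx = euler_decay(y0=1.0, k=1.0, t_end=1.0, steps=1000)
exact = math.exp(-1.0)  # closed-form solution y(t) = y0 * exp(-k * t)
```

With smaller steps the approximation moves closer to the exact value.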

11
Q

Discrete Event Simulation

A

Simulates a discrete sequence of events where each event occurs at a particular instant in time. The model updates its state only at points in time when events occur.
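
The idea that state changes only when events occur can be sketched with a priority queue of timestamped events. This is a toy example under the assumption that every arrival schedules a departure 2.0 time units later; the event names are hypothetical.

```python
import heapq

def run(events):
    """Process (time, name) events in time order and return the event log."""
    queue = list(events)
    heapq.heapify(queue)
    log = []
    while queue:
        clock, name = heapq.heappop(queue)  # jump straight to the next event time
        log.append((clock, name))
        if name == "arrive":
            # Assumed service time of 2.0 units before the matching departure
            heapq.heappush(queue, (clock + 2.0, "depart"))
    return log

log = run([(1.0, "arrive"), (4.0, "arrive")])
```

The clock never ticks through the idle time between events; it simply takes the timestamp of whichever event is next.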

12
Q

Discrete Wavelet Transform

A

Transforms time series data into the frequency domain while preserving locality information.

13
Q

Ensemble Learning

A

Learning multiple models and combining output to achieve better performance.
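
A minimal sketch of combining model outputs by majority vote, one common way to build an ensemble. The three "models" here are hypothetical threshold rules standing in for trained classifiers.

```python
from collections import Counter

def majority_vote(models, x):
    """Return the prediction made by the largest number of models."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical weak classifiers: each applies a different threshold
models = [lambda x: x > 0, lambda x: x > 1, lambda x: x > 2]

prediction = majority_vote(models, 1.5)  # votes are True, True, False
```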

14
Q

Expert Systems

A

Systems that use symbolic logic to reason about facts. Emulates human reasoning.

15
Q

Exponential Smoothing

A

Used to remove artifacts expected from collection error or outliers.
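
A sketch of simple exponential smoothing, assuming a smoothing factor alpha between 0 and 1 (the data and names are illustrative). Each smoothed value is a weighted average of the new observation and the previous smoothed value, so a single spike is damped.

```python
def exponential_smoothing(series, alpha):
    """Return the exponentially smoothed version of a numeric series."""
    smoothed = [series[0]]  # start from the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# The 50 is a hypothetical outlier in an otherwise flat series
smoothed = exponential_smoothing([10, 10, 50, 10, 10], alpha=0.5)
```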

16
Q

Factor Analysis

A

Describes variability among correlated observed variables in terms of a smaller number of unobserved variables, namely, the factors.

17
Q

Fast Fourier Transform

A

Transforms time series from time to frequency domain efficiently. Can also be used for image improvement by spatial transforms.

18
Q

Format Conversion

A

Creates a standard representation of data regardless of source format. For example, extracting raw UTF-8 encoded text from binary file formats such as Microsoft Word or PDFs.

19
Q

Fuzzy Logic

A

Logical reasoning that allows for degrees of truth for a statement.
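
A small sketch of degrees of truth using the min, max, and complement operators, which are one common choice for fuzzy AND, OR, and NOT. The truth values below are made up.

```python
def fuzzy_and(a, b):
    """Truth of 'a AND b' is the weaker of the two degrees."""
    return min(a, b)

def fuzzy_or(a, b):
    """Truth of 'a OR b' is the stronger of the two degrees."""
    return max(a, b)

def fuzzy_not(a):
    """Truth of 'NOT a' is the complement of the degree."""
    return 1.0 - a

warm = 0.7   # hypothetical: "it is warm" is 70% true
humid = 0.4  # hypothetical: "it is humid" is 40% true

muggy = fuzzy_and(warm, humid)
pleasant = fuzzy_and(warm, fuzzy_not(humid))
```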

20
Q

Gaussian Filtering

A

Acts to remove noise or blur data.
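
A minimal one-dimensional sketch, assuming a truncated Gaussian kernel and edge values repeated at the boundaries. Function names and defaults are illustrative, not taken from any particular library.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_filter(signal, sigma=1.0, radius=2):
    """Smooth a list of numbers by a weighted average of its neighbors."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the edges
            acc += w * signal[j]
        out.append(acc)
    return out

smoothed = gaussian_filter([0, 0, 10, 0, 0])  # the spike is spread out and lowered
```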

21
Q

Generalized Linear Models

A

Extends ordinary linear regression to allow for an error distribution that is not normal.

22
Q

Genetic Algorithms

A

Evolves candidate models over generations using evolution-inspired operators such as mutation and crossover of parameters.

23
Q

Grid Search

A

Systematic search across discrete parameter values for parameter exploration problems.
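
A minimal sketch of a grid search that tries every combination of discrete parameter values and keeps the one with the lowest score. The objective function and grid below are hypothetical.

```python
from itertools import product

def grid_search(objective, grid):
    """Evaluate every parameter combination and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def loss(x, y):
    """Hypothetical objective with its minimum at x=1, y=-2."""
    return (x - 1) ** 2 + (y + 2) ** 2

best, score = grid_search(loss, {"x": [-1, 0, 1, 2], "y": [-3, -2, -1]})
```

The cost grows with the product of the grid sizes, which is why grid search suits small numbers of parameters.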

24
Q

Hidden Markov Models

A

Models sequential data by inferring discrete latent states; the observables may be continuous or discrete.

25
Q

Hierarchical Clustering

A

Connectivity-based clustering approach that sequentially builds bigger (agglomerative) or smaller (divisive) clusters in the data.

26
Q

K-means and X-means Clustering

A

Centroid-based clustering algorithms. With K-means the number of clusters is set in advance; with X-means the number of clusters is unknown and is determined by the algorithm.
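
A sketch of the K-means loop on one-dimensional data, assuming the number of clusters is fixed by the starting centroids. It alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. X-means is not shown.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Run a fixed number of K-means iterations on 1-D points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assignment step: index of the nearest centroid
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Hypothetical data with two obvious groups and rough starting guesses
centroids = kmeans_1d([1, 2, 3, 10, 11, 12], centroids=[0.0, 5.0])
```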

27
Q

Linear, Non-linear, and Integer Programming

A

Set of techniques for minimizing or maximizing a function over a constrained set of input parameters.

28
Q

Markov Chain Monte Carlo (MCMC)

A

A method of sampling typically used in Bayesian models to estimate the joint distribution of parameters given the data.

29
Q

Monte Carlo Methods

A

Set of computational techniques that use repeated random sampling to obtain numerical results.
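
A classic illustration is estimating pi from random points in the unit square: the fraction that falls inside the quarter circle approaches pi/4. This is a sketch with a fixed seed for repeatability; the function name and sample count are arbitrary.

```python
import random

def estimate_pi(samples, seed=0):
    """Estimate pi from the share of random points inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4 * inside / samples

pi_estimate = estimate_pi(100_000)
```

The estimate is random, so it is only approximately equal to pi; more samples reduce the typical error.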

30
Q

Naive Bayes

A

Predicts classes using Bayes' theorem, which states that the probability of an outcome given a set of features is based on the probability of the features given the outcome. The features are assumed to be conditionally independent given the class.

31
Q

Neural Networks

A

Learns salient features in data by adjusting weights between nodes through a learning rule.

32
Q

Outlier Removal

A

Method for identifying and removing noise or artifacts from data.

33
Q

Principal Components Analysis

A

Enables dimensionality reduction by identifying highly correlated dimensions.

34
Q

Random Search

A

Randomly adjusts parameters to find a better solution than the best one found so far.
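
A minimal sketch of random search on a single parameter, assuming a hypothetical objective to minimize. A random perturbation is kept only when it improves on the best score so far.

```python
import random

def random_search(objective, start, steps=1000, scale=0.5, seed=0):
    """Keep random perturbations of the best point only when they improve it."""
    rng = random.Random(seed)
    best, best_score = start, objective(start)
    for _ in range(steps):
        candidate = best + rng.uniform(-scale, scale)
        score = objective(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score

# Hypothetical objective with its minimum at x = 3
best, score = random_search(lambda x: (x - 3) ** 2, start=0.0)
```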

35
Q

Regression with Shrinkage (Lasso)

A

A method of variable selection and prediction combined into a possibly biased linear model.

36
Q

Sensitivity Analysis

A

Involves testing individual parameters in an analytic or model and observing the magnitude of the effect.

37
Q

Simulated Annealing

A

Optimization technique named after a controlled cooling process in metallurgy. By analogy, it uses a changing temperature, or annealing schedule, to vary algorithmic convergence.
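
A sketch of the temperature idea, assuming a geometric cooling schedule and a hypothetical one-parameter objective. Worse candidates are sometimes accepted, and less often as the temperature falls, which helps the search escape local minima early on.

```python
import math
import random

def simulated_annealing(objective, start, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Minimize objective using random moves and a cooling temperature."""
    rng = random.Random(seed)
    current, current_score = start, objective(start)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(steps):
        candidate = current + rng.uniform(-1.0, 1.0)
        score = objective(candidate)
        delta = score - current_score
        # Always accept improvements; accept worse moves with probability exp(-delta / T)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, score
            if score < best_score:
                best, best_score = candidate, score
        temperature *= cooling  # assumed geometric annealing schedule
    return best, best_score

best, best_score = simulated_annealing(lambda x: (x - 2) ** 2, start=10.0)
```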

38
Q

Stepwise Regression

A

A method of variable selection and prediction. Akaike’s information criterion (AIC) is used as the metric for selection. The resulting predictive model is based upon ordinary least squares, or a general linear model with parameter estimation via maximum likelihood.

39
Q

Stochastic Gradient Descent

A

General-purpose optimization for learning of neural networks, support vector machines, and logistic regression models.
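
A minimal sketch of stochastic gradient descent fitting a single slope w in y = w * x with squared error. The data, learning rate, and function name are hypothetical; real models have many parameters, but the update rule has the same shape.

```python
import random

def sgd_fit_slope(data, lr=0.01, epochs=200, seed=0):
    """Fit w in y = w * x by updating on one example at a time."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in random order each epoch
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of (w*x - y)**2 with respect to w
            w -= lr * grad
    return w

w = sgd_fit_slope([(1, 2), (2, 4), (3, 6)])  # data generated from y = 2x
```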

40
Q

Support Vector Machines

A

Projection of feature vectors using a kernel function into a space where classes are more separable.

41
Q

Term Frequency / Inverse Document Frequency

A

A statistic that measures the relative importance of a term to a document within a corpus.
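
A sketch of one common TF-IDF variant: term frequency within the document multiplied by the log of the inverse document frequency. Other weightings exist, and the tiny corpus here is hypothetical.

```python
import math

# Hypothetical corpus of tokenized documents
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

def tf_idf(term, doc, docs):
    """Term frequency in doc times log inverse document frequency in docs."""
    tf = doc.count(term) / len(doc)
    df = sum(term in d for d in docs)  # number of documents containing the term
    idf = math.log(len(docs) / df) if df else 0.0
    return tf * idf

score_the = tf_idf("the", docs[0], docs)  # appears in every document
score_dog = tf_idf("dog", docs[1], docs)  # appears in only one document
```

A term found in every document scores zero, while a term concentrated in one document scores higher.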

42
Q

Topic Modeling (Latent Dirichlet Allocation)

A

Identifies latent topics in text by examining word co-occurrence.

43
Q

Tree Based Methods

A

Models structured as graph trees where branches indicate decisions.

44
Q

T-Test

A

Hypothesis test used to test for differences between two groups.
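
A sketch of the test statistic for two independent groups, using Welch's form, which does not assume equal variances. The samples are made up, and converting the statistic to a p-value requires the t distribution, which is not shown.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

t_stat = welch_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```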

45
Q

Wrapper Methods

A

Feature set reduction method that uses the performance of a model trained on a set of features as a measure of that feature set’s quality. Can help identify combinations of features in models that achieve high performance.