Chapter 9: Big Data Analytics for Managing Risk - Vocabulary Flashcards by Justin T

Big data

Sets of data that are too large to be gathered and analyzed by traditional methods

Structured data

Data organized into databases with defined fields, including links between databases

Unstructured data

Data that is not organized into predetermined formats, such as databases, and often consists of text, images, or other nontraditional media

Data science

An interdisciplinary field involving the design and use of techniques to process very large amounts of data from a variety of sources and to provide knowledge based on the data

Internal data

Data that is owned by an organization

External data

Data that belongs to an entity other than the organization that wishes to acquire and use it

Economic data

Data regarding interest rates, asset prices, exchange rates, the Consumer Price Index, and other information about the global, the national, or a regional economy

Geodemographic data

Data regarding classifications of a population

Data-driven decision making

An organizational process to gather and analyze relevant and verifiable data and then evaluate the results to guide business strategies

Predictive model

A model used to predict an unknown outcome by means of a defined target variable

Training data

Data that is used to train a predictive model and that therefore must have known values for the target variable of the model

Target variable

The predefined attribute whose value is being predicted in a data analytical model

Class label

The value of the target variable in a model

Attribute

A variable that describes a characteristic of an instance within a model

Instance (example)

The representation of a data point described by a set of attributes within a model’s dataset

Overfitting

The process of fitting a model too closely to the training data for the model to be effective on other data

Holdout data

In the model training process, existing data with a known target variable that is not used as part of the training data
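
As a minimal sketch of how holdout data might be set aside, the Python below shuffles a set of labeled instances and reserves a fraction of them. The function name `holdout_split`, the 25% fraction, the fixed seed, and the sample claim records are all illustrative assumptions, not something the flashcards prescribe.

```python
import random

def holdout_split(instances, holdout_fraction=0.25, seed=0):
    """Split labeled instances into (training, holdout) lists.

    Every instance must already have a known target value; the holdout
    portion is simply kept out of model training.
    """
    shuffled = instances[:]                    # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for repeatability
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

# Hypothetical labeled instances: (attributes, known target value)
labeled = [({"claim_amount": 1000 * i}, i % 2 == 0) for i in range(20)]
training, holdout = holdout_split(labeled)
```

A model would be fitted on `training` only and then scored on `holdout`; a large gap between the two scores is one common sign of overfitting.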

Generalization

The ability of a model to apply itself to data outside the training data

Accuracy

In model performance evaluation, a model’s correct predictions divided by its total predictions

Precision

In model performance evaluation, a model’s correct positive predictions divided by its total positive predictions

Recall

In model performance evaluation, a model’s correct positive predictions divided by the sum of its correct positive predictions and incorrect negative predictions

F-score

In statistics, the measure that combines precision and recall and is the harmonic mean of precision and recall
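
The four evaluation measures defined above (accuracy, precision, recall, and F-score) can all be computed from the counts of correct and incorrect predictions. The sketch below assumes a two-class problem and uses the conventional abbreviations tp, fp, tn, and fn (true/false positives and negatives); the function name and the counts are hypothetical, and the function does not guard against zero denominators.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, f_score) from prediction counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total        # correct predictions / total predictions
    precision = tp / (tp + fp)          # correct positives / all positive predictions
    recall = tp / (tp + fn)             # correct positives / (correct positives + incorrect negatives)
    # Harmonic mean of precision and recall
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score

# Hypothetical counts for 100 predictions
accuracy, precision, recall, f_score = evaluation_metrics(tp=40, fp=10, tn=45, fn=5)
```

With these counts, accuracy is 85/100, precision is 40/50, and recall is 40/45; the F-score falls between precision and recall, closer to the lower of the two.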

Supervised learning

A type of model creation, derived from the field of machine learning, in which the target variable is defined

Unsupervised learning

A type of model creation, derived from the field of machine learning, that does not have a defined target variable

Machine learning

Artificial intelligence in which computers continually teach themselves to make better decisions based on previous results and new data

Segmentation

An analytical technique in which data is divided into categories

Association rule learning

Examining data to discover new and interesting relationships among attributes that can be stated as business rules

Classification tree

A supervised learning technique that uses a structure similar to a tree to segment data according to known attributes to determine the value of a categorical target variable

Leaf node

A terminal node of a classification tree that is used to classify an instance based on its attributes

Arrow

A pathway in a classification tree

Node

A representation of a data attribute

Algorithm

An operational sequence used to solve mathematical problems and to create computer programs

Linear regression

A statistical method to predict the numerical value of a target variable based on the values of explanatory variables
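
To illustrate, here is a minimal sketch of linear regression with a single explanatory variable, fitted by ordinary least squares. Least squares is one common fitting method and is assumed here; the function name and the data points are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: explanatory variable x, numerical target variable y
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_simple_linear_regression(xs, ys)
predicted = intercept + slope * 5   # predicted target value for a new instance with x = 5
```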

Generalized linear model (GLM)

A statistical technique that increases the flexibility of a linear model by linking it with a nonlinear function

Link function

A mathematical function that describes how the random values of a target variable depend on the mean value generated by a linear combination of the explanatory variables (attributes)
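
In the usual GLM formulation, the link function connects the mean of the target variable to the linear combination of the explanatory variables (the linear predictor). The sketch below shows two widely used links, log and logit, by applying their inverses to a linear predictor. The coefficients and attribute values are hypothetical, and the choice of these two links is an assumption made for illustration.

```python
import math

def linear_predictor(intercept, coefficients, attributes):
    """Linear combination of the explanatory variables."""
    return intercept + sum(c * x for c, x in zip(coefficients, attributes))

def inverse_log_link(eta):
    """Log link: log(mean) = eta, so mean = exp(eta). Keeps the mean positive."""
    return math.exp(eta)

def inverse_logit_link(eta):
    """Logit link: log(p / (1 - p)) = eta, so p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical fitted coefficients and one instance's attribute values
eta = linear_predictor(0.5, [0.2, -0.1], [3, 2])
expected_count = inverse_log_link(eta)     # e.g. a mean claim count
probability = inverse_logit_link(eta)      # e.g. a probability between 0 and 1
```

The linear part stays unchanged; only the nonlinear link determines whether the output is read as a positive mean or as a probability.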

Cluster analysis

A model that determines previously unknown groupings of data

Artificial intelligence (AI)

Computer processing or output that simulates human reasoning or knowledge

Social network analysis

The study of the connections and relationships among people in a network

Neural network

A data analysis technique composed of three layers, including an input layer, a hidden layer with nonlinear functions, and an output layer, that is used for complex problems

Complex claim

A claim that contains one or more characteristics that cause it to cost more than the average claim

Information gain

A measure of the predictive power of one or more attributes
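
Information gain is commonly computed as the reduction in entropy of the target variable after the data is split on an attribute. The sketch below assumes that entropy-based definition; the attribute names and the four claim records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(instances, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    parent = entropy([row[target] for row in instances])
    groups = {}
    for row in instances:
        groups.setdefault(row[attribute], []).append(row[target])
    weighted = sum(len(g) / len(instances) * entropy(g) for g in groups.values())
    return parent - weighted

# Hypothetical training data: is a claim complex?
claims = [
    {"attorney": "yes", "weekend": "yes", "complex": "yes"},
    {"attorney": "yes", "weekend": "no",  "complex": "yes"},
    {"attorney": "no",  "weekend": "yes", "complex": "no"},
    {"attorney": "no",  "weekend": "no",  "complex": "no"},
]
gain_attorney = information_gain(claims, "attorney", "complex")
gain_weekend = information_gain(claims, "weekend", "complex")
```

Here `attorney` separates the two classes perfectly while `weekend` does not separate them at all, so a classification tree built on information gain would split on `attorney` first.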

Recursively

Successively applying a model

Root node

The first node in a classification tree

Combination of nodes

A representation of a data attribute in a classification tree

Lift

In model performance evaluation, the percentage of positive predictions made by the model divided by the percentage of positive predictions that would be made in the absence of the model
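
One common way to read this definition is as a ratio of two rates: the share of positives among the instances the model selects, divided by the share of positives in the population as a whole (what random selection would find). The sketch below assumes that reading; the function name and the fraud counts are hypothetical.

```python
def lift(positives_selected, total_selected, positives_overall, total_overall):
    """Positive rate among model-selected instances divided by the overall positive rate."""
    model_rate = positives_selected / total_selected
    baseline_rate = positives_overall / total_overall
    return model_rate / baseline_rate

# Hypothetical: the model flags 50 claims, 20 of which are fraudulent;
# across all 1,000 claims, 100 are fraudulent.
model_lift = lift(positives_selected=20, total_selected=50,
                  positives_overall=100, total_overall=1000)
```

A lift above 1 means the model finds positives at a higher rate than would be expected without it.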

(45 cards)