ML Soc-Tech Flashcards

Question

What is a key advantage of decision trees?

Answer 1

Easy to use with minimal pre-processing required

Answer 2

Prone to overfitting and can produce unstable results

Answer 3

* Root node
* Split node
* Branch
* Leaf node

Answer 4

The level of impurity or randomness in the dataset

Answer 5

The reduction in entropy achieved by splitting the dataset on a particular feature
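As a rough illustration of the two answers above, the sketch below computes Shannon entropy for a set of labels and the information gain of a candidate split. The function names and the toy labels are assumptions made for this example only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Toy data (hypothetical): a perfectly mixed parent node.
parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # children are pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children are as mixed as the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A split that yields pure children recovers all of the parent's entropy, while a split that leaves the class mix unchanged gains nothing.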

Answer 6

The process of intervening to prevent overfitting

Answer 7

* Pre-pruning
* Post-pruning

Answer 8

Quantifies the contribution of each feature to the predictive power of the model

Answer 9

* Mean Squared Error (MSE)
* Explained Variance
* Mean Absolute Error (MAE)
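The three regression metrics named above can be sketched in plain Python as follows. The helper names and the small toy vectors are assumptions for illustration; explained variance is taken here as one minus the ratio of residual variance to target variance.

```python
def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - variance(residuals) / variance(y_true)

# Toy values (hypothetical).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

scores = {
    "mse": mse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
    "explained_variance": explained_variance(y_true, y_pred),
}
```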

Answer 10

* Accuracy
* Precision
* Recall
* Matthews Correlation Coefficient (MCC)
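All four classification metrics above can be derived from the counts of a binary confusion matrix. The sketch below assumes hypothetical counts (`tp`, `fp`, `fn`, `tn`) chosen only for illustration.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and MCC from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mcc": mcc}

# Hypothetical counts: 40 true positives, 10 false positives,
# 5 false negatives, 45 true negatives.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
```

This sketch does not guard against empty rows or columns of the confusion matrix, where precision, recall, or MCC would divide by zero.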

Answer 11

The structure of the human brain

Answer 12

Highly versatile and can model non-linear relationships

Answer 13

Opaque, making it unclear what the model has learned

Answer 14

Weights and biases are adjusted to identify useful patterns in the data

Answer 15

They process data through neurons using weights, biases, and activation functions
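A single neuron's forward pass might look like the sketch below: a weighted sum of the inputs, plus a bias, passed through an activation function. The sigmoid activation and all numeric values are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, followed by the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

weights = [0.5, -0.3]  # hypothetical learned weights

# With all-zero inputs the weights have no effect; only the bias shifts the output.
out_no_bias = neuron([0.0, 0.0], weights, bias=0.0)
out_with_bias = neuron([0.0, 0.0], weights, bias=2.0)
```

During training, the weights and the bias are the quantities that get adjusted.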

Answer 16

To produce the final prediction

Answer 17

Measures the difference between predicted and actual values

Answer 18

Bias nodes supply a constant input, so a neuron still receives a signal even when all input values are 0

Answer 19

Measures the (global) performance of the model given the data

Answer 20

* Regression: MSE, RMSE
* Classification: Cross-entropy

Answer 21

Can approximate any continuous function on a closed and bounded domain

Answer 22

* Features must be scaled
* Categorical features must be dummy-encoded
* Missing values must be handled

Answer 23

Computes how much each node in each layer contributes to the total loss

Answer 24

One full pass through the entire training dataset

Answer 25

Iteration: one forward and backward pass through the network for a single batch

Answer 26

* Batch gradient descent
* Stochastic gradient descent
* Mini-batch gradient descent
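The three gradient descent variants differ mainly in how many samples feed each weight update, which in turn fixes how many iterations make up one epoch. The sketch below counts iterations per epoch; the function name, dataset size, and batch size of 32 are assumptions for illustration.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    """Number of weight updates needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

n_samples = 1000  # hypothetical dataset size

updates = {
    "batch": iterations_per_epoch(n_samples, batch_size=n_samples),  # whole dataset per update
    "stochastic": iterations_per_epoch(n_samples, batch_size=1),     # one sample per update
    "mini_batch": iterations_per_epoch(n_samples, batch_size=32),    # small group per update
}

# Total iterations for a training run of 10 epochs with mini-batches.
total_mini_batch_iterations = 10 * updates["mini_batch"]
```

The last mini-batch is smaller than the rest when the dataset size is not a multiple of the batch size, which is why the count is rounded up.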

Answer 27

The sum of the absolute values of the weights
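A penalty of this form is typically scaled by a strength factor and added to the data loss. In the minimal sketch below, the factor `lam`, the weights, and the data loss value are hypothetical.

```python
def l1_penalty(weights, lam):
    """Regularization strength times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

weights = [0.5, -1.5, 2.0]  # hypothetical model weights
data_loss = 0.30            # hypothetical loss on the training data

penalty = l1_penalty(weights, lam=0.01)
total_loss = data_loss + penalty
```

Because every non-zero weight adds to the penalty, minimizing the total loss pushes unimportant weights toward zero.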

Answer 28

Finding the optimal combination of hyperparameters to improve model performance

Answer 29

To more rigorously evaluate the generalisation performance

Answer 30

A collection of sub-models (weak models) that combine to become more powerful

Answer 31

A technique used to reduce variance and improve robustness by creating multiple versions of a dataset

Answer 32

Reduces correlation between trees by using diverse feature subsets

Answer 33

An ensemble technique where models are trained sequentially to correct the errors of their predecessors

Answer 34

To represent discrete data in a continuous, lower-dimensional vector space

Answer 35

* Association
* Temporal precedence
* Nonspuriousness

Answer 36

A prediction task where the test data is drawn from the same distribution as the training data

Answer 37

* Interpretability: Passive characteristic of a model
* Explainability: Active characteristic of a model

Answer 38

Interpretability is a passive characteristic of a model that refers to how well a model makes sense to a human observer, while explainability is an active characteristic that involves actions taken by a model to clarify its internal functions.

Answer 39

* Accuracy
* Fidelity
* Consistency
* Stability
* Comprehensibility
* Certainty/Novelty
* Degree of Importance
* Representativeness

Answer 40

The ability of a model to be simulated or comprehended entirely by a human.

Answer 41

The proportion of individuals who did not recidivate who were incorrectly predicted to recidivate.
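The proportion described above is the false positive rate, FP / (FP + TN). The sketch below computes it from paired lists of outcomes and predictions; the six toy records are invented for illustration, with 1 meaning "recidivated" or "predicted to recidivate".

```python
def false_positive_rate(actual, predicted):
    """Share of actual negatives that were wrongly predicted positive."""
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return fp / (fp + tn)

# Hypothetical records: four people did not recidivate, two did.
actual    = [0, 0, 0, 0, 1, 1]
predicted = [1, 0, 0, 0, 1, 0]

fpr = false_positive_rate(actual, predicted)
```

Only the four people who did not recidivate enter the denominator; one of them was flagged, giving a rate of one in four.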

Answer 42

To identify simplified, transparent models that either represent specific sections of a more complex model or approximate it as a whole.

Answer 43

if-then rules

Answer 44

Alternative data points that would change the model’s prediction to a desired outcome.

Answer 45

Visualizes the relationship between a feature and the predicted outcome while marginalizing over other features.
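The marginalization step can be sketched by hand: for each grid value, overwrite the feature of interest in every row, predict, and average. The `toy_model`, the rows, and the grid below are hypothetical stand-ins for a trained model and its data.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)             # copy so the original data is untouched
            modified[feature_index] = value  # force the feature of interest
            predictions.append(model(modified))
        curve.append(sum(predictions) / len(predictions))
    return curve

def toy_model(row):
    """Hypothetical model with an interaction between the two features."""
    x0, x1 = row
    return 2 * x0 + x0 * x1

rows = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
grid = [0.0, 1.0, 2.0]

pd_curve = partial_dependence(toy_model, rows, feature_index=0, grid=grid)
```

The resulting curve is what a partial dependence plot would draw against the grid. Because it is an average, it hides how the effect of the first feature differs between rows with different values of the second.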

Answer 46

* Easy to understand
* Simple to implement
* Clear interpretation if features are uncorrelated
* Provides a causal interpretation for the model

Answer 47

Graphs showing how predictions for individual instances change as a feature varies while keeping other features fixed.

Answer 48

Local Interpretable Model-agnostic Explanations.

Answer 49

As the change in the expected model prediction when conditioning on that feature, comparing the prediction with and without the feature.
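One way to make this concrete is to average each feature's marginal contribution over every order in which features could be revealed. The sketch below does this exactly for two features; the table of expected predictions per known-feature set is invented for illustration, and exact enumeration is only practical for a handful of features.

```python
from itertools import permutations

def shapley_values(value_fn, features):
    """Average marginal contribution of each feature over all feature orderings."""
    totals = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        known = frozenset()
        for f in order:
            with_f = known | {f}
            # Change in expected prediction when f becomes known.
            totals[f] += value_fn(with_f) - value_fn(known)
            known = with_f
    return {f: total / len(orders) for f, total in totals.items()}

# Hypothetical expected predictions given which features are known.
expected_prediction = {
    frozenset(): 10.0,
    frozenset({"a"}): 14.0,
    frozenset({"b"}): 12.0,
    frozenset({"a", "b"}): 20.0,
}

phi = shapley_values(expected_prediction.__getitem__, ["a", "b"])
```

The contributions add up to the gap between the full prediction and the baseline with no features known.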

Answer 50

The direct contribution of an input to the variance of the output, ignoring interactions with other inputs.

Answer 51

* Potential for misuse
* Privacy invasion
* Reinforcement of stereotypes

Answer 52

* Discrimination, hate speech
* Information hazards
* Malicious uses
* Environmental and socioeconomic harms

Answer 53

* Biometric categorization systems using sensitive characteristics
* Untargeted scraping of facial images
* Emotion recognition in workplaces and educational institutions
* Social scoring based on behavior or characteristics

Answer 54

* Forecasting electricity supply
* Reducing waste in electricity grids
* Predictive maintenance in transport

Answer 55

Visualizes how each feature contributes to an individual prediction by showing the additive contributions of features step-by-step.

Answer 56

Gives a global view of feature importance across the entire dataset by summarizing all individual SHAP values.

(83 cards)