Artificial Intelligence Flashcards

(78 cards)

1
What is a Search Problem?
The task of getting from an initial state to a specified goal state by means of executing a sequence of actions available to and chosen by the agent.
2
Attributes of a Search Problem
State space, Initial state, Goal state, Actions (a function of current state), Transition model (a function of current state and performed action), Action cost function (optional).
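These attributes map naturally onto a small programming interface. A minimal Python sketch (the class and field names here are illustrative assumptions, not from the cards):

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable

@dataclass
class SearchProblem:
    """Bundles the attributes of a search problem listed in card 2."""
    initial_state: Hashable
    goal_test: Callable[[Hashable], bool]           # is this state a goal?
    actions: Callable[[Hashable], Iterable]         # actions available in a state
    result: Callable[[Hashable, object], Hashable]  # transition model
    action_cost: Callable = lambda s, a: 1.0        # optional; defaults to unit cost
```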
3
Define Uninformed Search
Search with no prior information beyond the problem's state-space graph; the solution is based only on the information in the state space, including action costs.
4
Define Informed Search
Algorithms that make use of a heuristic function h.
5
What is a heuristic h?
An estimate of the lowest cost from the current state n to the goal state. The estimate must be optimistic (it must not overestimate).
6
BFS (Breadth-First Search) Properties
Complete: Yes. Optimal: Yes (when all action costs are equal). Time complexity: O(b^d). Space complexity: O(b^d), where b is the branching factor and d the depth of the shallowest goal.
7
DFS (Depth-First Search) Properties
Complete: Yes in finite state spaces, No in infinite ones. Optimal: No. Time complexity: O(b^m). Space complexity: O(b·m), where m is the maximum depth of the tree.
8
IDS (Iterative Deepening Search) Main Idea
Combines BFS and DFS by limiting the depth of search (limit l), exploring to that limit in a DFS manner, then increasing the limit by 1 and repeating if no goal state is found.
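A minimal IDS sketch, assuming the hypothetical SearchProblem interface from the earlier sketch (no cycle checking, as in the basic tree-search formulation):

```python
def depth_limited_dfs(problem, state, limit):
    """DFS that never descends below `limit`; returns a path to a goal or None."""
    if problem.goal_test(state):
        return [state]
    if limit == 0:
        return None
    for action in problem.actions(state):
        child = problem.result(state, action)
        path = depth_limited_dfs(problem, child, limit - 1)
        if path is not None:
            return [state] + path
    return None

def iterative_deepening_search(problem, max_limit=50):
    """Try limits l = 0, 1, 2, ... until a goal is found (card 8's main idea)."""
    for limit in range(max_limit + 1):
        path = depth_limited_dfs(problem, problem.initial_state, limit)
        if path is not None:
            return path
    return None
```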
9
IDS (Iterative Deepening Search) Properties
Complete: Yes. Optimal: Yes (when all action costs are equal). Time complexity: O(b^d). Space complexity: O(b·d). Very memory efficient.
10
UCS (Uniform-Cost Search) Main Idea
Utilises action costs to make decisions: the generated node with the lowest path cost g(n) is chosen for expansion. The goal test is applied when a node is expanded, not when it is generated.
11
Why is UCS uninformed?
Action costs (e.g., road distances) are not directly indicative of which path brings one closer to the destination.
12
Greedy (Best-First) Search Main Idea
Always choose the node with the lowest heuristic value h(n), i.e., the one estimated to be closest to the goal state.
13
Greedy Search Properties
Quick with a good heuristic function. Not optimal, as it cannot backtrack after generating the goal state.
14
A* Search Main Idea
Uses both the action costs accumulated in the path cost function g(n) and the heuristic h(n) ('backward' and 'forward' costs), expanding nodes by f(n) = g(n) + h(n).
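A minimal graph-search A* sketch, again assuming the hypothetical SearchProblem interface from earlier; h is the heuristic function:

```python
import heapq
from itertools import count

def a_star(problem, h):
    """Expand the frontier node with the lowest f(n) = g(n) + h(n)."""
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(h(problem.initial_state), next(tie), 0.0,
                 problem.initial_state, [problem.initial_state])]
    best_g = {problem.initial_state: 0.0}  # cheapest known cost to each state
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if problem.goal_test(state):
            return path
        for action in problem.actions(state):
            child = problem.result(state, action)
            g2 = g + problem.action_cost(state, action)
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h(child), next(tie), g2,
                                          child, path + [child]))
    return None  # no path to a goal
```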
15
When is A* search optimal?
If the heuristic h(n) is admissible (for graph search, consistency is also required).
16
Define an admissible heuristic
A heuristic is admissible if it never overestimates the true cost to the goal (e.g., straight-line distance in route finding).
17
Define a consistent heuristic
A heuristic is consistent if it satisfies the triangle inequality: h(n) ≤ c(n, a, n') + h(n') for every successor n' of n reached by action a. Every consistent heuristic is admissible.
18
Characteristics of Local Search algorithms
Very memory efficient, not systematic (may leave parts of the search space unexplored), highly initialisation dependent.
19
When are Local Search algorithms useful?
For finding reasonable/good solutions in infinite search spaces, which are unsuitable for systematic search.
20
Hill Climbing Algorithm
Initialise the state randomly; consider all neighbours and choose the one with the highest objective function value; continue until no better neighbour is found.
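A minimal hill-climbing sketch; `neighbours` and `objective` are assumed callables supplied by the problem:

```python
def hill_climbing(initial, neighbours, objective):
    """Move to the best neighbour until no neighbour improves the objective."""
    current = initial
    while True:
        candidates = list(neighbours(current))
        if not candidates:
            return current
        best = max(candidates, key=objective)
        if objective(best) <= objective(current):
            return current  # local maximum or plateau
        current = best
```

The random-restart remedy in the next card simply calls this in a loop from fresh random initial states and keeps the best result found.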
21
Problems with Hill Climbing
Local maxima and plateaus. Solution: iterative (random-restart) hill climbing.
22
What is a Constraint Satisfaction Problem (CSP)?
A problem whose state representation is factored into a set of variables, each with its own domain, with constraints restricting the values the variables may take together.
23
Components of a CSP
Variables {X1, …, Xn}; Domains {D1, …, Dn}, specifying the possible values for each variable; Constraints (C), rules that apply to variables and are used to update domains.
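As a concrete illustration (a standard textbook example, assumed here rather than taken from the deck), the Australia map-colouring CSP expressed with these three components:

```python
# Variables: one per territory. Domains: three colours each.
# Constraints: adjacent territories must receive different colours.
variables = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
domains = {v: ["red", "green", "blue"] for v in variables}
adjacent = [("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"),
            ("SA", "Q"), ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V")]

def consistent(assignment):
    """True if no adjacency constraint is violated by the (partial) assignment."""
    return all(assignment[a] != assignment[b]
               for a, b in adjacent if a in assignment and b in assignment)
```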
24
Define a consistent assignment in CSP
If all constraints are satisfied.
25
Define a complete assignment in CSP
If all variables are assigned.
26
What is a solution to a CSP?
A complete and consistent assignment.
27
Constraint Propagation in CSP
Eliminating parts of the search space through constraints, enforcing local consistency. Making a choice limits the domain (search space) of other variables.
28
Backtracking Search for CSPs Main Idea
A recursive depth-first mechanism for handling 'dead end' scenarios (empty domains): the solution is developed gradually by trying values at successive levels of the search tree and backtracking when a dead end is reached.
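A minimal backtracking-search sketch that works with the map-colouring example above (variable and value ordering kept deliberately naive):

```python
def backtracking_search(variables, domains, consistent, assignment=None):
    """Assign one variable per tree level; undo and retry on dead ends."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):
        return assignment  # complete and consistent: a solution
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):
            result = backtracking_search(variables, domains, consistent, assignment)
            if result is not None:
                return result
        del assignment[var]  # dead end below this choice: backtrack
    return None
```

For example, `backtracking_search(variables, domains, consistent)` returns a complete colouring of the territories.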
29
Local Search Paradigm for CSPs
Initialise an assignment of all variables randomly; address resulting constraint violations by adjusting one variable at a time.
30
What is Machine Learning?
The field of building models (simplified mathematical representations of a phenomenon) from data; learning is the process of training a chosen model type using data.
31
Types of machine learning based on feedback?
Supervised, Unsupervised, and Reinforcement learning.
32
What is Supervised Learning?
Learning based on observation (X) – label (Y) pairs. The output Y is known for the training data.
33
Classification vs. Regression
Classification: output Y takes a finite set of values (class labels). Regression: output Y can be any number on a continuous spectrum.
34
Generative vs. Discriminative Models
Generative models explicitly model the distribution of classes and features (e.g., Naive Bayes). Discriminative models model the decision boundary directly, without modelling the distributions explicitly.
35
Parametric vs. Non-parametric Models
Parametric: model represented by a fixed number of parameters. Non-parametric: model represented by training samples (e.g., K-Nearest-Neighbour).
36
What is Overfitting?
When a model has too much expressive ability (parameters) and models the training data too closely, including noise, causing it to not generalise well on unseen data.
37
What is Generalisation?
The ability of the model to perform well on unseen data (test data) after training.
38
What is the Bias-variance trade-off?
The tension between models that are complex enough to describe the training data well (low bias) and models that are simple enough to be robust to fluctuations in the data (low variance).
39
What is the Naïve Bayes Model?
A probabilistic model for classification. Properties: classification, generative, parametric.
40
Key assumption of Naïve Bayes
Conditional independence of the features (X1, ..., Xn) given the class Y.
41
Training Data Limitation in Naïve Bayes?
If training data lacks a sample of a certain class with a specific feature value, the conditional probability can be zero, affecting inference.
42
What is Laplace smoothing?
A technique to address Naïve Bayes training data limitations by adding a constant (e.g., 1) to the counts of feature values to avoid zero probabilities.
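A minimal sketch tying cards 39-42 together: a categorical Naïve Bayes trainer with Laplace (add-one) smoothing. The data format (feature tuples paired with a label) is an assumption for illustration:

```python
from collections import Counter

def train_naive_bayes(samples, alpha=1):
    """samples: list of (features, label). Returns a predict(features) function."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = Counter()   # (position, value, label) -> count
    feature_values = {}          # position -> set of values seen in training
    for features, label in samples:
        for i, value in enumerate(features):
            feature_counts[(i, value, label)] += 1
            feature_values.setdefault(i, set()).add(value)

    def p_feature(i, value, label):
        # Laplace smoothing: add alpha to every count so nothing is zero.
        return ((feature_counts[(i, value, label)] + alpha) /
                (class_counts[label] + alpha * len(feature_values[i])))

    def predict(features):
        """argmax_y P(y) * prod_i P(x_i|y), the naive independence assumption."""
        total = sum(class_counts.values())
        def score(label):
            p = class_counts[label] / total
            for i, value in enumerate(features):
                p *= p_feature(i, value, label)
            return p
        return max(class_counts, key=score)

    return predict
```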
43
What is a Decision Tree?
A representation of a function that solves Boolean classification problems. Can be learnt using a greedy (best-first) approach with the information-gain heuristic.
44
How are Decision Trees learnt?
Using a greedy divide-and-conquer strategy, choosing the most discriminative attribute first to maximise separation of samples.
45
Decision Tree limitations?
Can be prone to overfitting (techniques like pruning are used to mitigate).
46
What is Entropy (Information Theory)?
A measure of the uncertainty (in bits) of a variable: H(X) = -Σ_k P(x_k) log2 P(x_k).
47
What is Information Gain?
The reduction in entropy after splitting on an attribute. We want the attribute that reduces entropy the most.
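A short sketch of both quantities, assuming samples are (attributes-dict, label) pairs:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(X) = -sum_k P(x_k) * log2(P(x_k)), measured in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(samples, attribute):
    """Entropy before splitting minus the weighted entropy of each subset."""
    labels = [label for _, label in samples]
    subsets = {}
    for attributes, label in samples:
        subsets.setdefault(attributes[attribute], []).append(label)
    remainder = sum(len(s) / len(samples) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder
```

Decision-tree learning (card 44) greedily splits on the attribute with the highest information gain.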
48
What is the Perceptron?
The simple unit of neural networks. It is a binary (two-class) linear classifier.
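A minimal perceptron training sketch (labels assumed to be -1/+1, one common convention):

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """samples: list of (x, y) with x a feature tuple and y in {-1, +1}.
    Returns the weights w and bias b of the linear boundary w.x + b = 0."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified: nudge the boundary toward x
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```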
49
What do Neural Networks approximate?
They are universal approximators, able to approximate arbitrary continuous functions.
50
Adversarial Search characteristics?
Two competing players take turns; discrete states/moves; clear, quantifiable outcomes; the zero-sum property; uses a terminal test and utilities of game-over states.
51
What is a Utility Function (in games)?
A function that assigns values to final states of a game.
52
What is Minimax?
An algorithm for two-player, zero-sum games, assuming one player minimises utility and the other maximises it. Finds an optimal move at every position.
53
What is Alpha-beta pruning?
An optimisation for Minimax that explores the game tree to give upper and lower bounds to the optimal score possible, allowing branches that are not worth exploring to be 'pruned'.
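A minimal sketch of minimax with alpha-beta pruning; the `game` object with terminal, utility, and moves methods is an assumed interface:

```python
def alphabeta(state, game, alpha=float("-inf"), beta=float("inf"), maximising=True):
    """Minimax value of `state`, pruning branches outside the (alpha, beta) window."""
    if game.terminal(state):
        return game.utility(state)
    if maximising:
        value = float("-inf")
        for child in game.moves(state):
            value = max(value, alphabeta(child, game, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: MIN would never allow this branch
        return value
    value = float("inf")
    for child in game.moves(state):
        value = min(value, alphabeta(child, game, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break  # alpha cutoff: MAX would never choose this branch
    return value
```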
54
Why use Evaluation Functions in games?
To give values to states without calculating the full minimax tree, used when minimax is not feasible.
55
What is Propositional Logic?
Builds up sentences from atomic sentences using logical connectives.
56
What is an Atomic Sentence?
Logically irreducible statements which can be either true or false.
57
Logical Connectives in Propositional Logic
Unary: ¬ (not); Binary: ∧ (and), ∨ (or), → (implies), ↔ (if and only if).
58
What is a Model (in Logic)?
A full assignment of truth values to atomic sentences.
59
Define Entailment (α ⊨ β)
In every model in which Ξ± is true, Ξ² is also true.
60
What is a Knowledge Base (KB)?
A set of sentences that summarises the information known about the world.
61
What is Model Checking?
An algorithm to determine if KB ⊨ α by checking all possible worlds where KB is true; if α is true in all of them, then KB ⊨ α.
62
Model Checking properties
It is both sound and complete, but its complexity is exponential (2^n models for n atomic sentences).
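A minimal model-checking sketch: KB and α are represented as Python predicates over a model (a dict from symbol names to truth values), an encoding assumed here for brevity:

```python
from itertools import product

def entails(kb, alpha, symbols):
    """True iff KB |= alpha: alpha holds in every model where KB holds."""
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not alpha(model):
            return False  # found a model where KB is true but alpha is not
    return True

# Example: (P and Q) |= P over the symbols P, Q.
assert entails(lambda m: m["P"] and m["Q"], lambda m: m["P"], ["P", "Q"])
```

The loop makes the exponential cost visible: 2^n models for n symbols, as the previous card notes.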
63
What is a Proof System?
A faster way than model checking using a set of inference rules to derive true sentences from a knowledge base.
64
Define algorithm soundness
An algorithm i is sound if deriving α from KB (KB ⊢_i α) always means that KB ⊨ α.
65
Define algorithm completeness
An algorithm i is complete if KB ⊨ α always means that KB ⊢_i α.
66
What is Resolution (in Logic)?
A common proof system using a single inference rule. To prove α from KB, show no model exists where KB and ¬α both hold.
67
Resolution properties (with CNF)
Sound and complete if all sentences are converted to Conjunctive Normal Form (CNF).
68
Define a Sample Space (Probability)
The set of all possible outcomes for an experiment.
69
Define an Event (Probability)
A subset of the sample space that we are interested in.
70
Define Conditional Probability
The probability of one event happening given that another event has happened. P(A|B) = P(A ∩ B) / P(B).
71
What is Bayes Theorem?
P(Y|X) = P(X|Y)P(Y) / P(X). Used to infer a cause (Y) from an effect (X).
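A worked example with assumed numbers: suppose a disease Y affects 1% of people (P(Y) = 0.01), a test X detects it 90% of the time (P(X|Y) = 0.9), and it false-positives 5% of the time (P(X|¬Y) = 0.05). Then P(X) = 0.9 × 0.01 + 0.05 × 0.99 = 0.0585, so P(Y|X) = (0.9 × 0.01) / 0.0585 ≈ 0.154: even after a positive test, the cause remains unlikely.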
72
What is a Bayesian Network?
A graphical model that concisely displays relations of dependence, independence, and conditional independence between variables.
73
What is the Markov Property?
A property of probabilistic systems indicating they are 'memoryless'; the transition probability to the next state depends only on the current state and action, not the history.
74
What is a Markov Decision Process (MDP)?
A model defined using the Markov Property, consisting of a set of states, actions from each state, transition probabilities, and a reward for each transition.
75
How are standard Search Problems related to MDPs?
Standard search problems can be thought of as a special case of MDPs where the transition probability to one specific next state is 1 for a given state and action.
76
What is Discounted Return in MDPs?
A way to evaluate sequences of states and actions to maximise long-term reward, using a discount rate (gamma).
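Concretely, for a reward sequence r_1, r_2, r_3, … the discounted return is G = r_1 + γ·r_2 + γ²·r_3 + … = Σ_t γ^(t-1)·r_t with 0 ≤ γ < 1 (indexing conventions vary; some texts start at t = 0). For example, with γ = 0.9 the rewards (5, 5, 5) give G = 5 + 4.5 + 4.05 = 13.55.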
77
What is Expected Value (Expectation)?
The probability-weighted average of the outcomes of a random variable or probabilistic experiment.
78
Value Iteration Algorithm (Bellman)
An algorithm to calculate the optimal policy in an MDP by iteratively updating the value of each state based on the maximum expected value over all possible actions from that state.
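A minimal value-iteration sketch; the MDP interface (a `transitions(s, a)` function yielding (probability, next_state, reward) triples) is an assumption for illustration:

```python
def value_iteration(states, actions, transitions, gamma=0.9, eps=1e-6):
    """Repeat Bellman updates until values converge; return V and a greedy policy."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q_values = [sum(p * (r + gamma * V[s2])
                            for p, s2, r in transitions(s, a))
                        for a in actions(s)]
            new_v = max(q_values) if q_values else 0.0  # terminal: no actions
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < eps:
            break
    policy = {s: max(actions(s),
                     key=lambda a: sum(p * (r + gamma * V[s2])
                                       for p, s2, r in transitions(s, a)))
              for s in states if actions(s)}
    return V, policy
```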