ExamQuestions Flashcards
What is the difference between Q-learning and SARSA?
Both are temporal-difference methods used in reinforcement learning. Q-learning is off-policy: it learns the value of the greedy policy regardless of which actions the agent actually takes. SARSA is on-policy: it learns the value of the policy the agent is actually following, including its exploratory actions.
Use SARSA: When safety during exploration matters (e.g., robot navigation near obstacles).
Use Q-learning: When learning optimal policy is more important than short-term realism (e.g., games, simulations).
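As a minimal sketch (assuming a tabular Q stored as a Python dict keyed by (state, action); all variable names and defaults here are illustrative), the only difference between the two is the bootstrap target:

```python
# Tabular TD updates for Q-learning vs. SARSA (illustrative sketch).
# Q maps (state, action) -> value; alpha is the learning rate,
# gamma the discount factor.

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    # Off-policy: bootstrap on the best action in s_next,
    # regardless of what the agent will actually do next.
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # On-policy: bootstrap on the action a_next the agent
    # actually takes in s_next (possibly exploratory).
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
```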
Why is reproducibility important in AI?
To verify and trust scientific results and models.
What is the MEU principle?
Maximum Expected Utility – choose the action with highest expected utility.
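In symbols, using the standard formulation: weight the utility of each outcome s′ by its probability under action a, and pick the action maximizing the sum:

```latex
a^* = \arg\max_a \mathrm{EU}(a) = \arg\max_a \sum_{s'} P(s' \mid a)\, U(s')
```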
What is the role of the learning rate α in RL?
It determines how much new information overrides old estimates.
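Concretely, in the standard temporal-difference form the estimate moves a fraction α of the way toward the new target; α = 0 ignores new information entirely, while α = 1 overwrites the old estimate:

```latex
\text{estimate} \leftarrow \text{estimate} + \alpha\,(\text{target} - \text{estimate})
```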
What is a transition matrix?
It defines the probabilities of moving from one state to another.
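For example, a two-state weather model (rows = current state, columns = next state; each row sums to 1; the numbers are made up for illustration):

```latex
T = \begin{pmatrix} P(\text{sun} \to \text{sun}) & P(\text{sun} \to \text{rain}) \\ P(\text{rain} \to \text{sun}) & P(\text{rain} \to \text{rain}) \end{pmatrix} = \begin{pmatrix} 0.8 & 0.2 \\ 0.4 & 0.6 \end{pmatrix}
```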
What is the importance of AI transparency?
To ensure accountability and trust in AI systems.
What is batch normalization?
A technique to normalize layer inputs for faster and more stable training.
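The standard formulation: normalize each activation using the mini-batch mean μ and variance σ², then rescale with learned parameters γ and β (ε is a small constant for numerical stability):

```latex
\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \varepsilon}}, \qquad y = \gamma \hat{x} + \beta
```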
Why is the decision node not allowed to influence chance nodes?
Because that would imply the agent directly controls a random event, which violates the principle of modeling uncertainty.
In a decision network, a decision node is not allowed to influence a chance node because the chance node represents an event outside the agent’s control. The probability that it rains tomorrow shouldn’t change based on the robot’s choice.
What is the difference between hidden and observed variables in HMMs?
Hidden variables are the internal states of the system that you cannot observe directly; observed variables are the outputs those states produce. A Hidden Markov Model (HMM) is a statistical model used to describe systems that evolve over time with hidden internal states that produce observable outputs. Its components:
States (S) – The internal, unobservable states of the system (e.g., “sunny” or “rainy” when you are inside).
Observations (O) – What you can see or measure (e.g., someone carrying an umbrella).
Transition Probabilities (P(Sₜ | Sₜ₋₁)) – Probability of moving from one state to another.
Emission Probabilities (P(Oₜ | Sₜ)) – Probability of an observation given a state.
Initial State Distribution (P(S₀)) – Probability of starting in each state.
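A minimal sketch of these five ingredients in Python for the umbrella example (all probabilities are made-up illustrative values):

```python
# Hidden Markov Model for the classic umbrella/weather example.

states = ["sunny", "rainy"]                 # hidden states S
observables = ["umbrella", "no_umbrella"]   # possible observations O

initial = {"sunny": 0.5, "rainy": 0.5}      # P(S0)

transition = {                              # P(S_t | S_{t-1})
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

emission = {                                # P(O_t | S_t)
    "sunny": {"umbrella": 0.1, "no_umbrella": 0.9},
    "rainy": {"umbrella": 0.8, "no_umbrella": 0.2},
}
```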
What is backpropagation?
Backpropagation is the algorithm used to train neural networks.
It applies the chain rule to compute the gradient of the loss function w.r.t. each weight, which tells us how to adjust the weights to reduce the error between predicted and actual output.
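A minimal sketch for a single sigmoid neuron with squared-error loss; the input, target, and initial weights are made-up values:

```python
import math

# One sigmoid neuron: y_hat = sigmoid(w * x + b), loss = (y_hat - y)^2.
# Backpropagation = the chain rule applied from the loss back to w and b.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, y = 2.0, 1.0          # input and target (illustrative)
w, b = 0.5, 0.0          # initial weight and bias
lr = 0.1                 # learning rate

for step in range(100):
    y_hat = sigmoid(w * x + b)
    # Chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    dL_dyhat = 2.0 * (y_hat - y)
    dyhat_dz = y_hat * (1.0 - y_hat)
    dL_dw = dL_dyhat * dyhat_dz * x
    dL_db = dL_dyhat * dyhat_dz
    w -= lr * dL_dw       # gradient descent step
    b -= lr * dL_db
```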
Why is it important that a Bayesian network is a DAG?
Because it avoids cycles, which would make probability calculations inconsistent.
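The DAG structure is what makes the standard factorization well-defined: each variable is conditioned only on its parents, and acyclicity guarantees the product is a valid joint distribution:

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Parents}(X_i)\big)
```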
What are convolutional filters?
Small weight matrices applied across an input to detect local features. A convolutional filter is a small matrix of weights (like 3×3 or 5×5) that is slid over an image or feature map to detect patterns such as edges, textures, or other local features.
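A minimal sketch of applying one filter with NumPy (valid convolution, stride 1; the edge-detector weights are illustrative):

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` (valid convolution, stride 1)."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise multiply the patch by the filter and sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A simple 3x3 vertical-edge detector (illustrative weights).
edge_filter = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])
```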
What is a heuristic?
A heuristic is a strategy or rule-of-thumb that helps an algorithm make decisions faster by estimating how close a state is to the goal. In AI search, a heuristic is a function:
h(n) = estimated cost from node n to a goal
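For example, on a 4-connected grid the Manhattan distance is a common admissible heuristic; a minimal sketch, assuming positions are (x, y) tuples:

```python
def manhattan_heuristic(node, goal):
    # Estimated cost from `node` to a goal on a grid where moves
    # are up/down/left/right; it never overestimates the true cost.
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])
```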
What is tokenization?
Splitting text into words or sub-word units.
What is stemming?
Reducing words to their root forms. The goal is to group different forms of a word so they can be treated as the same during tasks like text classification or search, e.g., “playing” → “play”.
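For instance, with NLTK’s Porter stemmer (assuming the nltk package is installed):

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
print(stemmer.stem("playing"))  # play
print(stemmer.stem("played"))   # play
print(stemmer.stem("plays"))    # play
```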
How does observing a collider activate a path?
A collider is a node that has two incoming arrows. A → C ← B
Without observing C (or any of its descendants): A and B are independent.
Observing C activates the path: A and B become dependent given C.
Suppose:
A = “burglary”
B = “earthquake”
C = “alarm goes off”
If you know the alarm went off (C), learning there was a burglary (A) makes it less likely there was also an earthquake (B): they now compete to explain the same event. This is known as “explaining away”.
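A minimal numeric sketch (all probabilities invented for illustration) that enumerates the joint distribution to show P(burglary | alarm) dropping once the earthquake is also observed:

```python
from itertools import product

P_A = 0.01   # P(burglary)         -- made-up prior
P_B = 0.02   # P(earthquake)       -- made-up prior
P_C = {      # P(alarm | burglary, earthquake) -- made-up table
    (True, True): 0.95, (True, False): 0.90,
    (False, True): 0.80, (False, False): 0.01,
}

def joint(a, b, c):
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pc = P_C[(a, b)] if c else 1 - P_C[(a, b)]
    return pa * pb * pc

# P(burglary | alarm): sum out the earthquake.
num = sum(joint(True, b, True) for b in (True, False))
den = sum(joint(a, b, True) for a, b in product((True, False), repeat=2))
print("P(burglary | alarm)             =", num / den)

# P(burglary | alarm, earthquake): the earthquake "explains away" A.
num2 = joint(True, True, True)
den2 = joint(True, True, True) + joint(False, True, True)
print("P(burglary | alarm, earthquake) =", num2 / den2)
```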
What is algorithmic bias?
Systematic errors in decision making due to biased training data.
Algorithmic bias occurs when an algorithm systematically produces unfair, prejudiced, or discriminatory outcomes — usually because it has learned patterns from biased training data or is influenced by biased assumptions in its design.
What does it mean if an MDP has a stationary policy?
MDP = Markov Decision Process. A stationary policy is a decision strategy where the action the agent chooses in each state does not change over time: the best action depends only on the current state, not on the time step.
What is L2 regularization?
A penalty on the squared values of the weights to reduce overfitting.
L2 regularization adds a penalty term to the loss function to discourage the model from learning large weights. It doesn’t change the goal of minimizing error; it just adds a “cost” to making the model too complex.
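The standard form, with λ controlling how strongly large weights are penalized:

```latex
L_{\text{total}} = L_{\text{data}} + \lambda \sum_i w_i^2
```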
What are the three main problems solved by HMMs?
Evaluation: Compute the probability of an observed sequence, using Forward algorithm
Decoding: Find the most likely sequence of hidden states, using Viterbi algorithm
Learning: Adjust model parameters to best explain observed data, using Baum-Welch algorithm (an EM algorithm)
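As an illustration of the evaluation problem, a minimal forward-algorithm sketch in Python, using the same made-up umbrella model as above (repeated here so the snippet is self-contained):

```python
# Forward algorithm: P(observation sequence) for a toy 2-state HMM.
# All probabilities are made-up illustrative values.

states = ["sunny", "rainy"]
initial = {"sunny": 0.5, "rainy": 0.5}
transition = {"sunny": {"sunny": 0.8, "rainy": 0.2},
              "rainy": {"sunny": 0.4, "rainy": 0.6}}
emission = {"sunny": {"umbrella": 0.1, "no_umbrella": 0.9},
            "rainy": {"umbrella": 0.8, "no_umbrella": 0.2}}

def forward(obs):
    # alpha[s] = P(o_1..o_t, S_t = s)
    alpha = {s: initial[s] * emission[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emission[s][o] * sum(alpha[p] * transition[p][s]
                                         for p in states)
                 for s in states}
    return sum(alpha.values())  # P(o_1..o_T)

print(forward(["umbrella", "umbrella", "no_umbrella"]))
```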
What is an expert system?
An expert system is a type of AI program designed to replicate the decision-making ability of a human expert in a specific domain. It uses rules, facts, and a reasoning engine to draw conclusions and solve problems.
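A minimal sketch of that rule-based core: facts, if-then rules, and a forward-chaining inference loop (the rules and facts are invented for illustration):

```python
# Tiny forward-chaining inference engine (illustrative).
# Each rule: (set of required facts, fact to conclude).
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "high_risk"}, "refer_to_doctor"),
]

facts = {"fever", "cough", "high_risk"}

# Keep firing rules until no new conclusions can be drawn.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # includes "flu_suspected" and "refer_to_doctor"
```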
What is an episode in RL?
An episode is a complete sequence of interactions between an RL agent and the environment, starting from an initial state and ending in a terminal state (or after a set number of steps).
It’s like one full run of the agent trying to achieve its goal — e.g., finishing a game, navigating a maze, or completing a task.
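A typical episode loop, sketched with the Gymnasium API (assuming the gymnasium package and its CartPole-v1 environment are available; the random policy is a placeholder):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset()          # start of the episode: initial state

done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()   # placeholder policy: random
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated       # terminal state or step limit

print("episode return:", total_reward)
```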
What is the backward algorithm used for?
The backward algorithm is one of the core algorithms used in Hidden Markov Models (HMMs). It computes the probability of the ending portion of the observation sequence given a current state: if the system is in state s at time t, what is the probability of seeing the observations from time t+1 to the end? Formally, βₜ(s) = P(Oₜ₊₁, …, O_T | Sₜ = s). Together with the forward algorithm, it is used for smoothing and in Baum-Welch learning.
What is a recurrent neural network (RNN)?
A Recurrent Neural Network (RNN) is a type of neural network designed to work with sequential data, such as time series, text, or speech. Unlike standard neural networks, RNNs have memory — they can retain information from previous inputs to help influence future predictions.
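That memory is the hidden state; in a vanilla RNN it is updated at every time step from the current input and the previous hidden state (the standard formulation):

```latex
h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b), \qquad y_t = W_{hy} h_t
```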