12. Explainable AI - AI II Flashcards
(52 cards)
Can a Multi-Layer Perceptron (MLP) with one hidden layer solve the following logic functions?
a) AND
b) OR
c) XOR
Answer: a, b, and c
An MLP with one hidden layer can compute all of these Boolean functions. AND and OR are linearly separable, so even a single-layer perceptron handles them; XOR is not linearly separable, and the hidden layer is what makes it solvable.
How many weights (excluding biases) does a fully connected feedforward neural network have with:
4 input nodes
4 hidden nodes
2 output nodes
Options:
a) 32
b) 9
c) 18
d) 24
Answer: d) 24
Formula: (4×4) + (4×2) = 24
The first term (4×4) represents connections between input and hidden layer.
The second term (4×2) represents connections between hidden and output layer.
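As a quick sanity check, the same count can be computed by multiplying consecutive layer sizes; a minimal Python sketch (layer sizes taken from the card, biases ignored):

```python
# Count weights (excluding biases) in a fully connected feedforward network
# by multiplying the sizes of consecutive layers.
layer_sizes = [4, 4, 2]  # input, hidden, output nodes from the card

weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
print(weights)  # 4*4 + 4*2 = 24
```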
Why do we need an MLP (Multi-Layer Perceptron) to solve the XOR problem?
A single-layer perceptron (SLP) cannot solve XOR because XOR is not linearly separable.
An MLP with a hidden layer can learn non-linear decision boundaries using activation functions and hidden neurons.
The hidden layer transforms input space into a linearly separable feature space.
Why does an MLP need a hidden layer for solving the XOR function?
Complete the missing values in this truth table:
x1 = 0, x2 = 0 → XOR = 0
x1 = 0, x2 = 1 → XOR = 1
x1 = 1, x2 = 0 → XOR = 1
x1 = 1, x2 = 1 → XOR = 0
Why is an MLP needed for XOR?
A Single-Layer Perceptron (SLP) can only separate inputs linearly.
XOR is not linearly separable, meaning you cannot draw a straight line to separate the outputs.
An MLP adds a hidden layer, which allows it to transform the data into a space where XOR becomes linearly separable.
How does the hidden layer work?
The hidden layer has two neurons (h1 and h2) that act as intermediate steps.
The network learns weights to transform the XOR problem into a solvable format.
Step-by-Step: How It Works
Inputs (x1, x2) go into the hidden layer (h1, h2).
The hidden layer learns patterns:
If exactly one input is 1 → only one hidden neuron activates.
If both inputs are the same (0,0 or 1,1) → both hidden neurons activate, or neither does.
The output layer takes h1 & h2 and calculates y:
If only one hidden neuron is active, the output is 1 (which matches XOR).
If both or none are active, the output is 0.
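A minimal NumPy sketch of this idea, using a hard-limiter activation and one hand-chosen set of weights (an OR-like and an AND-like hidden neuron; the weights a trained network learns may differ):

```python
import numpy as np

def step(x):
    """Hard-limiter activation: 1 if the input is positive, else 0."""
    return (x > 0).astype(int)

# One hand-chosen weight set: h1 acts like OR, h2 like AND,
# and the output fires only when h1 is active and h2 is not (i.e. XOR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

h1 = step(X[:, 0] + X[:, 1] - 0.5)   # OR-like hidden neuron
h2 = step(X[:, 0] + X[:, 1] - 1.5)   # AND-like hidden neuron
y = step(h1 - h2 - 0.5)              # active h1 without h2 -> 1

print(y)  # [0 1 1 0], matching XOR
```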
For a Convolutional Neural Network (CNN) consisting of 5 feature maps with a filter size of 3×3, how many weights are there in the first layer?
Each feature map has its own 3×3 filter, i.e. 9 weights (assuming a single input channel and no biases). Since there are 5 feature maps, the total number of weights is:
5 × 9 = 45
Correct answer: b) 45
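The count can be checked against a real convolution layer; a minimal sketch assuming PyTorch is available, a single grayscale input channel, and no bias terms:

```python
import torch.nn as nn

# One conv layer: 5 feature maps (out_channels), 3x3 filters,
# a single grayscale input channel, and no bias terms.
conv = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=3, bias=False)

print(conv.weight.numel())  # 5 * 1 * 3 * 3 = 45 weights
```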
How many neurons are in a feature map in the first layer of a Convolutional Neural Network (CNN) when using a 3×3 filter on a 5×5 image, with stride = 1?
Step-by-Step Explanation:
Understanding the filter movement:
A filter (or kernel) is a small window (3×3 in this case) that moves over the input image.
The filter starts at the top-left corner and slides across the image.
Stride = 1 means the filter moves one pixel at a time in both directions.
How to calculate output size (feature map size):
The formula to calculate the output feature map size is:
Output size = (Image size − Filter size) / Stride + 1
Applying it to our case: (5 − 3) / 1 + 1 = 3
This means the output feature map is 3×3.
Total neurons in the feature map:
Since each position in the 3×3 feature map corresponds to a neuron, the total number of neurons is:
3×3=9
Correct answer: a) 9 neurons
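A minimal sketch of this formula; the helper name feature_map_size is chosen here just for illustration:

```python
def feature_map_size(image_size, filter_size, stride=1):
    """Spatial size of a convolutional feature map (no padding)."""
    return (image_size - filter_size) // stride + 1

side = feature_map_size(image_size=5, filter_size=3, stride=1)
print(side, side * side)  # 3 and 9 neurons per feature map
```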
How many neurons per layer and in total does a CNN have if given the following specifications?
Input image size: 7×7
5 filters of size 2×2 (i.e. 5 feature maps)
Feature map stride: 1
Pooling dimension: 2×2
Pooling stride: 2
Classes: 3
Feature map calculation:
Filter size 2×2, stride 1, input 7×7
Output feature map size: (7 − 2) / 1 + 1 = 6, so each feature map is 6×6
Each feature map has 6×6 = 36 neurons
5 feature maps, so total neurons in this layer: 5 × 36 = 180
Pooling layer calculation:
Pooling reduces 6×6 → 3×3
Total pooling layer neurons: 5 × 9 = 45
Fully connected layer:
One neuron per class → 3 neurons
Total neurons:
180 (conv) + 45 (pooling) + 3 (fc) = 228 neurons
Final Answer: 228 neurons
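The whole count can be reproduced with the same output-size formula; a minimal sketch following the card's numbers (no padding assumed):

```python
def out_size(size, window, stride):
    """Spatial output size of a conv or pooling window (no padding)."""
    return (size - window) // stride + 1

n_maps, n_classes = 5, 3

conv_side = out_size(7, 2, 1)          # 6
pool_side = out_size(conv_side, 2, 2)  # 3

conv_neurons = n_maps * conv_side ** 2   # 5 * 36 = 180
pool_neurons = n_maps * pool_side ** 2   # 5 * 9  = 45

print(conv_neurons + pool_neurons + n_classes)  # 228
```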
Which of the following activation functions can only produce outputs in the range 0 to 1?
a) Linear activation
b) ReLU activation
c) Sigmoid function
d) Softmax function
Correct answers: c) Sigmoid function & d) Softmax function
Explanation:
The sigmoid function squeezes any input into a range between 0 and 1, making it useful for probability-based tasks.
The softmax function does something similar but for multiple outputs—ensuring that all output values sum to 1, which is perfect for classification problems.
Linear activation can produce values outside the 0-1 range.
ReLU activation outputs 0 for negative inputs and the same value for positive inputs, meaning it can go beyond 1.
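A minimal NumPy sketch comparing the output ranges (the input vector here is arbitrary):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

z = np.array([-5.0, 0.0, 2.0, 10.0])
print(sigmoid(z))                     # every value lies strictly between 0 and 1
print(softmax(z), softmax(z).sum())   # values in (0, 1) that sum to 1
print(np.maximum(z, 0))               # ReLU: unbounded above, e.g. 10.0 here
```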
How many weights (excluding biases) does a feedforward fully connected neural network with:
5 input nodes
3 hidden nodes
1 output node
No biases
have?
Correct answer: b) 18
Explanation:
Input to Hidden Layer:
Each of the 5 input nodes connects to 3 hidden nodes.
That means 5 × 3 = 15 weights.
Hidden to Output Layer:
Each of the 3 hidden nodes connects to 1 output node.
That means 3 × 1 = 3 weights.
Total weights:
15 (input to hidden) + 3 (hidden to output) = 18.
Which of the following are considered activation functions in neural networks?
a) Step function (Hard limiter)
b) Softmax function
c) Boolean logic function
d) Dropout
Correct answers: a) Step function & b) Softmax function
Explanation:
Step function (Hard limiter) is a basic activation function that outputs either 0 or 1 depending on whether the input passes a certain threshold.
Softmax function converts values into probabilities (between 0 and 1) so that they sum to 1, making it useful in classification problems.
Boolean logic functions (like AND, OR, XOR) are not activation functions; they are logic gates used in digital circuits.
Dropout is not an activation function; it’s a regularization technique that prevents overfitting by randomly ignoring some neurons during training.
Why are deep learning models difficult to trust?
Deep learning models act as black boxes, meaning we can’t see exactly how they make decisions.
They use millions of parameters that are hard to interpret.
Lack of transparency makes it difficult to understand, debug, or justify their predictions.
This is critical for high-stakes fields like healthcare and finance.
In what applications is XAI especially important?
Healthcare → Doctors need to trust AI diagnoses.
Finance → Loan approvals must be fair and transparent.
Legal System → AI predictions must be explainable in court.
Autonomous Vehicles → Understanding why a car made a decision is crucial for safety.
What are data explainers?
AI models rely on data, so understanding the dataset is crucial.
Poorly understood data can lead to bias and errors.
Example:
An imbalanced dataset can cause AI to favor one class over another.
Checking feature distributions (e.g., histograms) helps improve performance.
How do we explain AI models?
Deep learning is often a black box.
Model explainability techniques help visualize how AI makes decisions.
Example: Heatmaps show which parts of an image influence classification.
Helps make AI models interpretable and trustworthy.
What does Explainable AI (XAI) mean?
XAI aims to make AI model decisions understandable to humans.
Focuses on:
Comprehensibility → Can stakeholders understand it?
Interpretability → Does the model make logical sense?
Helps build trust, fairness, and transparency in AI systems.
How can dataset issues impact AI performance?
AI models perform poorly if trained on biased or unbalanced data.
Example: If certain classes are underrepresented, the model may struggle to predict them.
Solution: Analyze dataset distributions and apply balancing techniques.
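A minimal sketch of such a distribution check, using a hypothetical 90/10 label split for illustration:

```python
from collections import Counter

# Hypothetical label list; in practice this would be your training labels.
labels = ["cat"] * 90 + ["dog"] * 10

counts = Counter(labels)
total = sum(counts.values())
for cls, n in counts.items():
    print(f"{cls}: {n} samples ({n / total:.0%})")
# A 90/10 split like this suggests the minority class may need
# oversampling, class weights, or more data.
```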
What is the difference between intrinsic and post hoc model explainability?
Intrinsic Explainability: The model is designed to be both high-performing and explainable from the start (e.g., decision trees, interpretable deep learning).
Post Hoc Explainability: The model is first trained, and then explainability techniques (XAI) are applied to interpret its predictions.
Key idea: Intrinsic models are built for transparency, while post hoc techniques explain already-trained models.
What is the difference between global and local explainability in AI models?
Global Explainability: Describes how the entire model makes decisions across all inputs.
Example: Feature importance analysis across the whole dataset.
Local Explainability: Explains individual predictions for specific data points.
Example: Why did the model classify a specific image as a cat?
Key idea: Global gives an overview of model behavior, while local focuses on single decisions.
What are different ways AI models can explain their decisions?
Visual explanations - Highlight important areas in images (e.g., heatmaps).
Mathematical/Computational explanations - Use logic functions and probability for developers.
Language-based explanations - Use natural language to describe model reasoning for end-users.
Key idea: Different users need different types of explanations.
How do we visualize what a Convolutional Neural Network (CNN) has learned?
CNN filters detect patterns (edges, textures, objects).
Feature maps highlight important parts of an image.
Pooling layers reduce dimensions but keep key features.
Fully connected layers combine learned features for final classification.
Key idea: CNNs break images into patterns, layer by layer, to recognize objects.
How can we visualize what an Artificial Neural Network (ANN) is representing?
In the first layer, visualizing the values of the weights is straightforward.
Deeper layers require more advanced techniques, as simply showing weights between layers may not be meaningful.
One approach is to visualize the entire receptive field of neurons in later layers.
Techniques like DeepDream can generate images that show what a neuron is “looking for.”
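A minimal sketch of first-layer weight visualization, assuming a PyTorch CNN and matplotlib; the randomly initialized layer below stands in for the first convolution layer of a trained model:

```python
import matplotlib.pyplot as plt
import torch.nn as nn

# Stand-in first layer; in practice, take the first conv layer of a
# trained network (e.g. model.conv1) instead of this random one.
first_layer = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=5)

filters = first_layer.weight.detach().numpy()  # shape: (8, 1, 5, 5)

fig, axes = plt.subplots(1, 8, figsize=(12, 2))
for ax, f in zip(axes, filters):
    ax.imshow(f[0], cmap="gray")  # each 5x5 filter shown as a small image
    ax.axis("off")
plt.show()
```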
What is DeepDream, and how does it explain neural networks?
Post hoc method: Applied after training to understand what neurons have learned.
Global explainability: Shows representations for entire neurons/layers.
Model-specific: Mostly used for CNNs.
It generates images that highlight patterns a neuron is detecting in data.
Each square in DeepDream output represents what different feature maps in a deep layer are focusing on.
What is SHAP (Shapley Additive Explanations), and how does it work?
Post hoc method: Analyzes a trained model.
Local & Global: SHAP can explain individual predictions (local) and feature importance across datasets (global).
Model-agnostic: Works with any AI model, not dependent on architecture.
Shows how much each feature contributes to a decision using heatmaps (e.g., highlighting pixels in an image that influenced classification).
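A minimal usage sketch, assuming the shap package and scikit-learn are installed (exact API details may vary between shap versions):

```python
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train any model; KernelExplainer is model-agnostic and only needs
# a prediction function plus some background data.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.KernelExplainer(model.predict_proba, X[:50])
shap_values = explainer.shap_values(X[:5])  # local: per-feature contributions
                                            # for 5 individual predictions

shap.summary_plot(shap_values, X[:5])       # global view: overall feature importance
```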