Fully-connected Neural Networks Flashcards
(14 cards)
What is a feedforward (fully-connected) neural network?
A generalization of a single neuron: a sequence of layers where each node in layer L takes inputs from all nodes in layer L−1.
Why are non-linear activation functions necessary in neural networks?
Without non-linearity, stacked layers collapse into one linear transformation, so depth would add no representational power.
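A minimal sketch of this collapse in PyTorch (layer sizes and tensor names are illustrative): two stacked Linear layers reproduce a single affine map with combined weight W²W¹ and bias W²B¹ + B².

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    f1, f2 = nn.Linear(4, 3), nn.Linear(3, 2)
    x = torch.randn(5, 4)
    W = f2.weight @ f1.weight             # combined weight, shape (2, 4)
    B = f1.bias @ f2.weight.T + f2.bias   # combined bias, shape (2,)
    print(torch.allclose(f2(f1(x)), x @ W.T + B, atol=1e-6))  # True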
How is layer computation expressed in matrix form for two layers?
Z¹ = B¹ + W¹ · Z⁰; Z² = B² + W² · Z¹, showing each layer applies an affine transform to the previous outputs.
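A small numeric sketch of this computation, with dimensions chosen only for illustration:

    import torch

    torch.manual_seed(0)
    Z0 = torch.randn(4)                   # layer-0 values (the network inputs)
    W1, B1 = torch.randn(3, 4), torch.randn(3)
    W2, B2 = torch.randn(2, 3), torch.randn(2)
    Z1 = B1 + W1 @ Z0                     # affine transform of layer 0
    Z2 = B2 + W2 @ Z1                     # affine transform of layer 1
    print(Z1.shape, Z2.shape)             # torch.Size([3]) torch.Size([2])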
How do you create a linear layer in PyTorch and with what parameter initialization?
Use torch.nn.Linear(in_features, out_features); by default, weights and biases are initialized from a uniform distribution U(−√k, √k) with k = 1/in_features.
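A quick check of that default range (the bound √k follows from PyTorch's documented Kaiming-uniform initialization of Linear layers):

    import torch.nn as nn

    layer = nn.Linear(in_features=784, out_features=10)
    bound = (1 / 784) ** 0.5                  # √k with k = 1/in_features
    print(layer.weight.abs().max() <= bound)  # tensor(True)
    print(layer.bias.abs().max() <= bound)    # tensor(True)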
How do you flatten MNIST images of shape (1,28,28) for a batch of size 32?
Apply tensor.flatten(start_dim=1) to get a tensor of shape (32, 784) before feeding it into a linear layer.
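A sketch using random data as a stand-in for real MNIST images:

    import torch

    batch = torch.randn(32, 1, 28, 28)    # stand-in for a batch of MNIST images
    flat = batch.flatten(start_dim=1)     # merges the (1, 28, 28) dims into 784
    print(flat.shape)                     # torch.Size([32, 784])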
How is a single-layer network defined for 10-digit classification?
nn.Linear(in_features=784, out_features=10), as MNIST has 784 inputs and 10 output classes.
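A sketch, assuming the batch has already been flattened as in the previous card:

    import torch
    import torch.nn as nn

    layer = nn.Linear(in_features=784, out_features=10)
    x = torch.randn(32, 784)              # a flattened batch of 32 images
    print(layer(x).shape)                 # torch.Size([32, 10]), one logit per class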
What are the shapes of weight and bias parameters in a linear layer?
Weights: (out_features, in_features); biases: (out_features,).
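Continuing the MNIST-sized example from the previous card:

    import torch.nn as nn

    layer = nn.Linear(in_features=784, out_features=10)
    print(layer.weight.shape)             # torch.Size([10, 784])
    print(layer.bias.shape)               # torch.Size([10])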
Provide the formulas for the sigmoid and tanh activation functions.
σ(z) = 1/(1 + e^{−z}); tanh(z) = (e^z − e^{−z})/(e^z + e^{−z}).
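A sketch verifying both formulas against PyTorch's built-ins (the sample points are arbitrary):

    import torch

    z = torch.linspace(-3, 3, 5)
    manual_sigmoid = 1 / (1 + torch.exp(-z))
    manual_tanh = (torch.exp(z) - torch.exp(-z)) / (torch.exp(z) + torch.exp(-z))
    print(torch.allclose(manual_sigmoid, torch.sigmoid(z)))  # True
    print(torch.allclose(manual_tanh, torch.tanh(z)))        # True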
Why are sigmoid and tanh less used in hidden layers?
They saturate quickly (derivatives near zero), leading to vanishing gradients in deep networks.
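A sketch of the saturation effect (input values chosen for illustration):

    import torch

    z = torch.tensor([0.0, 5.0, 10.0], requires_grad=True)
    torch.sigmoid(z).sum().backward()
    print(z.grad)   # roughly [0.25, 0.0066, 0.000045]: near zero once sigmoid saturates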
What are the ReLU and Leaky ReLU activation formulas?
ReLU(x)=max(0,x); LReLU(x)=x if x≥0 else αx (α≈0.01).
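For example (PyTorch's LeakyReLU exposes α as negative_slope):

    import torch
    import torch.nn as nn

    x = torch.tensor([-2.0, -0.5, 0.0, 1.5])
    print(nn.ReLU()(x))                           # tensor([0.0000, 0.0000, 0.0000, 1.5000])
    print(nn.LeakyReLU(negative_slope=0.01)(x))   # negative inputs scaled by α = 0.01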
How do you define a custom PyTorch Module composed of submodules?
Subclass nn.Module, initialize submodule layers in __init__, and define the forward pass chaining them.
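A minimal sketch; the class name MLP and the layer sizes are illustrative:

    import torch
    import torch.nn as nn

    class MLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.hidden = nn.Linear(784, 128)
            self.act = nn.ReLU()
            self.out = nn.Linear(128, 10)

        def forward(self, x):
            return self.out(self.act(self.hidden(x)))

    model = MLP()
    print(model(torch.randn(32, 784)).shape)      # torch.Size([32, 10])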
What is the softmax function formula for multi-class classification?
P(i) = e^{ŷ_i}/Σ_{c=1}^C e^{ŷ_c}, converting logits into a probability distribution over the C classes.
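A sketch checking the formula against torch.softmax (the logits are arbitrary):

    import torch

    logits = torch.tensor([2.0, 1.0, 0.1])
    probs = torch.exp(logits) / torch.exp(logits).sum()
    print(torch.allclose(probs, torch.softmax(logits, dim=0)))  # True
    print(probs.sum())                                          # sums to 1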
What is the cross-entropy loss formula shown in the notebook?
L = -log(e^{ŷ_j}/Σ_{c=1}^C e^{ŷ_c}), where j is the true class index.
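A sketch checking the formula against F.cross_entropy (the logits and class index are arbitrary):

    import torch
    import torch.nn.functional as F

    logits = torch.tensor([[2.0, 1.0, 0.1]])   # one sample, three classes
    j = torch.tensor([0])                      # true class index
    manual = -torch.log(torch.exp(logits[0, 0]) / torch.exp(logits[0]).sum())
    print(torch.allclose(manual, F.cross_entropy(logits, j)))   # True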
How do you transfer model and data to a GPU in PyTorch?
Use tensor = tensor.to(device) or model.to(device) with device = torch.device('cuda') to move them to GPU memory; for plain tensors, .to() returns a new tensor rather than moving it in place.
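A sketch with a CPU fallback, a common defensive pattern when a GPU may not be available:

    import torch
    import torch.nn as nn

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = nn.Linear(784, 10).to(device)   # modules move their parameters in place
    x = torch.randn(32, 784).to(device)     # tensors: .to() returns a new tensor
    print(next(model.parameters()).device, x.device)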