Week 12: Geometric Deep Learning Flashcards

(4 cards)

1
Q

What is the key idea of geometric deep learning?

A

There seem to be specific geometric principles, or geometric priors, that should underlie all architectures if we care about building representations that transform stably.

2
Q

Describe the first geometric prior

A

The first geometric principle, or geometric prior, is symmetry:

Invariance: the output we read off stays the same after a transformation (e.g. scaling) is applied to the input

Key point: The final answer doesn’t change even though we transformed the input

Equivariance:
What happens: Cat image → scale down → the feature map also scales down in a predictable way
Key point: Both input AND output transform in the same predictable way

The CNN example (illustrated by the filter diagrams):

y = S·x, where S is the shift operator; shift equivariance means f(S·x) = S·f(x)
The black squares represent a pattern
Whether that pattern is on the left side or the right side, the filter detects it
The detection result moves with the pattern → equivariance (a quick numerical check is sketched below)
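
A minimal sketch of this shift equivariance (my own illustration, not from the lecture), checking that a periodic 1-D convolution f commutes with the shift operator S, i.e. f(S·x) = S·f(x):

import numpy as np

def shift(x, k):
    # circular shift operator S
    return np.roll(x, k)

def conv(x, w):
    # circular (periodic) convolution: our feature map f
    n = len(x)
    return np.array([sum(w[j] * x[(i - j) % n] for j in range(len(w)))
                     for i in range(n)])

x = np.array([0., 0., 1., 2., 1., 0., 0., 0.])  # a small "pattern" (arbitrary values)
w = np.array([1., -1., 0.5])                    # an arbitrary filter

lhs = conv(shift(x, 3), w)    # transform the input, then extract features
rhs = shift(conv(x, w), 3)    # extract features, then transform
print(np.allclose(lhs, rhs))  # True: the detection result moves with the pattern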

Transformer example (permutation invariance):
f(a,b,c) = f(a,c,b) = f(b,a,c) = …
This means the function gives the same result no matter how you reorder the inputs

Like our addition example: 1 + 2 + 3 = 3 + 1 + 2 = 2 + 3 + 1
The transformer (without positional encoding) treats all positions equally; a quick check of this is sketched below
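
A minimal sketch of this permutation invariance (my own illustration; the aggregation f and the values are arbitrary stand-ins for any symmetric function over a set of inputs):

import itertools

def f(*tokens):
    # order-independent aggregation: a stand-in for any symmetric function
    return sum(tokens)

values = (1, 2, 3)
# every ordering of the inputs gives the same result
print({f(*perm) for perm in itertools.permutations(values)})  # {6}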

Why This Matters

Invariance: “I don’t care WHERE the cat is, just tell me it’s a cat”
Equivariance: “If the cat moves, my internal representation should move with it in a predictable way”

3
Q

Describe the second geometric prior

A

Geometric stability:
We can still handle the downstream task even with slight distortions and noise

“Automorphism group”: the set of transformations that preserve the object’s structure

“Non-rigid & slight”: the deformations are flexible (not rigid like a rotation) and small
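
A minimal sketch of this stability idea (my own illustration, using a simple local-averaging feature map rather than anything from the lecture): a slight, non-rigid warp of the input changes the features only a little.

import numpy as np

def features(x, width=4):
    # coarse local averages: a simple, stability-friendly representation
    return x.reshape(-1, width).mean(axis=1)

t = np.linspace(0, 2 * np.pi, 64)
x = np.sin(t)                    # clean signal
tau = 0.05 * np.sin(3 * t)       # small, non-rigid deformation of the domain
x_warped = np.sin(t + tau)       # slightly distorted signal

print(np.linalg.norm(x_warped - x))                      # size of the raw distortion
print(np.linalg.norm(features(x_warped) - features(x)))  # feature change stays smaller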

4
Q

Describe the third geometric prior

A

Scale separation:
The idea that real-world phenomena have structure at multiple scales simultaneously (fine details + medium patterns + coarse structure), and ML models should exploit this by processing different scales appropriately. Example: In images, edges → textures → objects.
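
A minimal sketch of scale separation (my own illustration): repeatedly coarse-graining a signal by local pooling produces the fine → medium → coarse hierarchy that multi-scale models exploit.

import numpy as np

def coarsen(x):
    # halve the resolution by averaging neighbouring pairs (local pooling)
    return x.reshape(-1, 2).mean(axis=1)

x = np.random.default_rng(0).standard_normal(16)  # "fine detail" signal
pyramid = [x]
while len(pyramid[-1]) > 2:
    pyramid.append(coarsen(pyramid[-1]))          # build coarser and coarser views

for level, signal in enumerate(pyramid):
    print(f"scale {level}: {len(signal)} samples")  # 16 -> 8 -> 4 -> 2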
