Lecture 6 - Low Level Vision Flashcards
1
Q
Recap
A
- visual processing begins at the retina, where photoreceptors transduce light into neural signals; these are passed to ganglion cells, which transmit the info via the optic nerve to the optic chiasm
- at the optic chiasm, info from the temporal hemiretina stays on the same side, but info from the nasal hemiretina crosses over
- info passes through the LGN, which contains cells with receptive field properties similar to those of ganglion cells, then travels via the optic radiations to V1 (striate cortex)
- simple cells can be orientation specific for an edge/bar/angle, with off or on centres; their receptive fields emerge from inputs of adjacent LGN fields
- complex cells have no spatially fixed on/off fields and are more dynamic, responding to a broad visual stimulus presented to any part of the receptive field
- orientation preference in V1 appears to be displayed in orientation columns; ocular dominance columns run perpendicular to orientation columns; together they form one hypercolumn (a cortical processing module for a stimulus that falls within a particular retinal area)
2
Q
Feature detection
A
- model suggests we have individual cells responding to specific features
- the only info a cell passes on is its firing rate: more firing = the cell prefers this stimulus
- problem of univariance = a cell has only one way of responding to a stimulus, so the output of a single cell is ambiguous (e.g. a moderate response could mean a preferred stimulus at low contrast or a non-preferred stimulus at high contrast)
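A toy simulation makes the univariance problem concrete. This is only a sketch: the Gaussian tuning curve and its parameters (`preferred`, `width`, `max_rate`) are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def firing_rate(orientation_deg, contrast, preferred=90.0, width=20.0, max_rate=100.0):
    """Hypothetical V1 cell: Gaussian orientation tuning scaled by contrast (0-1)."""
    tuning = np.exp(-0.5 * ((orientation_deg - preferred) / width) ** 2)
    return max_rate * contrast * tuning

# A preferred-orientation grating at low contrast...
r1 = firing_rate(90.0, 0.4)
# ...and an off-orientation grating at high contrast produce the same rate,
# so the firing rate alone cannot tell the two stimuli apart.
r2 = firing_rate(90.0 + 20.0 * np.sqrt(2 * np.log(2)), 0.8)
```

Both calls return a rate of 40, which is the sense in which the output of a single cell is ambiguous.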
3
Q
fourier analysis
A
- alt framework suggesting these early single cells are part of independent channels that represent different aspects of the visual input
- each channel conveys the info contained in the image at a specific spatial scale and orientation
- the visual system deconstructs the image into discrete channels before recombining them to form a coherent representation
- Fourier analysis suggests any complex signal can be constructed from a set of simpler sinusoidal functions varying in frequency, amplitude and phase; by adding the right mixture of sinusoids (each with its own luminance amplitude and spatial frequency) you can recreate the spatial frequency content that represents an image
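The "sum of sinusoids" idea can be sketched numerically: summing a few odd harmonics of a sine wave already approximates a square-wave luminance grating with sharp light/dark edges. The fundamental frequency and number of harmonics below are illustrative choices.

```python
import numpy as np

x = np.linspace(0, 1, 1000)   # spatial position (arbitrary units)
fundamental = 4               # cycles per unit distance

# Partial Fourier series of a square wave: odd harmonics with 1/n amplitudes
partial = np.zeros_like(x)
for n in (1, 3, 5, 7, 9):
    partial += (4 / np.pi) * (1 / n) * np.sin(2 * np.pi * n * fundamental * x)

# Compare with the ideal square-wave grating: the partial sum already shows
# flat light/dark plateaus separated by sharp luminance transitions
ideal = np.sign(np.sin(2 * np.pi * fundamental * x))
mean_abs_error = np.mean(np.abs(partial - ideal))
```

Adding further harmonics shrinks the error, which is the sense in which a complex image can be rebuilt from simple sinusoidal components at different spatial scales.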
4
Q
spatial frequency channels
A
- contrast sensitivity = a measure of how well someone sees detail across different spatial frequencies
- sensitivity is maximal between 2-5 cycles per degree (cpd) - at the peak you need less contrast in an image to resolve the pattern, while high contrast is needed at either end of the range
- good measure of visual performance across spatial frequencies
- Blakemore and Campbell (1969) measured contrast thresholds (the lower the threshold, the easier the detection of an image). Adapting to a high-contrast grating for 60 s causes the neurons that respond to it to habituate and gradually become less sensitive. Measuring contrast sensitivity across spatial frequencies after this adaptation shows a massive change in performance = specific neurons in the visual system have habituated
- the higher the sensitivity, the lower the contrast threshold needed to detect a stimulus
- the selectivity of the adaptation effect implies spatial-frequency-specific channels
> threshold is increased (sensitivity decreased) for spatial frequencies similar to the adapting frequency
> following adaptation there is a dip in contrast sensitivity around the adapting spatial frequency - this selective adaptation effect implies the existence of multiple overlapping spatial frequency channels
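The logic of the adaptation experiment can be simulated with a toy channel model. This is only a sketch: the channel centres, log-Gaussian bandwidth and 60% adaptation loss below are assumptions, not Blakemore and Campbell's measured values.

```python
import numpy as np

freqs = np.logspace(np.log10(0.5), np.log10(30), 200)    # test frequencies (cpd)
centres = np.logspace(np.log10(0.5), np.log10(30), 12)   # channel peak frequencies
bandwidth = 0.25                                         # tuning width in log10 units

def channel_responses(gains):
    # each channel has a Gaussian tuning profile on a log-frequency axis
    logf = np.log10(freqs)[:, None]
    logc = np.log10(centres)[None, :]
    tuning = np.exp(-0.5 * ((logf - logc) / bandwidth) ** 2)
    return tuning * gains[None, :]

gains = np.ones(len(centres))
sensitivity_before = channel_responses(gains).max(axis=1)

# Adapting to a 5 cpd grating habituates channels tuned near 5 cpd
adapt_f = 5.0
loss = 0.6 * np.exp(-0.5 * ((np.log10(centres) - np.log10(adapt_f)) / bandwidth) ** 2)
sensitivity_after = channel_responses(gains * (1 - loss)).max(axis=1)

# Threshold elevation (= sensitivity loss) is largest near the adapted frequency
elevation = sensitivity_before / sensitivity_after
peak_freq = freqs[np.argmax(elevation)]
```

Because the channels overlap, the simulated dip in sensitivity is confined to frequencies near the adapting frequency, mirroring the selective adaptation effect described above.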
5
Q
De Valois et al. (1982)
A
- measured the electrophysiology of neurons in V1
- recorded contrast sensitivity functions of V1 cells in macaque monkeys
- found cells with different but slightly overlapping spatial frequency selectivity at the same retinal locations = bar-detecting neurons that appear to perform a Fourier analysis of the visual image
6
Q
computational vision
A
- Marr - 3 levels of analysis to understand how a system works
1. computational level - what is the goal of the visual system?
2. algorithmic level - what rules/representations achieve this?
3. implementational level - how is it achieved physically?
- vision serves multiple goals including recognising objects, determining where they are, what shape they have and how to interact with them. Each relies on numerous algorithmic steps - one such step is edge detection for determining object boundaries
- marr’s model of object perception:
1. grey level rep
2. primal sketch
3. 2.5D sketch
4. 3D model representation
7
Q
Marr and Hildreth model
A
- algorithm assumes edges of an object coincide with gradients in luminance
- to make the edge explicit, take the first derivative (an expression giving the slope of the tangent line to the curve at any instant), which tells us the rate of signal change: if the first derivative is positive the signal must be increasing, if negative it must be decreasing, and it shows peaks/valleys wherever there is a sharp intensity gradient
- the problem is deciding which peak/valley you are interested in - finding them is difficult, and how high/low must they be?
- taking the second derivative (the rate of change of the first derivative as you move from left to right) gives zero crossings: wherever there is a luminance gradient in the original signal, the second derivative crosses zero
- edge is represented by zero crossing in second derivative
- this process is negatively affected by high-frequency noise in the image = zero crossings may appear where there is no meaningful gradient, so a smoothing step is needed - blur the image first (removing high-frequency content), then take the second derivative; the blurring is expressed as a Gaussian operator G
- the width (sigma) of G determines the level of blurring: a higher sigma means more blurring, which suppresses spurious noise-driven crossings and gives more reliable edge detection
- these can be sequential steps, but Marr suggested all of them (luminance, smoothing, first derivative, second derivative) can be done in parallel with a Laplacian of Gaussian filter - when plotted it shows a similar on/off centre-surround shape
- this algorithm gives the first steps of edge detection and can be implemented at a biological level in the LGN
- so retinal and LGN receptive fields can be considered spatial filters that compute the second derivative of an image
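The whole pipeline (smooth with a Gaussian, take the second derivative, mark zero crossings) can be sketched in one dimension. The sigma, kernel sizes, noise level and slope threshold below are illustrative choices, not values from Marr and Hildreth.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(200)
luminance = np.where(x < 100, 50.0, 150.0)                # one step edge at x = 100
luminance = luminance + rng.normal(0, 2.0, size=x.shape)  # high-frequency noise

# Gaussian operator G: sigma controls the amount of blurring
sigma = 4.0
k = np.arange(-15, 16)
gauss = np.exp(-k**2 / (2 * sigma**2))
gauss /= gauss.sum()
smoothed = np.convolve(luminance, gauss, mode="same")

# Discrete second derivative of the smoothed signal
second_deriv = np.convolve(smoothed, [1.0, -2.0, 1.0], mode="same")

# Edges = zero crossings of the second derivative with a sizeable slope;
# shallow crossings caused by residual noise are rejected by the threshold
signs = np.sign(second_deriv)
crossings = np.where(np.diff(signs) != 0)[0]
valid = crossings[(crossings > 20) & (crossings < len(x) - 20)]  # skip borders
strong = valid[np.abs(np.diff(second_deriv))[valid] > 0.2]
```

Without the Gaussian smoothing step, the second derivative of the raw noisy signal produces zero crossings everywhere, which is exactly the failure mode the blurring is there to prevent.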
8
Q
rapid edge detection
A
- visual edges are important in low level visual processing
- Paradiso and Nakayama (1991) used a temporal masking paradigm - observers view a luminance target (white disc) briefly followed by an annular mask (a small white ring) and judge the target's luminance
- pps report a composite image: the outside region looks bright but the inside looks dark = the mask interferes with perception
- suggests we perceive edges first, then fill in the image, and the mask stops this filling-in process
- seems low-level visual perception is edge based
9
Q
beyond luminance edges
A
- first-order edges = edges defined by a luminance gradient
- second-order edges = complex edges defined by differences in visual texture (invisible to the Marr and Hildreth model)
10
Q
texture segmentation
A
- Julesz developed a model based on the statistics of local conspicuous image features - textons
- based on the idea that differences in the statistics of certain textons make something jump out at you e.g. lines, line terminations or junctions (T and X junctions)
> effortless segmentation = differences in conspicuous elements e.g. line crossings
> difficult segmentation = no difference in conspicuous elements
- Nothdurft (1985) noticed similarities between luminance segmentation (first order) and texture segmentation: changing element spacing produces the same results in both, so texture segmentation, like luminance segmentation, seems to depend on spatial scale
> the effects of critical spatial factors e.g. element size and spacing = similarities to luminance segmentation
> texture segmentation for luminance and structure differences might be achieved by evaluating a gradient
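The gradient idea can be sketched as a filter-then-gradient model: pool rectified band-pass filter output into a local "texture energy" signal, then look for a gradient in that energy, just as a luminance edge is a gradient in luminance. All the filter parameters and texture frequencies below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = np.arange(n)

# Two abutting textures with the same mean luminance but different scales
texture = np.where(x < n // 2,
                   np.sin(2 * np.pi * 0.25 * x),   # fine texture (left)
                   np.sin(2 * np.pi * 0.05 * x))   # coarse texture (right)
texture = texture + rng.normal(0, 0.1, n)

# First stage: band-pass filter tuned to the fine scale
k = np.arange(-10, 11)
bandpass = np.cos(2 * np.pi * 0.25 * k) * np.exp(-k**2 / (2 * 3.0**2))
response = np.convolve(texture, bandpass, mode="same")

# Rectify and pool: local "texture energy" at the fine scale
pool = np.exp(-np.arange(-30, 31)**2 / (2 * 10.0**2))
pool /= pool.sum()
energy = np.convolve(np.abs(response), pool, mode="same")

# Second stage: a luminance-style gradient on the energy signal
gradient = np.abs(np.diff(energy))
boundary = int(np.argmax(gradient[40:-40])) + 40   # ignore convolution borders
```

A plain luminance-gradient detector sees nothing here (both halves average to the same luminance), but the energy gradient peaks at the texture boundary, which is how a second-order edge can be reduced to a first-order-style gradient evaluation.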
11
Q
Bergen and Adelson (1988)
A
- demonstrated that texture segmentation can be enhanced/impaired by adjusting sizes of elements
- the foreground jumps out from the background due to differences in textons, but making the L shapes shorter means the info contained at different spatial scales becomes more important and makes the foreground's spatial frequency content more similar to the surround
- evidence against texton theory - suggests a simpler filtering process can account for segmentation; what matters most is the info contained at different spatial scales within the foreground relative to the background
- mapping onto neurophysiology?
- Lamme et al. used single-cell recording in V1 and found response enhancement at the texture figure and its edge = texture perception begins at the edge and fills in inwards
- the temporal sequence of neural responses revealed separate processes depending on whether background, edge or figure was stimulated:
> initial response 50ms
> local orientation tuning - simple cell response to orientation 58ms
> enhancement of edge 90ms
> enhancement in response in figure 112-123ms - consistent with edge based segmentation process that drives filling in of texture surface
- edge enhancement at 90ms suggests feedback from beyond V1