Lecture 6 - Low Level Vision Flashcards
1
Q
Recap
A
- visual processing begins at the retina, where photoreceptors transduce light into neural signals; these are passed to ganglion cells, which transmit the info via the optic nerve to the optic chiasm
- at the optic chiasm, info from the temporal hemiretina stays on the same side, but info from the nasal hemiretina crosses over
- info passes through the LGN, which contains cells with receptive field properties similar to those of ganglion cells, then travels via the optic radiations to V1 (striate cortex)
- simple cells can be orientation specific for an edge/bar/angle, with off or on centres; their receptive fields emerge from inputs of adjacent LGN fields
- complex cells have no spatially fixed on/off fields and are more dynamic, responding to a broad visual stimulus presented to any part of the receptive field
- orientation preference in V1 appears to be displayed in orientation columns; ocular dominance columns run perpendicular to orientation columns; together they form one hypercolumn (a cortical processing module for a stimulus that falls within a particular retinal area)
2
Q
Feature detection
A
- model suggests we have individual cells responding to specific features
- the only info a cell passes on is its firing rate: more firing = the cell prefers this stimulus
- problem of univariance = a cell has only one way of responding to a stimulus, so the output of a single cell is ambiguous (e.g. a moderate response could mean a preferred stimulus at low contrast or a non-preferred stimulus at high contrast)
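A toy simulation makes the univariance problem concrete. This is only a sketch: the Gaussian tuning curve and its parameters (`preferred`, `width`, `max_rate`) are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def firing_rate(orientation_deg, contrast, preferred=90.0, width=20.0, max_rate=100.0):
    """Hypothetical V1 cell: Gaussian orientation tuning scaled by contrast (0-1)."""
    tuning = np.exp(-0.5 * ((orientation_deg - preferred) / width) ** 2)
    return max_rate * contrast * tuning

# A preferred-orientation grating at low contrast...
r1 = firing_rate(90.0, 0.4)
# ...and an off-orientation grating at high contrast produce the same rate,
# so the firing rate alone cannot tell the two stimuli apart.
r2 = firing_rate(90.0 + 20.0 * np.sqrt(2 * np.log(2)), 0.8)
```

Both calls return a rate of 40, which is the sense in which the output of a single cell is ambiguous.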
3
Q
fourier analysis
A
- alt framework suggesting these early single cells are part of independent channels that represent different aspects of the visual input
- each channel conveys the info contained in the image at a specific spatial scale and orientation
- the visual system deconstructs the image into discrete channels before recombining them to form a coherent representation
- Fourier analysis suggests any complex signal can be constructed from a set of simpler sinusoidal functions varying in frequency, amplitude and phase; by adding the right mixture of sinusoids (each with its own luminance amplitude and spatial frequency) you can recreate the spatial frequency content that represents an image
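The "sum of sinusoids" idea can be sketched numerically: summing a few odd harmonics of a sine wave already approximates a square-wave luminance grating with sharp light/dark edges. The fundamental frequency and number of harmonics below are illustrative choices.

```python
import numpy as np

x = np.linspace(0, 1, 1000)   # spatial position (arbitrary units)
fundamental = 4               # cycles per unit distance

# Partial Fourier series of a square wave: odd harmonics with 1/n amplitudes
partial = np.zeros_like(x)
for n in (1, 3, 5, 7, 9):
    partial += (4 / np.pi) * (1 / n) * np.sin(2 * np.pi * n * fundamental * x)

# Compare with the ideal square-wave grating: the partial sum already shows
# flat light/dark plateaus separated by sharp luminance transitions
ideal = np.sign(np.sin(2 * np.pi * fundamental * x))
mean_abs_error = np.mean(np.abs(partial - ideal))
```

Adding further harmonics shrinks the error, which is the sense in which a complex image can be rebuilt from simple sinusoidal components at different spatial scales.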
4
Q
spatial frequency channels
A
- contrast sensitivity = a measure of how well someone sees detail across different spatial frequencies
- sensitivity is maximal between 2-5 cycles per degree (cpd) - at the peak you need less contrast in an image to resolve the pattern, while high contrast is needed at either end of the range
- good measure of visual performance across spatial frequencies
- Blakemore and Campbell (1969) measured contrast thresholds (the lower the threshold, the easier the detection of an image). Adapting to a high-contrast grating for 60 s causes the neurons that respond to it to habituate and gradually become less sensitive. Measuring contrast sensitivity across spatial frequencies after this adaptation shows a massive change in performance = specific neurons in the visual system have habituated
- the higher the sensitivity, the lower the contrast threshold needed to detect a stimulus
- the selectivity of the adaptation effect implies spatial-frequency-specific channels
> threshold is increased (sensitivity decreased) for spatial frequencies similar to the adapting frequency
> following adaptation there is a dip in contrast sensitivity around the adapting spatial frequency - this selective adaptation effect implies the existence of multiple overlapping spatial frequency channels
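The logic of the adaptation experiment can be simulated with a toy channel model. This is only a sketch: the channel centres, log-Gaussian bandwidth and 60% adaptation loss below are assumptions, not Blakemore and Campbell's measured values.

```python
import numpy as np

freqs = np.logspace(np.log10(0.5), np.log10(30), 200)    # test frequencies (cpd)
centres = np.logspace(np.log10(0.5), np.log10(30), 12)   # channel peak frequencies
bandwidth = 0.25                                         # tuning width in log10 units

def channel_responses(gains):
    # each channel has a Gaussian tuning profile on a log-frequency axis
    logf = np.log10(freqs)[:, None]
    logc = np.log10(centres)[None, :]
    tuning = np.exp(-0.5 * ((logf - logc) / bandwidth) ** 2)
    return tuning * gains[None, :]

gains = np.ones(len(centres))
sensitivity_before = channel_responses(gains).max(axis=1)

# Adapting to a 5 cpd grating habituates channels tuned near 5 cpd
adapt_f = 5.0
loss = 0.6 * np.exp(-0.5 * ((np.log10(centres) - np.log10(adapt_f)) / bandwidth) ** 2)
sensitivity_after = channel_responses(gains * (1 - loss)).max(axis=1)

# Threshold elevation (= sensitivity loss) is largest near the adapted frequency
elevation = sensitivity_before / sensitivity_after
peak_freq = freqs[np.argmax(elevation)]
```

Because the channels overlap, the simulated dip in sensitivity is confined to frequencies near the adapting frequency, mirroring the selective adaptation effect described above.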
5
Q
De Valois et al. (1982)
A
- measured the electrophysiology of neurons in V1
- recorded contrast sensitivity functions of V1 cells in macaque monkeys
- found cells with different but slightly overlapping spatial frequency selectivity at the same retinal locations = bar-detecting neurons that appear to perform a Fourier analysis of the visual image
6
Q
computational vision
A
- Marr - 3 levels of analysis to understand how a system works
1. computational level - what is the goal of the visual system?
2. algorithmic level - what rules/representations achieve this?
3. implementational level - how is it achieved physically?
- vision serves multiple goals including recognising objects, determining where they are, what shape they have and how to interact with them. Each relies on numerous algorithmic steps - one such step is edge detection for determining object boundaries
- marr’s model of object perception:
1. grey level rep
2. primal sketch
3. 2.5D sketch
4. 3D model representation
7
Q
Marr and Hildreth model
A
- algorithm assumes edges of an object coincide with gradients in luminance
- to make the edge explicit, take the first derivative (an expression giving the slope of the tangent line to the curve at any instant), which tells us the rate of signal change: if the first derivative is positive the signal must be increasing, if negative it must be decreasing, and it shows peaks/valleys wherever there is a sharp intensity gradient
- the problem is deciding which peak/valley you are interested in - finding them is difficult, and how high/low must they be?
- taking the second derivative (the rate of change of the first derivative as you move from left to right) gives zero crossings: wherever there is a luminance gradient in the original signal, the second derivative crosses zero
- edge is represented by zero crossing in second derivative
- this process is negatively affected by high-frequency noise in the image = zero crossings may appear where there is no meaningful gradient, so a smoothing step is needed - blur the image first (removing high-frequency content), then take the second derivative; the blurring is expressed as a Gaussian operator G
- the width (sigma) of G determines the level of blurring: a higher sigma means more blurring, which suppresses spurious noise-driven crossings and gives more reliable edge detection
- these can be sequential steps, but Marr suggested all of them (luminance, smoothing, first derivative, second derivative) can be done in parallel with a Laplacian of Gaussian filter - when plotted it shows a similar on/off centre-surround shape
- this algorithm gives the first steps of edge detection and can be implemented at a biological level in the LGN
- so retinal and LGN receptive fields can be considered spatial filters that compute the second derivative of an image
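The whole pipeline (smooth with a Gaussian, take the second derivative, mark zero crossings) can be sketched in one dimension. The sigma, kernel sizes, noise level and slope threshold below are illustrative choices, not values from Marr and Hildreth.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(200)
luminance = np.where(x < 100, 50.0, 150.0)                # one step edge at x = 100
luminance = luminance + rng.normal(0, 2.0, size=x.shape)  # high-frequency noise

# Gaussian operator G: sigma controls the amount of blurring
sigma = 4.0
k = np.arange(-15, 16)
gauss = np.exp(-k**2 / (2 * sigma**2))
gauss /= gauss.sum()
smoothed = np.convolve(luminance, gauss, mode="same")

# Discrete second derivative of the smoothed signal
second_deriv = np.convolve(smoothed, [1.0, -2.0, 1.0], mode="same")

# Edges = zero crossings of the second derivative with a sizeable slope;
# shallow crossings caused by residual noise are rejected by the threshold
signs = np.sign(second_deriv)
crossings = np.where(np.diff(signs) != 0)[0]
valid = crossings[(crossings > 20) & (crossings < len(x) - 20)]  # skip borders
strong = valid[np.abs(np.diff(second_deriv))[valid] > 0.2]
```

Without the Gaussian smoothing step, the second derivative of the raw noisy signal produces zero crossings everywhere, which is exactly the failure mode the blurring is there to prevent.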
8
Q
rapid edge detection
A
- visual edges are important in low level visual processing
- Paradiso and Nakayama (1991) used a temporal masking paradigm - observers view a luminance target (white disc) briefly followed by an annular mask (a small white ring) and judge the target's luminance
- pps report a composite image: the outside region looks bright but the inside looks dark = the mask interferes with perception
- suggests we perceive edges first, then fill in the image, and the mask stops this filling-in process
- seems low-level visual perception is edge based
9
Q
beyond luminance edges
A
- first-order edges = edges defined by a luminance gradient
- second-order edges = complex edges defined by differences in visual texture (invisible to the Marr and Hildreth model)
10
Q
texture segmentation
A
- Julesz developed a model based on the statistics of local conspicuous image features - textons
- based on the idea that differences in the statistics of certain textons make something jump out at you e.g. lines, line terminations or junctions (T and X junctions)
> effortless segmentation = differences in conspicuous elements e.g. line crossings
> difficult segmentation = no difference in conspicuous elements
- Nothdurft (1985) noticed similarities between luminance segmentation (first order) and texture segmentation: changing element spacing produces the same results in both, so texture segmentation, like luminance segmentation, seems to depend on spatial scale
> the effects of critical spatial factors e.g. element size and spacing = similarities to luminance segmentation
> texture segmentation for luminance and structure differences might be achieved by evaluating a gradient
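The gradient idea can be sketched as a filter-then-gradient model: pool rectified band-pass filter output into a local "texture energy" signal, then look for a gradient in that energy, just as a luminance edge is a gradient in luminance. All the filter parameters and texture frequencies below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = np.arange(n)

# Two abutting textures with the same mean luminance but different scales
texture = np.where(x < n // 2,
                   np.sin(2 * np.pi * 0.25 * x),   # fine texture (left)
                   np.sin(2 * np.pi * 0.05 * x))   # coarse texture (right)
texture = texture + rng.normal(0, 0.1, n)

# First stage: band-pass filter tuned to the fine scale
k = np.arange(-10, 11)
bandpass = np.cos(2 * np.pi * 0.25 * k) * np.exp(-k**2 / (2 * 3.0**2))
response = np.convolve(texture, bandpass, mode="same")

# Rectify and pool: local "texture energy" at the fine scale
pool = np.exp(-np.arange(-30, 31)**2 / (2 * 10.0**2))
pool /= pool.sum()
energy = np.convolve(np.abs(response), pool, mode="same")

# Second stage: a luminance-style gradient on the energy signal
gradient = np.abs(np.diff(energy))
boundary = int(np.argmax(gradient[40:-40])) + 40   # ignore convolution borders
```

A plain luminance-gradient detector sees nothing here (both halves average to the same luminance), but the energy gradient peaks at the texture boundary, which is how a second-order edge can be reduced to a first-order-style gradient evaluation.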
11
Q
Bergen and Adelson (1988)
A
- demonstrated that texture segmentation can be enhanced/impaired by adjusting sizes of elements
- the foreground jumps out from the background due to differences in textons, but making the L shapes shorter means the info contained at different spatial scales becomes more important and makes the foreground's spatial frequency content more similar to the surround
- evidence against texton theory - suggests a simpler filtering process can account for segmentation; what matters most is the info contained at different spatial scales within the foreground relative to the background
- mapping onto neurophysiology?
- Lamme et al. used single-cell recording in V1 and found response enhancement at the texture figure and its edge = texture perception begins at the edge and fills in inwards
- the temporal sequence of neural responses revealed separate processes depending on whether background, edge or figure was stimulated:
> initial response 50ms
> local orientation tuning - simple cell response to orientation 58ms
> enhancement of edge 90ms
> enhancement in response in figure 112-123ms - consistent with edge based segmentation process that drives filling in of texture surface
- edge enhancement at 90ms suggests feedback from beyond V1