Perception 1: Low level vision Flashcards

Question

Why is there a smoothing process in edge detection?

Answer 1

This process will be negatively affected by high frequency noise - zero crossings where no meaningful gradient exists - detect edges where they are not actually present

Answer 2

- Equivalent to first blurring the image (pre-derivatives) - Equivalent to removing high frequency content (i.e. analysis at coarse spatial scale) Expressed as convolving the image with a Gaussian operator G - each pixel is blurred with its neighbours - the sigma of G determines the level of blurring Higher level of sigma = more blurry image More blurry = more accurate edge detection, as edge detection is not corrupted by random noise

Answer 3

No This can be done in all one step: Simple filtering operation - convolving the original image with a Laplacian of Gaussian filter achieves the same steps in one operation

Answer 4

2D model of it - rotationally invariant When this is plotted top-down It represents a retinal or LGN receptive field!

Answer 5

Spatial scale

Answer 6

A range of sizes so that the full range of scales and spatial frequencies are sampled There is a trade-off between noise removal (better at coarser scales) and edge enhancement (better at finer scales)

Answer 7

“If a zero-crossing segment is present in a set of independent channels (scales) over a continuous range of sizes… then the set of such zero crossing segments may be taken to indicate the presence of an intensity change in the image that is due to single physical phenomenon (a change in reflectance, illumination, depth or surface orientation)”. Marr and Hildreth

Answer 8

Coarse scale = where edges are Finer scale = if edges in same place, edge information is meaningful Can use these to more accurately detect position of image Combining info from different spatial scales to get the trade off and best of both Retinal and LGN receptive fields are spatial filters that compute the second derivative of an image

Answer 9

Location of zero crossing is represented in both on and off centre cells Both cells excite simple cell - simple cell responds when both cells give a positive response Combine info to localise presence of edge - zero crossing

Answer 10

A target (white circle with black background) is presented for 16ms A mask (white ring on black background (smaller than white circle) - inner bit of circle is black) is then presented for 16ms Observer sees composite image of both target and mask Observers rate the brightness of the outside high and brightness of the inside dark Shows that observers see edges first, then rest of image is filled in - filling in process was interrupted by mask This shows that low level visual perception is mostly edge based

Answer 11

First order - defined by luminance gradient Second order - no luminance difference, defined by texture

Answer 12

Julesz (1981) developed a model based on statistics of local conspicuous image features - textons Based on the simple idea that differences in the statistics of certain kinds of “conspicuous” features means certain features jump out at you - Oriented lines - Line terminations - Junctions (T and X junctions) Based on feature detection theory and preattentive visual search

Answer 13

Effortless - difference in conspicuous element (e.g. line crossings) Difficult - no difference in conspicuous elements

Answer 14

Noticed similarities between luminance segmentation and texture segmentation Luminance segmentation becomes more difficult as element spacing increases - when textons are more spaced out, it is harder to detect a shape among the textons Even when features of textons are kept the same, segmentation ability can vary considerably Instead, these results suggest a mechanism that is more similar to an edge detection mechanism sensitive to spatial scale (i.e. spatial frequency) and orientation

Answer 15

Evaluation of gradient (as for luminance) allows for texture segmentation Closer to Marr and Hildreth theory Making some textons longer = easier Making some textons shorter = harder - This is because it involves changing spatial frequency content of the image Different spatial scales = how we can do texture segmentation Computational models of texture segmentation (or edge detection) now explain this not in terms of textons but in terms of spatial differences in orientation and spatial frequency statistics This brings us back to “Fourier analysis”, in which an image is represented in terms of energy contained within channels tuned to combinations of spatial frequency and orientation

Answer 16

Lamme et al (1999) used single cell recording to record from neurons in V1 in awake macaque monkey Receptive field was mapped and activity was recorded at different positions along the texture Temporal sequence of neural responses revealed separate processes (right figure shows fig – ground response): - Initial response (50 ms) - Local orientation tuning (58 ms) - Enhancement at edge (90 ms) - Enhancement in figure (112 - 123 ms) Consistent with an edge-based segmentation process that drives filling in of texture surface Edge enhancement at 90 ms suggests feedback from beyond V1. Only immediate sensitivity to the edge, with feedback from other visual areas Thus, low level vision is edge detection

Perception 1: Low level vision Flashcards

(40 cards)