Nick - High-Level Perception Part 1 Flashcards
(29 cards)
Gestalt psychology background
We can identify objects even when they look different because the whole is more than the sum of its parts. We group things by similarity, proximity, closure (shapes closed off by lines are seen as separate), good continuation (lines are seen as unbroken) and common fate (things that move together). Figure-ground: the figure is the area bounded by a contour, but a contour can only belong to one group at a time, e.g. the visual illusion seen as either a vase or faces.
Marr's model of recognition levels
Computational. 1 is the primal sketch: an initial idea of what things look like, a 2D representation of luminance which allows edges to be found, going from raw to full. 2 is the 2 1/2 D sketch: depth, orientation, shading, binocular disparity, motion parallax and texture, but viewpoint dependent (only from one view); tells you the object is 3D. 3 is the 3D model, which is viewpoint invariant.
Marr's model stages in practice, Hubel and Wiesel
Analyse the image with a range of edge filters; simple cells respond to specific edges. Then use Gestalt grouping to find the outline by identifying concavities (dips). Define the arrangement of parts in terms of cylinders, starting with the principal axis then smaller cylinders (body first, then limbs, fingers). Match to 3D cylinder models of objects stored in memory. Says different orientations are easy to identify, but weakness: people can't identify objects upside down or tilted toward them.
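The edge-filtering step can be illustrated with a minimal sketch. It assumes a tiny grey-level image stored as a list of lists and uses a simple difference between neighbouring pixels as a stand-in for an oriented edge filter; the function name and the image are hypothetical, not part of Marr's model.

```python
def vertical_edge_response(image):
    """Luminance difference between horizontally adjacent pixels in each row."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in image]

# Hypothetical 3x4 image: dark left half (0), bright right half (9)
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

edges = vertical_edge_response(image)
print(edges)  # [[0, 9, 0], [0, 9, 0], [0, 9, 0]]
```

The filter is silent over uniform regions and responds only where luminance changes, which is the sense in which a 2D luminance representation "allows edges".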
Biederman
Recognition by components model. Edge extraction uses non-accidental properties such as curvature and symmetry that stay constant across views (surface characteristics and edges which don't alter with view, like parallel lines and symmetry). Then segment into components at concavities; the different parts are called geons, a fixed set of at most 36 shapes that can make up any object. Then match to memory. Weaknesses: doesn't differentiate objects within a group, ignores surface patterns, doesn't explain viewpoint-dependent identification.
Object processing pathways - Ungerleider and Mishkin 82
Used monkey brains: visual processing starts in V1, then flows down to temporal and up to parietal cortex. Monkeys with no temporal cortex can't identify new objects, so "what"; monkeys with no parietal cortex can't remember location, so "where".
Object agnosia
Damage to temporal cortex: can't identify objects but no loss of intelligence or vision; can see edges and can draw but not identify. Goodale and Milner 91: DF (carbon monoxide poisoning) couldn't tell size, shape or orientation, or indicate size with the hands, but could reach for an object and post a card through a slot. Couldn't match it (hold the card up to match the orientation of the slot).
Object perception for recognition vs action - Aglioti
Titchener circles illusion (a circle looks smaller when surrounded by big circles and bigger when surrounded by small ones, though both are equal). Did a 3D version and found perception was influenced by the illusion but grip width/aperture was correct, so different systems.
Optic ataxia
Damage to the dorsal stream in parietal cortex. Can't reach or position fingers, and can't grip correctly for the object. So ventral is what, dorsal is where/how.
Summary of different pathways
Early models say vision is for constructing an internal reality for thought and action, but in the 80s there was a shift to focus on what vision does for us. Identification of an object is object centred, not viewpoint dependent; action is viewer centred, as we act from one view. What and where interact: lateral occipital cortex has identity representations and location irrespective of space.
The different neurones - Hubel and Wiesel and the order
Found simple cells that respond to bars in specific orientations in their receptive fields. These give input to complex cells, which respond to bars anywhere in the receptive field and give input to hypercomplex cells (which have end stopping); all in V1. V1 has edges, V2 has contours that are not physically there (illusory shapes), V4 has colour and shape, PIT has simple features and AIT has elaborate, specific objects.
Inferotemporal cortex
Has cells for shape, colour and texture. Cells respond to all objects with these properties and generalise across populations; receptive fields can be big or small. Organised into columns in the cortex. More posterior cells are more sensitive to orientation and size, more anterior cells less so.
Hierarchical model of object recognition - the most accurate
Input image is decoded by simple cells that respond to specific lines, then complex cells in V2 respond to angles and contours. Then posterior temporal cortex/PIT has a selective response to the image as a whole, based on learning/plasticity.
Operation green and blue - hierarchical model and updated
Operation green: V2 cells respond to whichever V1 input is responding the most, so they respond wherever the line occurs in space (max operation); provides invariance. Operation blue (pooling): V2 cells need specific orientations to occur together (weighted sum, template matching); increases pattern selectivity. Updated model, Riesenhuber and Poggio: image goes to view-based or object-based representations, then PFC.
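The two operations can be compared in a short sketch. This is an illustration only, assuming made-up response values and a hypothetical "corner" template; it is not the model's actual implementation.

```python
def max_operation(responses):
    """Green: output follows the strongest input, wherever it occurs."""
    return max(responses)

def weighted_sum(responses, weights):
    """Blue: template matching, largest when inputs match the weight pattern."""
    return sum(r * w for r, w in zip(responses, weights))

# Hypothetical V1 responses to the same bar at three positions
bar_left = [1.0, 0.0, 0.0]
bar_right = [0.0, 0.0, 1.0]
print(max_operation(bar_left), max_operation(bar_right))  # 1.0 1.0

# Hypothetical orientation channels: [vertical, horizontal, oblique]
corner_template = [1.0, 1.0, -1.0]  # prefers vertical and horizontal together
corner = [1.0, 1.0, 0.0]
single_line = [1.0, 0.0, 0.0]
print(weighted_sum(corner, corner_template))       # 2.0
print(weighted_sum(single_line, corner_template))  # 1.0
```

The max operation gives the same output for the bar at either position (invariance), while the weighted sum responds more strongly to the conjunction it is tuned to than to a single line (selectivity).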
Evaluation of hierarchical model
It is anatomically and physiologically plausible, based on real connections and brain cells. Builds on earlier models. Copes with viewpoint dependence and independence. Has theories of learning. Copes with multiple objects and things in different contexts.
Bottom up processing vs top down
Bottom up is the hierarchical model. The stimulus, e.g. letters, is detected by low-level detectors (e.g. simple cells), then mid-level pattern detectors, which excite the correct neurones and inhibit others, then high-level detectors for particular words (also excitatory/inhibitory). Top down uses context to guess what's going to be said from memorised concepts; it activates high-level object detectors first, then mid-level.
Combination of top down and bottom up
Both are used when we see an ambiguous image but are given context: memorised concepts come down and the unclear text comes up, so processing is bidirectional. Expectation lowers the threshold for likely terms. In the brain, more connections go down from cortex to LGN than come up from the eye.
Role of context
Easier to find an object in a normal setting than in a jumbled one. Word superiority effect: detecting a letter is easier when it is in a word. Infants over 6 months search for hidden objects and are surprised when an object doesn't reappear from behind a screen.
Object representation
When two objects are perceived as similar, they have similar neural representations. Cichy 2019: asked participants to judge similarity on shape, function, colour and background, and then to free sort. Free sort was most similar to function. Compared psychological similarity matrices with brain activity: perceived similarity was related to ventral visual cortex activity. Representations emerge at about 200 ms, colour first then shape.
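Relating perceived similarity to neural activity can be sketched as follows. This is a generic illustration of comparing two dissimilarity matrices, not the study's actual analysis: the objects, activity patterns and judgement scores are all made up, and Euclidean distance plus Pearson correlation are assumed choices.

```python
from itertools import combinations
from math import sqrt

def pairwise_distances(patterns):
    """Upper triangle of a dissimilarity matrix: Euclidean distance per object pair."""
    return [sqrt(sum((a - b) ** 2 for a, b in zip(patterns[i], patterns[j])))
            for i, j in combinations(range(len(patterns)), 2)]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical activity patterns for three objects: cup, mug, hammer
neural = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
# Hypothetical judged dissimilarity for pairs: cup-mug, cup-hammer, mug-hammer
judged = [0.1, 0.9, 0.8]

neural_dissim = pairwise_distances(neural)
r = pearson(neural_dissim, judged)
print(round(r, 2))
```

A high positive correlation means that pairs judged as dissimilar also have dissimilar activity patterns, which is the relationship the card describes.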
AI
Supervised learning: train with known objects (show it and say "this is a phone"); needs multi-million-item data sets, e.g. convolutional neural networks, which are comparable to humans and pool information through a deep neural network to make an estimate. Unsupervised: don't tell it anything; there is also a halfway option between the two. These networks predict perceived similarity like humans and show spatial invariance (doesn't matter where the object is or whether the scene is cluttered). Requirements: must not overfit, layers need to be interpretable and must handle irregular structures.
Faces in the brain background
In the anterior temporal cortex, cells are selective to faces; also in orbitofrontal and ventrolateral PFC. Gross: first, accidental discovery. Foldiak: recorded cells in the temporal lobe and found face-specific cells by showing thousands of images to monkeys. Electrodes produce a rastergram where each dot is an action potential; this is turned into a peristimulus time histogram, which averages across trials to show peaks. Cells respond to an image if it is a face, not based on visual characteristics.
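Going from a rastergram to a peristimulus time histogram can be sketched like this. The spike times, bin width and function name are hypothetical; the sketch assumes spike times in ms relative to stimulus onset.

```python
def psth(trials, bin_width, duration):
    """Average spike count per time bin across trials."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for spike_times in trials:
        for t in spike_times:
            b = int(t / bin_width)
            if 0 <= b < n_bins:  # ignore spikes outside the analysed window
                counts[b] += 1
    return [c / len(trials) for c in counts]

# Hypothetical raster: each inner list is one trial, each number one action potential (ms)
raster = [
    [12, 110, 130, 150],
    [105, 120, 160, 290],
    [115, 140, 170],
]

histogram = psth(raster, bin_width=100, duration=300)
print(histogram)
```

Each row of dots in the raster is one trial; averaging the counts per bin across trials gives a peak in the 100 to 200 ms bin here, which is how the histogram reveals the cell's response.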
Grandmother cell and issues
Idea that a cell responds to only one identity, due to hierarchical processing, e.g. faces, women, family, then grandmother only. Issues: not enough cells; a person can only recognise 20000 words; cell death would mean loss of recognition of grandmother; population coding (multiple cells firing) happens instead; can't be tested, as you would have to present everything to one cell.
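The "not enough cells" point can be made concrete with a rough count. This sketch assumes each cell is simply on or off, which is a strong simplification, and the function names are hypothetical.

```python
def grandmother_capacity(n_cells):
    """One cell per identity: n cells can stand for n identities."""
    return n_cells

def population_capacity(n_cells):
    """Identity coded by the on/off pattern across cells: 2**n patterns."""
    return 2 ** n_cells

for n in (5, 10, 20):
    print(n, grandmother_capacity(n), population_capacity(n))
```

With 10 cells, a one-cell-per-identity scheme covers 10 identities while on/off patterns across the same cells give 1024, and losing one cell in a population code degrades a pattern rather than deleting an identity.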
Facial features - Thatcher illusion
Thatcher illusion: when a face is upside down, you can't tell which way up the facial features are, but it is obvious when the face is the right way up. Means upright faces are processed differently: features are analysed holistically, and when the face is inverted we have to use serial processing to check whether each feature is the right way up or not.
External and internal face features
Internal features are eyes, mouth and nose; external are things like head and hair. Features need to be in the right place to recognise a face; tiny changes mean you can't. We use external features to recognise unfamiliar people but internal ones for well-known people. Perrett: some cells respond most to the whole face, then to just the eyes, but don't fire when shown a face with no eyes. Other cells respond to the whole face with no eyes, i.e. there are internal-specific and external-specific cells. Other cells must have all features to fire.
Face identity - adaptation
Learnt, as we recognise faces of our own race better than others because we spend more time with them. Evidence for short-term face adaptation, e.g. stare at a face, become used to it and then only see the differences in an illusion of morphed faces. Rhodes: adapted to white or Asian faces; participants were better at discriminating faces of the race adapted to. Reduced response of cells that signal common properties, so they only signal differences.