Perception and action Flashcards
What was Berkeley’s (1709) insight into why depth perception is ambiguous?
Distance, of itself invisible (Berkeley 1709)
Each ray of light projected onto the retina could have originated from an infinite number of points along its path.
There is no information intrinsic to the image associated with the ray that conveys the distance of its source.
Across a whole set of points projected onto the retina, there is an infinite number of possible objects or scenes from which the image could have arisen.
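The ambiguity described above can be illustrated with a minimal sketch. It assumes a simple pinhole model of the eye; the function name `project` and the example ray direction are illustrative and not taken from the source.

```python
def project(point, focal_length=1.0):
    """Perspective-project a 3D point (x, y, z) onto an image plane at z = focal_length."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# Three points at very different distances along the same ray through the eye.
direction = (0.5, 0.25, 1.0)  # assumed ray direction, for illustration only
points_on_ray = [tuple(d * c for c in direction) for d in (1.0, 4.0, 16.0)]

# Every point on the ray lands on the same image location,
# so the image alone cannot say how far away the source is.
images = [project(p) for p in points_on_ray]
for p, img in zip(points_on_ray, images):
    print(p, "->", img)
```

All three points print the same image coordinates, which is the sense in which distance is "of itself invisible".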
What evidence existed before Gibson regarding direct/indirect perception?
Direct sources of depth and distance, accommodation and vergence, may provide estimates over short distances.
Binocular disparity may provide information about depth but needs to be scaled for viewing distance in order to be interpreted correctly.
It is not obvious how this can be achieved over anything but the nearest parts of the scene.
Our estimates of depth and distance are supplemented by cues and clues which Berkeley suggested arose from our ability to associate them with distance through learning.
These associations then represent heuristics which the visual system relies on to interpret the scene.
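The point above about scaling binocular disparity for viewing distance can be sketched numerically. This assumes the standard small-angle approximation (disparity ≈ I × depth interval / D², with I the interocular separation and D the viewing distance); the 0.065 m separation and the function name are assumptions for illustration.

```python
INTEROCULAR_M = 0.065  # assumed typical interocular separation, in metres

def depth_from_disparity(disparity_rad, viewing_distance_m, interocular_m=INTEROCULAR_M):
    """Small-angle approximation: depth interval = disparity * D^2 / I."""
    return disparity_rad * viewing_distance_m ** 2 / interocular_m

# The same retinal disparity corresponds to very different depth intervals
# depending on the viewing distance used to scale it.
disparity = 0.001  # radians, illustrative value
for distance in (0.5, 1.0, 2.0):
    print(distance, depth_from_disparity(disparity, distance))
```

Doubling the assumed viewing distance quadruples the depth interval recovered from the same disparity, which is why an unscaled disparity cannot be interpreted on its own.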
What is the issue of linear perspective?
Difficulties arise with this as the information conveyed by pictorial cues can be ambiguous.
Lines belonging to two completely different objects in a scene, such as the turret and the road in Magritte’s work, can be of identical length and angle, and could equally well represent converging lines in the vertical plane or parallel lines receding into the distance in the horizontal plane.
What is meant by indirect perception?
Based on these considerations there is a school of psychology that believes that perception is indirect.
There isn’t sufficient information in the retinal image to specify unambiguously the structure of the scene that gave rise to it
The visual system must rely on cues and clues, multiple sources of incomplete information, in order to arrive at the best possible interpretation of the scene that is consistent with the information available and our prior knowledge.
This happens through a process called unconscious inference (Helmholtz), hypothesis testing (Gregory), or intelligent, thought-like processes (Irvin Rock).
What is meant by direct perception?
James Gibson led an alternative school of philosophy that believes perception is direct.
Argued there is no need to make inferences in perception as there is more than enough information available to interpret the image, and this information is acquired directly.
Replace classical approach with one that emphasises the perception of surfaces in the environment
Ground consists of surfaces at different distances and slants composed of textural elements
Need an appropriate ecological geometry to describe the environment
It is the structure in the light rather than stimulation by light that furnishes information for visual perception
Rejected the claim that the retinal image is the starting point for visual processing
The whole array of light rays reaching an observer, after structuring by surfaces and objects in the world, provides direct information about the layout of those surfaces and objects and about movement within the world and by the observer.
What is meant by optic flow?
Those studying static ray diagrams tend to overlook one of the richest sources of information, found in the changing image of the moving observer: optic flow.
Optic flow is the change in the pattern of light reflecting on the retina associated with movement of the observer through the scene or movement of parts of the scene relative to the observer.
It provides powerful cues about depth and distance in the scene as well as the 3D shapes and objects within it.
The optic array consists of an infinite collection of light rays of different wavelengths and intensities emitted from different sources in the scene. The rays form a hierarchical and overlapping set of solid angles corresponding to the boundaries of objects. Changes to the pattern or properties of the light from one solid angle to another signal boundaries in the world where one object may partially occlude another.
Textural surfaces structure the light reflected off them - the resulting array could inform an observer about the shapes and orientations of those surfaces
Variations in the information in the optic array are produced by movements of the observer and by the motion of objects in the world.
What is motion parallax?
Experience having looked out the window of a travelling train or car.
As one looks out, the motion vectors are largest nearest the observer, where they correspond to the observer’s speed, and shrink with increasing distance.
The speed and direction of flow is relative to the fixation point.
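A minimal sketch of the motion parallax gradient follows. It assumes an observer moving in a straight line and looking sideways at points directly abeam, so that angular speed is simply speed divided by distance; the train speed, fixation distance and function names are illustrative assumptions.

```python
def angular_speed(observer_speed, distance):
    """Angular speed (rad/s) of a point directly abeam of a translating observer."""
    return observer_speed / distance

def flow_relative_to_fixation(observer_speed, distance, fixation_distance):
    """Image motion left over once the eye tracks the fixation point.

    Positive: the point drifts against the direction of travel.
    Negative: the point drifts along with the observer.
    """
    return (angular_speed(observer_speed, distance)
            - angular_speed(observer_speed, fixation_distance))

TRAIN_SPEED = 20.0  # m/s, assumed
FIXATION = 20.0     # distance of the fixated object in metres, assumed

for d in (5.0, 20.0, 100.0):
    print(d, angular_speed(TRAIN_SPEED, d),
          flow_relative_to_fixation(TRAIN_SPEED, d, FIXATION))
```

Points nearer than fixation stream backwards, the fixated point is stationary in the image, and points beyond fixation appear to travel with the observer.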
What do expansion and contraction tell us?
The pattern of optic flow associated with moving forwards and backwards through the environment.
Motion vectors spread outwards when moving forwards or when an object looms towards an observer and they move inwards when moving backwards or the object recedes.
The focus of expansion or contraction provides a cue for the direction of motion.
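The expansion and contraction patterns above can be sketched with the usual flow equations for a purely translating observer. This assumes a pinhole eye with unit focal length and no eye rotation; the function names and the example heading are illustrative, not from the source.

```python
def flow(x, y, depth, translation):
    """Image velocity (u, v) at image point (x, y) for a scene point at the given depth.

    translation = (tx, ty, tz) is the observer's velocity; tz > 0 means moving forwards.
    """
    tx, ty, tz = translation
    return ((x * tz - tx) / depth, (y * tz - ty) / depth)

def focus_of_expansion(translation):
    """Image point with zero flow; it marks the direction of motion."""
    tx, ty, tz = translation
    return (tx / tz, ty / tz)

forwards = (0.2, 0.0, 1.0)     # heading slightly to the right, assumed
backwards = (-0.2, 0.0, -1.0)  # same path travelled in reverse

foe = focus_of_expansion(forwards)
print("focus of expansion:", foe)
print("flow at focus:", flow(foe[0], foe[1], 2.0, forwards))
print("forwards, right of focus:", flow(0.5, 0.0, 2.0, forwards))    # points outwards
print("backwards, right of focus:", flow(0.5, 0.0, 2.0, backwards))  # points inwards
```

The flow is zero at the focus and points away from it when moving forwards and towards it when moving backwards, so locating the focus is enough to recover heading in this simplified case.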
What evidence is there from the kinetic depth effect?
Movement can also provide a powerful cue for inferring shape
When we perceive a pattern of seemingly random lines, setting them in motion enables us to recognise them as edges of a complex 3D shape projected onto a flat surface.
Can easily recover the third dimension to perceive the original shape.
Underlies our ability to interpret scenery and objects depicted in computer games and movies.
What evidence is there from the stereokinetic effect?
Seemingly flat black and white patterns change shape into a solid hollow cone when rotated
Stereo = solid
What evidence is there from random dot kinematograms?
From setting white dots on a black background in motion we can see that they represent a pair of hollow rotating cylinders, purely from the motion vectors depicted by the dots.
As long as they move we can see the cylinders in 3D, when they stop we can only see the dots on the surfaces of those cylinders projected as a flat 2D plane.
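Why motion reveals the cylinders can be sketched for a single dot. This assumes orthographic projection and one cylinder rotating about a vertical axis (the demonstration described above uses a pair); the radius, rotation rate and function names are illustrative assumptions.

```python
import math

RADIUS = 1.0  # cylinder radius, arbitrary units (assumed)
OMEGA = 1.0   # rotation rate in rad/s (assumed)

def image_x(theta):
    """Horizontal image position of a dot at angle theta around the cylinder."""
    return RADIUS * math.sin(theta)

def depth(theta):
    """Depth of the dot relative to the axis: positive on one surface, negative on the other."""
    return RADIUS * math.cos(theta)

def image_velocity(theta):
    """Horizontal image velocity of the dot under orthographic projection."""
    return RADIUS * OMEGA * math.cos(theta)

front = 0.4            # a dot on one surface
back = math.pi - 0.4   # a dot on the opposite surface, same image position

for theta in (front, back):
    print(image_x(theta), image_velocity(theta), depth(theta))
```

In a single static frame the two dots are indistinguishable because they share an image position. Their image velocities differ in sign and are proportional to depth, so the third dimension is carried entirely by the motion vectors.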
What can be concluded from demonstrations like random dot kinematograms?
Motion adds so much more information to an image
It appears to be very easy for us to extract the third dimension from motion
We can infer shape and action from relatively sparse sets of motion vectors
What is exterospecific information?
Information such as depth, distance and shape that relates to the layout of the scene and the objects within it.
What is propriospecific information?
Information about our own movements, which is essential for guiding our actions through the environment.
What is expropriospecific information?
The binary distinction between exterospecific and propriospecific information obscures the fact that animals interact with their environment, so they need expropriospecific information to adequately control action.
This is information about the position, orientation and movements of the animal’s body relative to its environment (link to the idea of affordances).
Lee’s classical swinging room experiments illustrate the distinctions between these types of information and the importance of optic flow over mechanoreceptors and the vestibular system in controlling body sway or balance.
How has knowledge about optic flow changed thought concerning the maintenance of balance?
Maintaining stance and balance is a fundamental motor skill requiring expropriospecific information about the orientation and sway of the body relative to its environment.
Traditionally it was thought that we rely on proprioceptive signals from stretch and pressure receptors in the muscles of the feet and inner ankles, as well as vestibular information from the inner ear.
What evidence is there from David Lee’s (1974) swinging room experiments?
David Lee demonstrated the importance of optic flow using the swinging room - a large box with no floor, suspended from a track on the ceiling so that it could be moved back and forth.
The walls of the room are covered with texture so that when it moved it created the same pattern of optic flow as would be observed by the observer if they were swaying.
Observer is asked to stand in the box and the box is moved slightly. The observer compensates for their perceived swaying by swaying in the same direction as the box.
An expanding flow field would normally suggest that the observer was tilting forwards so they would compensate by swaying backwards.
By moving the room back and forth the observers could be induced to sway backward and forward, unconsciously, to such an extent that Lee described them as being hooked like puppets.
What is the trolley variation of Lee’s studies in the 1970s?
Lee asked the observer to push a trolley that was mechanically coupled to the swinging room, so that when the observer walked forward the wall facing them moved away faster than they stepped towards it, and when they walked backwards the wall moved towards them faster than they stepped away from it.
When walking forward the observer reported that they were moving backwards which is compatible with the optic flow pattern generated by the receding wall and vice versa.
How does the performance of toddlers differ in Lee’s experiments?
The effect is more pronounced in toddlers, who will typically fall over backwards if the wall moves towards them and forwards if the wall moves away. Performance is worse in children than in adults up to the age of ten, which suggests that the visual system trains and calibrates the motor system.
How did Redfern and Furman (1994) demonstrate the interdependence of the vestibular and visual systems in balance?
Adults whose vestibular systems in the inner ear have been damaged by disease in later life are swayed more strongly by optic flow than controls, as vision regains the role of controlling postural sway.
This suggests that the visual system trains and calibrates the motor system.
Optic flow is still more influential than the motor system, because all individuals tested, including healthy adults, exhibit the same sway response to the moving wall.
The control of posture involves the integration of mechanical and visual information and adapts to changes caused by growth or sensory loss but is dominated by information from optic flow.
How does vision control action?
The basic function of vision is to obtain information for controlling activity which goes on at an unconscious level.
Must control balance and body sway which is primarily controlled by vision.
Brain takes visual information more seriously than information from muscles and balance mechanisms.
Toddlers fall over in response to the movement of the walls even though the ground is completely still.
The optic flow patterns generated by these walls were relatively subtle and the observers were unaware that anything was happening when the walls moved.
If the extent or speed of the room is increased, we feel as if we’re moving.
Pitted optic flow information against proprioceptive and vestibular information and found that optic flow completely dominates in the perception of self motion.
Also suggests that the dominance of vision could be useful for tuning up or calibrating the other senses
What is Gibson’s theory?
The purpose of vision is not for producing visual representations or internal images of objects, but rather guiding action.
The starting point is not the retinal image, but the optic array - the spatial pattern of light rays impinging on the eye from all surfaces in the scene.
Movement causes transformations in the optic array that are meaningfully related to changes in the relative positions of light sources, surfaces and the observer and produce optic flow.
The process of vision involves identifying invariants, which are properties that remain constant despite the transforming optic array.
E.g. the size of an object changes with distance from an observer but the relative sizes of two objects stay the same.
Perception is direct because the transforming optic array can unambiguously specify the properties of the surrounding world.
There is no need for cognitive processes to interpret the images.
The end point of perception is not internal representations or conscious percepts, but affordances - what, in the way of interaction, an object offers an observer.
E.g. a handle affords grasping.
Affordances make sense in the context of action: action implies movement, and movement implies optic flow and decisions about how to act.
What is Lee’s time-to-contact model?
Judging distance is crucial for 3D perception
Not clear that we are actually capable of judging distance over any but the shortest distances without resorting to potentially unreliable pictorial cues.
Lee argued that if the purpose of perception is to guide action then knowing when to act might be more useful than knowing how far away an object is.
E.g. when driving we apply the brakes based on how soon we are about to hit an object rather than on a set braking distance.
Possible to estimate time to contact directly without having to access distance or to rely on information other than that which is directly available to the visual system
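Time to contact: $\tau = Z(t) / V(t) = r(t) / v(t)$, where $Z(t)$ is the distance of the object, $V(t)$ is its approach speed, $r(t)$ is the retinal eccentricity of an image point and $v(t)$ is its speed of expansion across the retina.
$v(t)$ can be measured directly at the retina without having to rely on any information about the distance $Z(t)$.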
As long as we only need to know when to act so as to interact with a moving object, no information need be added to that already available in the retinal image.
The visual system only needs to read out the speed of expansion of the optic flow field at any given eccentricity and look up time to contact.
Can all be implemented using place coding.
If the visual system can obtain τ (tau) from an expanding retinal image then information about time to contact with a surface is directly available to the brain to compute when to time actions.
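The claim that time to contact is available from the image alone can be checked with a small simulation. This is a sketch under stated assumptions: a small-angle approximation for image size, a constant approach speed, and a simple finite-difference estimate of the expansion rate. The function names and numbers are illustrative.

```python
def image_size(object_size, distance):
    """Small-angle approximation to the angular size of an object on the retina."""
    return object_size / distance

def tau_from_images(size_prev, size_now, dt):
    """Time to contact from two successive image sizes: size / rate of expansion."""
    expansion_rate = (size_now - size_prev) / dt
    return size_now / expansion_rate

def estimate_tau(object_size, distance, speed, dt=0.001):
    """Simulate two retinal samples dt apart and estimate tau from them alone."""
    size_prev = image_size(object_size, distance + speed * dt)
    size_now = image_size(object_size, distance)
    return tau_from_images(size_prev, size_now, dt)

DISTANCE = 20.0  # metres (hidden from the estimator)
SPEED = 10.0     # metres per second (hidden from the estimator)

print("true time to contact:", DISTANCE / SPEED)
print("small object:", estimate_tau(0.2, DISTANCE, SPEED))
print("large object:", estimate_tau(2.0, DISTANCE, SPEED))
```

`tau_from_images` never sees the distance, the speed or the physical size of the object, yet its estimate closely matches distance divided by speed for both objects.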
What are real life examples of time to contact?
long-jump
braking