defining the problem Flashcards
what are Modern Machine Vision Systems?
- they are built on a technique called ‘deep learning’
- are extremely powerful but surprisingly easy to fool
- show the system an image of a school bus and ask what it shows: it says school bus
- add some ‘noise’ to the image: to humans the image looks unchanged, but the machine now says ostrich
- work in different ways to how human visual system works
- not as robust
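The fooling effect above can be sketched with a toy example. This is not a deep learning system: it assumes a hypothetical linear classifier with made-up weights and a made-up 1000-pixel image, and only illustrates how many tiny per-pixel changes can add up to a different answer.

```python
# Toy illustration only: a hypothetical linear "classifier", not a deep network.
# score > 0 means "school bus", otherwise "ostrich" (labels borrowed from the cards).
N_PIXELS = 1000
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(N_PIXELS)]  # made-up weights
image = [0.52 if i % 2 == 0 else 0.50 for i in range(N_PIXELS)]     # grey values in 0..1


def classify(pixels):
    """Weighted sum of all pixel values, then a yes/no decision on the sign."""
    score = sum(w * p for w, p in zip(weights, pixels))
    return "school bus" if score > 0 else "ostrich"


def sign(value):
    return 1 if value > 0 else -1


# Nudge every pixel by a tiny amount in the direction that lowers the score.
EPSILON = 0.02  # 2% of the full grey range, assumed too small to notice
perturbed = [p - EPSILON * sign(w) for p, w in zip(image, weights)]

print(classify(image))      # label for the original image
print(classify(perturbed))  # label after adding the 'noise'
```

No single pixel changes by more than 0.02, yet the summed effect over 1000 pixels is enough to flip the decision.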
why is vision deceptively simple?
- computers are nowhere near as good as humans at detecting objects
- vision is an extremely complex problem for your brain/mind to solve
- far more complicated than playing chess
- we have a visual brain – a large proportion of it is dedicated to processing visual information, which is why vision seems simple to us
how does a camera record images?
- camera records pixel values - in case of black and white images
- image composed of pixels which have grey values
- that is all that the camera knows - in a certain part of image there was a certain amount of light
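A minimal sketch of what "pixels with grey values" means in practice. The 3x4 image and the 0 to 255 range (8-bit) are assumptions chosen for illustration, not values from the cards.

```python
# A tiny hypothetical 3x4 black and white image.
# Each number is one pixel's grey value: 0 = black, 255 = white (8-bit assumed).
image = [
    [12, 40, 200, 255],
    [10, 35, 190, 250],
    [8, 30, 180, 245],
]

height = len(image)
width = len(image[0])

# All the camera "knows" is how much light hit each location.
brightest = max(max(row) for row in image)
mean_grey = sum(sum(row) for row in image) / (height * width)

print(f"{width}x{height} image, brightest pixel = {brightest}, mean grey = {mean_grey}")
```

Nothing in this grid says "object" or "edge": any such interpretation has to be computed from the numbers.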
camera vs eye?
- CAMERA: Records light intensity as numbers assigned to sensor locations
- EYE: Records light intensity as neural activation of photo receptors
what is the purpose of a camera vs perception?
- purpose of a camera is to record local light intensities
- purpose of perception is to generate meaningful and adaptive representations of environment
what is the Craik-O’Brien-Cornsweet Effect?
- 2 faces of cube
- top looks darker than bottom one
- block out central line of image with your finger
- then the two faces of cube look the same colour
- we don’t perceive the two faces as the same colour, but the camera will tell you they have the same light intensity
- mind tries to generate meaningful explanation
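The effect can also be sketched as a one-dimensional luminance profile, which is the classic form of the illusion rather than the cube image described above. The function name and all parameter values here are hypothetical.

```python
import math


def cornsweet_profile(width=200, base=0.5, amplitude=0.1, scale=10.0):
    """Luminance along one row of pixels crossing a Cornsweet edge.

    Both halves sit at the same base luminance; only near the central
    edge does one side ramp up and the other ramp down.
    """
    centre = width / 2
    profile = []
    for x in range(width):
        distance = abs(x + 0.5 - centre)                 # distance from the edge
        ramp = amplitude * math.exp(-distance / scale)   # fades out away from the edge
        profile.append(base + ramp if x < centre else base - ramp)
    return profile


profile = cornsweet_profile()
print(profile[0], profile[-1])     # far from the edge: almost identical values
print(profile[99], profile[100])   # at the edge: a sharp light-to-dark step
```

A camera reading the outer regions reports nearly the same intensity on both sides, while an observer tends to see one whole side as lighter than the other; covering the edge removes the difference.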
what process can be used to explain two-tone images?
- through the eyes an object is projected onto the retina forming an image (optics)
- on the basis of this image, the visual system tries to infer which object is out there
- vision is trying to invert this process (the inverse problem is ambiguous - the visual system has to use assumptions and prior knowledge of the world)
what is the information processing paradigm?
- input (stimulus) -> Brain/mind (info processing) -> Output (perception)
- how is input transformed in order to give rise to output
why is vision science transdisciplinary?
- vision science derives concepts and methods from: psychology, neuroscience etc.
- levels of analysis: from neurons to percepts
what is psychophysics?
- tries to determine the relationship between stimulus and perception quantitatively
- psychophysicists typically measure thresholds: absolute threshold and difference threshold
- typically, measures limits of perceptual system
- studies perception at the level of the whole organism
- often use stripey stimuli
what is an absolute threshold?
the smallest amount of stimulation that can be reliably detected
how do we measure the absolute threshold?
- use 2-Alternative-Forced-Choice task to measure this
- have 2 stimuli
- one has stripey pattern, one is grey
- ask which one has the stripey pattern
- at first the task is very easy because the two stimuli differ a lot
- then reduce the contrast, so the lines become less distinct, until pp is JUST able to see the difference
- plotting performance against stimulus intensity gives a smooth curve from not visible to visible
- left graph: hard threshold
- right graph: soft threshold
- stimulus intensity required for pp to recognise stimulus
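The hard and soft thresholds can be sketched as two functions of stimulus intensity. The logistic shape, the 75% correct criterion, and the parameter values are assumptions made for illustration; the cards do not specify them.

```python
import math

CHANCE = 0.5  # with 2 alternatives, guessing gives 50% correct


def hard_threshold(intensity, threshold):
    """Step function: guessing below the threshold, always correct above it."""
    return 1.0 if intensity >= threshold else CHANCE


def soft_threshold(intensity, midpoint, slope):
    """Smooth (logistic) rise from chance performance to 100% correct."""
    return CHANCE + (1.0 - CHANCE) / (1.0 + math.exp(-slope * (intensity - midpoint)))


def intensity_for(performance, midpoint, slope):
    """Invert soft_threshold: intensity needed to reach a given proportion correct."""
    p = (performance - CHANCE) / (1.0 - CHANCE)  # rescale to the 0..1 range
    return midpoint + math.log(p / (1.0 - p)) / slope


# Hypothetical observer: curve centred on a contrast of 0.1
for contrast in (0.02, 0.08, 0.10, 0.12, 0.18):
    print(contrast, hard_threshold(contrast, 0.1), round(soft_threshold(contrast, 0.1, 50.0), 3))

print(intensity_for(0.75, 0.1, 50.0))  # intensity at the assumed 75% criterion
```

With a soft threshold there is no single point where the stimulus becomes visible, so the threshold has to be defined as the intensity that gives a chosen performance level.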
what is a difference threshold?
the smallest difference between two stimuli that can be reliably detected
how do we measure the difference threshold?
- use 2-Alternative-Forced-Choice task to measure this
- both contain stimulus
- one is reference/standard - at fixed stimulus intensity
- which has higher contrast?
- then change contrast of the comparison stimulus until you get to the point where pp is JUST able to see a difference between the two
- we get soft thresholds again
- amount of difference you need between stimuli to reach certain performance level
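A minimal sketch of reading a difference threshold off measured performance by linear interpolation. The data points are invented for illustration (not real results), and the 75% correct criterion is an assumption.

```python
def threshold_from_data(differences, proportion_correct, criterion=0.75):
    """Interpolate the stimulus difference at which performance reaches the criterion.

    Expects differences in increasing order. Returns None if the
    criterion is never crossed by the data.
    """
    points = list(zip(differences, proportion_correct))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None


# Hypothetical data: contrast difference between standard and comparison,
# and the proportion of trials on which pp picked the higher-contrast one.
contrast_differences = [0.01, 0.02, 0.04, 0.08]
proportion_correct = [0.52, 0.60, 0.80, 0.98]

difference_threshold = threshold_from_data(contrast_differences, proportion_correct)
print(difference_threshold)
```

Choosing a different criterion (for example 80% correct) would give a different threshold from the same data, which is why the performance level always has to be stated.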
what is the Weber fraction?
- in one hand we have a weight of 100g
- what weight do we need in the other hand to feel a difference between the two?
- let’s say 110g, so the just noticeable difference (JND) is 10g
- increase reference weight to 200g: JND is now 20g, so in the other hand we would need 220g
- ratio between JND and reference intensity is constant
- this constant is called Weber fraction
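The arithmetic can be sketched as follows, using the 100g reference and 10g JND from the example. The function names are hypothetical, and the sketch assumes the ratio really is constant across the references shown.

```python
def weber_fraction(jnd, reference):
    """Ratio between the just noticeable difference and the reference intensity."""
    return jnd / reference


def predicted_jnd(reference, fraction):
    """JND expected at a new reference if the Weber fraction stays constant."""
    return fraction * reference


k = weber_fraction(jnd=10, reference=100)  # 10g JND at a 100g reference

for reference in (100, 200, 400):
    jnd = predicted_jnd(reference, k)
    # the comparison weight is the reference plus one JND
    print(f"reference {reference}g: JND about {jnd:.0f}g, comparison about {reference + jnd:.0f}g")
```

The JND grows in proportion to the reference weight, so heavier weights need a larger absolute difference before they feel different.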