High-level perception: additional reading Flashcards
(9 cards)
1
Q
problem of shadow
A
- Wang & Yagi, 2014: propose a method that treats shadows as helpful cues rather than noise, combining shadow, motion, and appearance features to improve pedestrian detection accuracy in outdoor environments
2
Q
problem of variation
A
- Tanaka & Fujita, 2015
- neurons in visual area V4 adjust their response based on object distance, helping maintain size constancy: objects appear the same size even as their retinal image (in degrees of visual angle) changes with distance
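To make the size constancy point concrete, here is a minimal sketch of the standard visual angle formula, theta = 2 * atan(size / (2 * distance)). It is not taken from Tanaka & Fujita; the function name and the example values are assumptions chosen for illustration.

```python
import math

def visual_angle_deg(object_size, distance):
    """Visual angle in degrees subtended by an object.

    object_size and distance must be in the same units (e.g. metres).
    Hypothetical helper for illustration only.
    """
    return math.degrees(2 * math.atan(object_size / (2 * distance)))

# The same 0.5 m object viewed from increasing distances:
# the retinal image (visual angle) shrinks although the object does not.
for d in (1.0, 2.0, 4.0):
    print(f"distance {d} m -> {visual_angle_deg(0.5, d):.2f} degrees")
```

The angle roughly halves each time the distance doubles, which is the change in retinal size that size constancy has to compensate for.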
3
Q
problem of occlusion
A
- Saleh et al., 2021
- review how occlusion negatively affects object detection and classify existing methods for handling occlusions in various environments, emphasizing the need for more robust, adaptable solutions
4
Q
gestalt laws of perceptual organisation strengths
A
- Palmer, 1992: empirical support in visual perception. Numerous experimental studies support Gestalt grouping; for example, Palmer (1992) found that perceptual grouping by proximity and similarity occurs rapidly and automatically in visual processing
- Wagemans et al., 2012: Gestalt principles have inspired models in artificial intelligence and computer vision, especially for image segmentation and object recognition. For example, the principle of common fate has been implemented in motion-based grouping algorithms
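As a rough illustration of how common fate could be turned into a motion-based grouping rule, the toy sketch below groups points whose velocity vectors are nearly identical. It is not an algorithm described by Wagemans et al.; the function name, the greedy strategy, and the tolerance value are all assumptions.

```python
import math

def group_by_common_fate(velocities, tol=0.1):
    """Greedy grouping of points by similar motion (hypothetical helper).

    Each point joins the first group whose reference velocity lies
    within `tol` (Euclidean distance); otherwise it starts a new group.
    Returns a list of groups, each a list of point indices.
    """
    groups = []  # list of (reference_velocity, member_indices)
    for i, (vx, vy) in enumerate(velocities):
        for ref, members in groups:
            if math.hypot(vx - ref[0], vy - ref[1]) <= tol:
                members.append(i)
                break
        else:
            # No existing group moves like this point: start a new one.
            groups.append(((vx, vy), [i]))
    return [members for _, members in groups]

# Three dots drifting right and two drifting left (made-up values).
velocities = [(1.0, 0.0), (1.02, 0.01), (-1.0, 0.0), (0.98, -0.02), (-0.99, 0.03)]
print(group_by_common_fate(velocities))  # [[0, 1, 3], [2, 4]]
```

Dots that move together end up in the same group regardless of where they are, which is the core idea behind common fate.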
5
Q
gestalt laws of perceptual organisation limitations
A
- Pomerantz, 1981: The Gestalt laws often lack clear definitions or mathematical formalizations, making them difficult to test rigorously or predict outcomes precisely in novel scenarios
- Rock & Palmer, 1990: One of the primary criticisms is that the Gestalt laws are descriptive, not explanatory. They tell us what we perceive but not why or how these perceptual groupings occur in the brain
6
Q
(Biederman, 1987) strengths
A
- Biederman & Gerhardstein, 1993: Recognition-by-Components (RBC) claims recognition is generally viewpoint-invariant, which aligns with findings from human vision studies (Biederman & Gerhardstein, 1993). This is a key advantage over earlier theories that required exact image matching
- The theory provides a parsimonious explanation for object recognition by suggesting that only around 36 geons are needed to account for recognition of thousands of objects, simplifying the representational demands on the brain
7
Q
(Biederman, 1987) limitations
A
- Tarr & Bülthoff, 1995: While RBC claims viewpoint invariance, later studies showed that recognition can be viewpoint-dependent, particularly for novel or complex objects (Tarr & Bülthoff, 1995). This suggests that RBC may oversimplify how the brain processes 3D shape.
- Tanaka et al., 2021: RBC focuses on shape and structure but largely ignores texture, color, and material properties, which are known to influence recognition, especially in naturalistic settings
8
Q
Marr 1982 strengths
A
- Marr & Poggio, 1976: Marr’s theory was one of the first to define vision as a computational problem, distinguishing between three levels of analysis: computational theory, algorithm, and implementation. This framework has since become foundational in cognitive science and artificial intelligence
- Biederman, 1987: Marr’s model influenced structural description theories such as Biederman’s Recognition-by-Components theory, which also emphasizes part-based representation and object-centered descriptions
9
Q
Marr 1982 limitations
A
- Tarr & Pinker, 1989: Marr’s emphasis on an object-centered 3D representation has been criticized for lacking flexibility. Empirical evidence (e.g., Tarr & Pinker, 1989) suggests that human object recognition is often view-dependent, relying on stored views rather than canonical 3D models
- Kriegeskorte, 2015: Marr’s model largely assumed that vision operates via hardwired processing. Later models in cognitive neuroscience and machine learning (e.g., deep neural networks) emphasize learning, experience, and adaptation, aspects not fully addressed by Marr.