Week 4 - 3D Computer Vision Flashcards
(32 cards)
What is the fundamental problem with trying to recover the 3D structure of the scene from points matched between two images?
Fundamental ambiguity: any point on the ray OP projects to the same image location, so the depth of P along that ray cannot be recovered from one image alone.
What is Stereo Correspondence?
Find matching pixels/features in 2 or more images and convert their 2D positions into 3D depths
What can resolve the fundamental ambiguity in stereo correspondence?
A second camera can resolve the ambiguity, enabling depth measurement via triangulation
How do you achieve depth recovery using two cameras?
You use triangulation, which requires:
- Knowledge of absolute and relative camera geometry i.e. Calibration
- Point correspondence i.e. which rays to intersect
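A minimal sketch of the two requirements above, assuming calibration has already given each camera's optical centre and the direction of the ray through each matched pixel. The midpoint method is only one simple way to intersect two rays, and the name triangulate_midpoint is hypothetical.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the closest points on two rays c_i + t_i * d_i."""
    c1, d1, c2, d2 = (np.asarray(v, dtype=float) for v in (c1, d1, c2, d2))
    # Least-squares solve of t1*d1 - t2*d2 = c2 - c1 for (t1, t2).
    A = np.stack([d1, -d2], axis=1)
    t, *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    p1 = c1 + t[0] * d1  # closest point on ray 1
    p2 = c2 + t[1] * d2  # closest point on ray 2
    return (p1 + p2) / 2

# Illustrative setup: two camera centres 1 unit apart viewing one scene point.
point = np.array([0.5, 0.2, 4.0])
c_left = np.array([0.0, 0.0, 0.0])
c_right = np.array([1.0, 0.0, 0.0])
estimate = triangulate_midpoint(c_left, point - c_left, c_right, point - c_right)
```

With noisy correspondences the two rays no longer meet exactly, which is why the sketch returns the midpoint of their closest approach rather than a true intersection.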
What are the properties of camera calibration?
It recovers the intrinsic parameters of the cameras e.g. focal length, pixel size, principal point, lens distortion
The relative poses between cameras, also called extrinsic parameters, are also recovered e.g. the rotation, translation and scale that transform the left image onto the right
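To show how the intrinsic parameters fit together, here is a sketch under the standard pinhole model (an assumption; the cards do not name a model). The function names and the numeric values are illustrative only, and lens distortion is ignored.

```python
import numpy as np

def intrinsic_matrix(focal_length_mm, pixel_size_mm, principal_point_px):
    """Build a 3x3 intrinsic matrix, assuming square pixels and no skew."""
    f_px = focal_length_mm / pixel_size_mm  # focal length expressed in pixels
    cx, cy = principal_point_px
    return np.array([[f_px, 0.0, cx],
                     [0.0, f_px, cy],
                     [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    x = K @ np.asarray(X_cam, dtype=float)
    return x[:2] / x[2]  # divide by depth

# Illustrative values: 4 mm lens, 5 micron pixels, principal point (320, 240).
K = intrinsic_matrix(4.0, 0.005, (320.0, 240.0))
pixel = project(K, [0.5, -0.25, 2.0])
```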
What is the easiest way to perform camera calibration?
Simplest approach is to use a known calibration target object
What additional geometric distortions are present within lenses used in cameras?
- Decentering errors: Displacement of the lens centre from the optical axis
- Radial distortion: Variation in light refraction across the lens, mostly in wide-angle lenses
What is the image warping parameter?
The image warping parameter is estimated so that the ideal projected coordinate can be warped to the distorted (observed) coordinate. The vector k contains the warping parameters.
What is the equation for image warping in stereo correspondence?
x’ = warp(x, k)
Where:
- x = Ideal image coordinate (no distortion)
- x’ = Observed image coordinate with distortion
- k = Warping (distortion) parameters
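The cards do not say what form warp takes. As one possible instance, the sketch below assumes a two-coefficient polynomial radial model on coordinates measured from the distortion centre; the coefficient values are made up for illustration.

```python
import numpy as np

def warp(x, k):
    """Map ideal coordinates x (N x 2) to distorted coordinates.

    Assumed model: x' = x * (1 + k[0]*r^2 + k[1]*r^4), r = distance from centre.
    """
    x = np.asarray(x, dtype=float)
    r2 = np.sum(x**2, axis=-1, keepdims=True)  # squared radius of each point
    return x * (1.0 + k[0] * r2 + k[1] * r2**2)

ideal = np.array([[0.0, 0.0], [0.5, 0.0], [0.3, 0.4]])
observed = warp(ideal, k=(-0.2, 0.0))  # negative k[0] pulls points inward
```

Under this model the centre is unaffected and the displacement grows with radius, which matches the observation that radial distortion is strongest towards the edges of wide-angle images.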
How are points generally defined in 3D space in stereo correspondence?
Points in 3D space are expressed in a separate coordinate frame known as the world coordinate frame. The coordinates of a point P in the camera and world coordinate systems are related by: Xcam = R(Xw - c)
Where:
- c = 3x1 vector representing the coordinates of the camera in the world coordinate system
- R = 3x3 matrix representing the orientation of the camera
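A small numeric sketch of Xcam = R(Xw - c). The rotation and camera position are arbitrary illustrative values, and world_to_camera is a hypothetical helper name.

```python
import numpy as np

def world_to_camera(X_w, R, c):
    """Apply Xcam = R (Xw - c)."""
    return R @ (np.asarray(X_w, dtype=float) - np.asarray(c, dtype=float))

# Illustrative pose: camera at (1, 0, 0), rotated 90 degrees about the y-axis.
c = np.array([1.0, 0.0, 0.0])
R = np.array([[0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])

X_cam = world_to_camera([3.0, 0.0, 0.0], R, c)
origin = world_to_camera(c, R, c)  # the camera centre maps to the origin
```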
What is the purpose of camera calibration in a mathematical sense?
It is to calculate the intrinsic, extrinsic and distortion parameters.
What is epipolar geometry?
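Epipolar geometry describes the constraint between two views: given the two optical centres and a point in one image, we can compute the epipolar plane and hence the corresponding epipolar line in the other image, on which the matching point must lie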
Why is epipolar geometry important for camera calibration?
Given two calibrated cameras, it’s possible to retrieve the actual 3D coordinate of a corner in the image
How can correspondence be used to measure depth of an object in an image?
Correspondence allows measurement of disparity: The difference in the image coordinates of the projections of a given world point into each camera.
Depth is inversely proportional to disparity.
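For the common special case of a rectified pair (parallel optical axes, assumed here and not stated in the cards), the inverse relation takes the form Z = f * B / d, with focal length f in pixels, baseline B, and disparity d in pixels. A sketch with illustrative numbers:

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair (assumed setup)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Illustrative camera: f = 800 px, baseline = 0.1 m.
near = depth_from_disparity(40.0, 800.0, 0.1)  # large disparity, close point
far = depth_from_disparity(10.0, 800.0, 0.1)   # small disparity, distant point
```

A quarter of the disparity gives four times the depth, which is the inverse proportionality stated above.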
How does correspondence search work?
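- Choose a reference window around the pixel of interest in the original (left) image
- Slide a window of the same size along the corresponding scanline in the right image, comparing its content with the reference window; the best-matching position gives the disparity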
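The search above can be sketched as follows, assuming a rectified pair (so matches lie on the same row), a sum-of-squared-differences comparison, and the convention that a point at column x in the left image appears at column x - d in the right. The function name and the synthetic test images are illustrative.

```python
import numpy as np

def match_along_scanline(left, right, row, col, half, max_disparity):
    """Return the disparity whose right-image window best matches the left one."""
    ref = left[row - half:row + half + 1, col - half:col + half + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disparity + 1):
        c = col - d  # candidate column in the right image
        if c - half < 0:
            break  # window would fall off the image
        cand = right[row - half:row + half + 1, c - half:c + half + 1].astype(float)
        cost = np.sum((ref - cand) ** 2)  # sum of squared differences
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

# Synthetic pair: the left image is the right image shifted 5 pixels to the right.
rng = np.random.default_rng(0)
right = rng.random((20, 40))
true_d = 5
left = np.roll(right, true_d, axis=1)
d = match_along_scanline(left, right, row=10, col=20, half=2, max_disparity=10)
```

The half parameter sets the window size (2 * half + 1 pixels per side), so it is the knob that trades smoothness against detail.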
What is the effect of a window size in correspondence search?
- Larger window size: Smooth disparity maps but less detail captured
- Smaller window size: More detail, but also more noise captured
What are some problems with correlation-based stereo?
- Window size is fixed across the image, but viewed objects differ in size and depth
- Uniform (textureless) regions match equally well everywhere, so the correct match is ambiguous
- Can provide a dense disparity map, but values are only reliable where there is some local variation in intensity e.g. near edges
- Dense disparity is computationally expensive in the spatial domain
How do you get ground truth data?
- Alternative/competing sensors
- Artificial images
- Real images
What are some problems that can be encountered when gathering ground truth data?
- Automatic methods can have errors
- Manual methods are slow, subjective and also error prone
- Standard datasets may not have the properties you are trying to evaluate
What is True Positive defined as?
The algorithm correctly predicts the presence of an object that is in the image
What is False Positive defined as?
The algorithm predicts the presence of an object but that object is not present in the image
What is False Negative defined as?
The algorithm misses an object
What is Precision’s equation, and how is it defined?
Precision = TP/(TP + FP)
Fraction of responses that were correct
What is Recall’s equation, and how is it defined?
Recall = TP/(TP + FN)
Fraction of the objects actually present that were identified
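A short worked example of both formulas. The counts are made up for illustration, and the functions assume non-zero denominators.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP); assumes at least one detection was made."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Recall = TP / (TP + FN); assumes at least one object was present."""
    return tp / (tp + fn)

# Illustrative detector output: 8 correct detections, 2 false alarms, 8 misses.
tp, fp, fn = 8, 2, 8
p = precision(tp, fp)  # 8 of 10 detections were correct
r = recall(tp, fn)     # 8 of 16 objects were found
```

Here the detector is rarely wrong when it fires but finds only half of the objects, so high precision does not imply high recall.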