Week 3: Videos Flashcards

1
Q

Describe the purpose of ‘Optical flow’ in the context of video classification.

A

Optical flow is the apparent motion of pixels between consecutive video frames. It is sometimes used as an input feature for CNN-based video classification, and it also has applications in tracking, video recognition, and motion magnification.

2
Q

What are the different models used in video classification?

A
  1. Single-Frame CNN: Applies a 2D CNN to each video frame independently, then averages the predicted probabilities.
  2. Late Fusion by MLP: Fuses information from multiple frames using a Multilayer Perceptron.
  3. Late Fusion by Pooling: Extracts per-frame features, then combines them across space and time through pooling operations before classification.
  4. 3D CNN: Captures spatial and temporal information using 3D convolutional layers.
  5. Optical Flow-Based Methods: Uses optical flow to capture motion alongside raw pixel data.
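
As a rough illustration of the single-frame approach in item 1, the sketch below averages per-frame class probabilities into one video-level prediction. The probabilities are made-up numbers standing in for the softmax outputs of a 2D CNN, and the function names are hypothetical.

```python
def average_frame_probabilities(frame_probs):
    """Average per-frame class probabilities into one video-level distribution."""
    num_frames = len(frame_probs)
    num_classes = len(frame_probs[0])
    return [sum(p[c] for p in frame_probs) / num_frames for c in range(num_classes)]


def predict_video_class(frame_probs):
    """Return the index of the class with the highest averaged probability."""
    video_probs = average_frame_probabilities(frame_probs)
    return max(range(len(video_probs)), key=lambda c: video_probs[c])


# Hypothetical softmax outputs for 3 frames and 3 classes (not real model output).
frame_probs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.6, 0.3, 0.1],
]
video_probs = average_frame_probabilities(frame_probs)
predicted_class = predict_video_class(frame_probs)
```

Here the second frame on its own would vote for class 1, but the average over all frames still favours class 0.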
3
Q

What are the six major steps involved in human pose recognition?

A
  1. Data Collection: Gather datasets containing human images with annotated poses.
  2. Preprocessing: Normalize images, handle noise, and perform image augmentation.
  3. Feature Extraction: Identify key points or joints on the human body.
  4. Model Training: Utilize machine learning or deep learning models for pose estimation.
  5. Evaluation: Assess model performance on test data using metrics like PCK (Percentage of Correct Keypoints).
  6. Real-time Deployment: Implement the trained model for live pose recognition applications.
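
The PCK metric from step 5 can be sketched as follows. This is a minimal version that assumes 2D keypoints and a fixed pixel threshold; in practice the threshold is usually set relative to a body measure such as head or torso size. The keypoint coordinates are invented for illustration.

```python
import math


def pck(predicted, ground_truth, threshold):
    """Fraction of keypoints predicted within `threshold` pixels of the ground truth."""
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(ground_truth)


# Hypothetical keypoints (x, y) for four joints.
predicted = [(10, 10), (20, 22), (35, 30), (50, 50)]
ground_truth = [(10, 12), (20, 20), (30, 30), (50, 58)]

score = pck(predicted, ground_truth, threshold=3)  # distances are 2, 2, 5, 8
print(score)  # 0.5
```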
4
Q

What does “association” refer to in the context of pedestrian tracking?

A

Association in pedestrian tracking means linking detections across frames so that each pedestrian keeps the same track (identity) over time.
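
One common way to score how well a detection matches an existing track is the overlap of their bounding boxes. The sketch below assumes boxes given as (x1, y1, x2, y2) and uses intersection over union (IoU) as the similarity measure; this choice is an assumption for illustration, not something the card specifies.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: one track from the previous frame, two new detections.
track_box = (0, 0, 10, 10)
detections = [(1, 1, 11, 11), (50, 50, 60, 60)]

# Associate the track with the detection that overlaps it most.
scores = [iou(track_box, det) for det in detections]
best_match = max(range(len(detections)), key=lambda i: scores[i])
```

The nearby detection overlaps the track heavily, while the distant one has zero overlap, so the track is continued with the first detection.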

5
Q

What is the Linear Assignment Problem in tracking, and why is it relevant?

A

The Linear Assignment Problem is the task of matching detections in the current frame to existing tracks, one-to-one, so that the total cost given by a cost matrix is minimized. It is relevant because a minimum-cost matching yields consistent associations between detections and tracks, which is crucial for maintaining accurate object tracks in multi-object tracking.
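
A small worked instance may make this concrete. The sketch below solves the assignment by brute force over all permutations, which is only practical for a handful of objects; real trackers typically use the Hungarian algorithm instead (for example `scipy.optimize.linear_sum_assignment`). The cost values are invented and could stand for something like 1 minus box overlap; a square matrix (equal numbers of tracks and detections) is assumed.

```python
from itertools import permutations


def solve_assignment(cost):
    """Brute-force minimum-cost one-to-one assignment for a square cost matrix.

    Returns (total_cost, assignment), where assignment[i] is the detection
    index matched to track i.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical costs: rows are existing tracks, columns are new detections.
cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.8, 0.6, 0.3],
]
total_cost, assignment = solve_assignment(cost)
```

With these costs, each track is matched to the detection with the same index, for a total cost of about 0.6.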

6
Q

How does applying a pre-trained Mask R-CNN relate to a simple tracking algorithm?

A

A pre-trained Mask R-CNN acts as the detection stage of a simple tracking algorithm: it detects and segments pedestrians in each frame, and those detections are then associated across frames to form robust pedestrian tracks.

7
Q

What is video?

A

A sequence of images (frames), giving data with three dimensions: width, height, and time.

8
Q

What are the challenges of video processing?

A

Size, complexity of capturing temporal relationships between frames, and computational cost.
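
To get a feel for the size problem, the sketch below computes the raw (uncompressed) size of a clip. The resolution, frame rate, and duration are example values, and 3 bytes per pixel (8-bit RGB) is an assumption.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Uncompressed size in bytes: one value per pixel, per channel, per frame."""
    return width * height * bytes_per_pixel * fps * seconds


# One minute of 1920x1080 video at 30 frames per second, 8-bit RGB.
size = raw_video_bytes(1920, 1080, fps=30, seconds=60)
print(size)  # 11197440000 bytes, roughly 11.2 GB
```

This is why video models usually work on short, low-resolution, subsampled clips rather than on full raw video.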

9
Q

What are the three main approaches to video classification?

A

Single frame, late fusion, and early fusion.

10
Q

What are the advantages of 3D convolutional neural networks (C3D) for video classification?

A

They operate directly on the 3D video data and can capture temporal relationships.
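
The sketch below illustrates what "operating directly on the 3D video data" means: a kernel slides over time as well as height and width. It is a plain-Python, single-channel, unpadded 3D cross-correlation written for clarity, not an implementation of C3D itself. The tiny video and the temporal-difference kernel are made up.

```python
def conv3d_valid(video, kernel):
    """Unpadded 3D cross-correlation of a T x H x W video with a kT x kH x kW kernel."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kT + 1):
        plane = []
        for y in range(H - kH + 1):
            row = []
            for x in range(W - kW + 1):
                acc = 0
                # Sum over the kernel's temporal and spatial extent.
                for dt in range(kT):
                    for dy in range(kH):
                        for dx in range(kW):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Three 2x2 frames: a pixel switches on in frame 1 and stays on in frame 2.
video = [
    [[0, 0], [0, 0]],
    [[0, 5], [0, 0]],
    [[0, 5], [0, 0]],
]
# A 2x1x1 kernel that subtracts the earlier frame from the later one.
temporal_diff = [[[-1]], [[1]]]
response = conv3d_valid(video, temporal_diff)
```

The response is non-zero only where the pixel changes between frames 0 and 1, which is the kind of temporal pattern a 2D convolution on single frames cannot see.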

11
Q

What is optical flow?

A

A measure of pixel motion between frames, providing information about movement.
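
As a heavily simplified illustration, the sketch below estimates a single global displacement between two small grayscale frames by trying every integer shift in a small window and keeping the one with the lowest mean squared difference. Real optical flow methods estimate a motion vector per pixel; the function name and the toy frames are hypothetical.

```python
def estimate_shift(prev, curr, max_shift=2):
    """Return (dx, dy) such that prev[y][x] best matches curr[y + dy][x + dx]."""
    H, W = len(prev), len(prev[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err, count = 0.0, 0
            for y in range(H):
                for x in range(W):
                    ny, nx = y + dy, x + dx
                    # Compare only pixels that stay inside the frame after shifting.
                    if 0 <= ny < H and 0 <= nx < W:
                        err += (prev[y][x] - curr[ny][nx]) ** 2
                        count += 1
            if count and (best is None or err / count < best[0]):
                best = (err / count, dx, dy)
    return best[1], best[2]


def frame_with_dot(x, y, size=5):
    """A size x size frame of zeros with one bright pixel at (x, y)."""
    return [[9 if (cx, cy) == (x, y) else 0 for cx in range(size)] for cy in range(size)]


prev_frame = frame_with_dot(1, 2)
curr_frame = frame_with_dot(2, 2)  # the dot moved one pixel to the right
dx, dy = estimate_shift(prev_frame, curr_frame)
```

For these frames the estimated motion is one pixel in x and none in y.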

12
Q

What are key joint locations in human pose estimation?

A

Landmarks on the body such as elbows, wrists, and knees.

13
Q

What is the purpose of synthetic depth-image datasets for human pose recognition?

A

To provide variations in appearance and pose for training models.

14
Q

What are the advantages of decision tree classifiers for human pose recognition?

A

They are fast, efficient, and can handle multiple body parts.

15
Q

What is OpenPifPaf?

A

An open-source software tool for human pose estimation and tracking.
