Attribute Selection and Imbalanced Classes Flashcards

1
Q

What are the two approaches that can be used to select attributes?

A

Filter and Wrapper

2
Q

What is the Filter approach?

A

Attribute selection is INDEPENDENT of the classification algorithm to be applied later.
- Evaluates the quality of a candidate attribute subset without using the target classification algorithm.
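
A minimal sketch of one possible filter, assuming numeric attributes and a numeric class label. It scores each attribute on its own by absolute Pearson correlation with the class, which is a simplification of scoring whole subsets. The function names and toy data are hypothetical; the point is that no classifier is ever run.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column carries no information, so score it as 0.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def filter_rank(rows, labels):
    """Rank attribute indices by |correlation| with the class label.

    No classification algorithm is involved: that is what makes this a filter.
    """
    n_attrs = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)

# Hypothetical toy data: attribute 0 tracks the class, attribute 1 is mostly noise.
rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 5.0]]
labels = [0, 0, 1, 1]
ranking = filter_rank(rows, labels)  # attribute indices, best first
```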

3
Q

What is the Wrapper approach?

A

Attribute selection is TAILORED to the classification algorithm to be applied later.
- The quality of a candidate attribute subset is evaluated based on the classification accuracy (on training data) of the target classification algorithm run with that attribute subset.
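
A minimal sketch of a wrapper-style evaluation function, assuming the target classifier is a simple nearest-centroid classifier (chosen here only to keep the example short; the card does not name a classifier). The function name and toy data are hypothetical. A candidate subset is scored by the training accuracy the classifier reaches when it sees only those attributes.

```python
def centroid_accuracy(rows, labels, subset):
    """Training accuracy of a nearest-centroid classifier restricted to `subset`."""
    # Project every row onto the candidate attribute subset.
    proj = [[r[j] for j in subset] for r in rows]

    # Build one centroid (mean vector) per class.
    centroids = {}
    for c in set(labels):
        members = [p for p, y in zip(proj, labels) if y == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    def predict(p):
        # Assign the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )

    correct = sum(predict(p) == y for p, y in zip(proj, labels))
    return correct / len(labels)

# Hypothetical toy data: attribute 0 separates the classes, attribute 1 does not.
rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 5.0]]
labels = [0, 0, 1, 1]
acc_attr0 = centroid_accuracy(rows, labels, [0])
acc_attr1 = centroid_accuracy(rows, labels, [1])
```

A wrapper would prefer the subset with the higher score, here the one containing attribute 0.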

4
Q

What are the main components of most attribute selection methods?

A

A search method

An evaluation function

5
Q

What are the pros and cons of Sequential Attribute Selection Methods?

A

Pros:
- Forward and backward selection are simple to understand and implement.
- Forward sequential selection is fast.

Cons:
- Backward selection is quite slow.
- Both are heuristic methods: there is no guarantee of finding the optimal solution.
- Both are greedy (they do not cope well with attribute interactions).
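
A minimal sketch of forward sequential selection, assuming an evaluation function that maps an attribute subset to a score (higher is better). The score table below is hypothetical and was built so that attributes 0 and 1 are only useful together, which illustrates the greedy weakness listed above.

```python
def forward_selection(n_attrs, evaluate):
    """Greedily add the single attribute that most improves the score."""
    selected = []
    best_score = evaluate(selected)
    while True:
        candidates = [j for j in range(n_attrs) if j not in selected]
        if not candidates:
            break
        # Score every one-attribute extension of the current subset.
        scored = [(evaluate(selected + [j]), j) for j in candidates]
        score, j = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break  # no extension improves the score, so stop
        selected.append(j)
        best_score = score
    return selected, best_score

# Hypothetical scores: attributes 0 and 1 interact, attribute 2 is mildly useful alone.
SCORES = {
    frozenset(): 0.5,
    frozenset({0}): 0.5,
    frozenset({1}): 0.5,
    frozenset({2}): 0.7,
    frozenset({0, 1}): 1.0,
    frozenset({0, 2}): 0.7,
    frozenset({1, 2}): 0.7,
    frozenset({0, 1, 2}): 1.0,
}

def toy_evaluate(subset):
    return SCORES[frozenset(subset)]

chosen, chosen_score = forward_selection(3, toy_evaluate)
```

With these scores the search stops at attribute 2 alone, even though the pair {0, 1} scores higher: neither 0 nor 1 looks useful when added one at a time.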

6
Q

What is the approach that can be used to deal with imbalanced classes?

A

Re-sampling techniques

7
Q

What are the different Re-sampling techniques?

A

Under-sampling the majority class
Over-sampling the minority class
Hybrid approaches

8
Q

How does Under-sampling work?

A

Remove randomly chosen instances from the majority class. This:
- Throws away a lot of potentially relevant information.
- Reduces the time taken to run the classification algorithm.
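
A minimal sketch of random under-sampling. It assumes every class is cut down to the size of the smallest class, which is one common choice but not the only one; the function name, seed, and toy data are hypothetical.

```python
import random
from collections import Counter

def random_undersample(rows, labels, seed=0):
    """Keep a random sample of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    keep = []
    for c in counts:
        idx = [i for i, y in enumerate(labels) if y == c]
        # The minority class is kept whole; larger classes lose instances.
        keep.extend(rng.sample(idx, target))
    keep.sort()  # preserve the original ordering of the surviving instances
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Hypothetical toy data: 8 majority instances (class 0), 2 minority (class 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
under_rows, under_labels = random_undersample(rows, labels)
```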

9
Q

How does Over-sampling work?

A

Duplicate minority class instances chosen at random, or create new synthetic minority class instances. This:
- Avoids the loss of instances associated with under-sampling.
- Introduces redundancy, or new instances with potentially noisy class labels.
- Increases the time taken to run the classification algorithm.
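
A minimal sketch of the duplication variant of over-sampling. It assumes every class is grown to the size of the largest class by sampling with replacement; the function name, seed, and toy data are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen instances until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        # Add exact copies, which is the source of the redundancy noted above.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels

# Hypothetical toy data: 8 majority instances (class 0), 2 minority (class 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
over_rows, over_labels = random_oversample(rows, labels)
```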

10
Q

What is SMOTE?

A

Synthetic Minority Oversampling TEchnique

11
Q

What does SMOTE do?

A

Creates synthetic instances along the line segments joining a minority class instance and some or all of its nearest neighbours of the same minority class.
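
A minimal sketch of the interpolation idea, not a full SMOTE implementation. It assumes numeric attributes, Euclidean distance, and at least two minority instances; the function name, the parameters `n_new` and `k`, the seed, and the toy data are hypothetical.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points between minority instances and their neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority-class neighbours of the chosen instance.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class instances with two numeric attributes.
minority = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
new_points = smote_sketch(minority, n_new=5)
```

Each synthetic point lies between two existing minority instances, so it stays inside the region the minority class already occupies.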
