How to Select Categorical Input Features: Encoding and K-best Flashcards

1
Q

DOES PANDAS TRY TO MAP SOME STR INPUTS TO NUMERICAL VALUES IN THE DATASET? IF YES, WHAT SHOULD WE DO ABOUT IT? P137

A

Yes. Pandas infers column types when it reads a file, so categories stored as numeric-looking strings are loaded as numbers. That's why it's better to convert such columns to str after reading the dataset file.
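
A minimal sketch of this behaviour, assuming a small made-up CSV (the column names `age_group` and `tumor_size` are hypothetical, not taken from the source). It casts the whole frame to str after reading; passing `dtype=str` to `read_csv` is an alternative that skips the inference step altogether.

```python
import io

import pandas as pd

# Hypothetical data: age_group holds category codes that look like integers.
csv_text = "age_group,tumor_size\n3,10-14\n4,20-24\n3,15-19\n"

# Default read: pandas infers a numeric dtype for age_group.
inferred = pd.read_csv(io.StringIO(csv_text))

# Convert every column to str after reading, as the card suggests.
cast_after = inferred.astype(str)

# Alternative: ask read_csv to keep everything as text from the start.
as_text = pd.read_csv(io.StringIO(csv_text), dtype=str)

print(inferred.dtypes)
print(cast_after["age_group"].tolist())
```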

2
Q

DOES THE ORDINALENCODER IN SCIKIT-LEARN ALLOW SPECIFYING THE ORDER OF CATEGORIES? P139

A

Yes, it does, through its categories parameter.
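
A short sketch of specifying the order, assuming a made-up feature with the levels low, medium and high (the data is illustrative only). The `categories` parameter takes one list per input column, and the position of each value in that list becomes its code.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical single-column feature with a natural order.
X = np.array([["low"], ["high"], ["medium"], ["low"]], dtype=object)

# One list per column; the list order defines the integer codes.
encoder = OrdinalEncoder(categories=[["low", "medium", "high"]])
X_encoded = encoder.fit_transform(X)

print(X_encoded.ravel())
```

Without `categories`, the encoder would sort the values it finds, so "high" would be coded before "low".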

3
Q

WHAT IS THE DIFFERENCE BETWEEN ORDINALENCODER AND LABELENCODER? P139

A

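LabelEncoder encodes a single variable (a 1-D array, typically the target), so you can't pass it a DataFrame with several columns. OrdinalEncoder encodes a 2-D array of input features, one column per variable.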
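
The difference in expected input shape can be sketched as follows, assuming a tiny made-up dataset with two categorical features and a binary target (all values are hypothetical).

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical inputs: two categorical columns (2-D) and one target (1-D).
X = np.array(
    [["red", "small"], ["blue", "large"], ["red", "large"]], dtype=object
)
y = np.array(["no", "yes", "no"])

# OrdinalEncoder works column by column on the 2-D feature array.
X_encoded = OrdinalEncoder().fit_transform(X)

# LabelEncoder expects a single 1-D variable, here the target.
y_encoded = LabelEncoder().fit_transform(y)

print(X_encoded.shape, y_encoded.shape)
```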

4
Q

WHEN USING K-BEST FOR CHI2 TEST, WHAT DO HIGHER SCORES INDICATE? P141

A

Stronger dependence between the feature and the target. The value in scores_ is the chi-squared statistic, not a p-value; the p-values are stored separately in pvalues_.
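
A minimal sketch of reading the scores, assuming a toy dataset in which the first feature copies the target and the second is unrelated to it (the data is invented for illustration).

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical encoded data: column 0 equals the target, column 1 does not
# depend on it.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.array(
    [[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
)

# Keep the single feature with the highest chi-squared score.
selector = SelectKBest(score_func=chi2, k=1)
X_selected = selector.fit_transform(X, y)

# Higher score means stronger dependence on the target.
for index, score in enumerate(selector.scores_):
    print(f"feature {index}: score={score:.3f}")
```

Note that `chi2` requires non-negative feature values, which ordinal or one-hot encoded categories satisfy.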

5
Q

HOW CAN WE USE MUTUAL INFO IN K-BEST FOR CLASSIFICATION PROBLEMS? P152

A

By setting the “score_func” parameter of SelectKBest to mutual_info_classif
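
A hedged sketch of that setting, reusing an invented toy dataset where the first feature copies the target. Wrapping `mutual_info_classif` in `functools.partial` to pass `discrete_features=True` is an assumption made here because the inputs are categorical codes; passing `mutual_info_classif` directly also works.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Hypothetical encoded data: column 0 equals the target, column 1 does not
# depend on it.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.array(
    [[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
)

# Treat all inputs as discrete, since they are encoded categories.
score_func = partial(mutual_info_classif, discrete_features=True)

selector = SelectKBest(score_func=score_func, k=1)
X_selected = selector.fit_transform(X, y)

# Higher mutual information means the feature tells us more about the target.
print(selector.scores_)
```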
