3.1 Discrete and continuous data Flashcards

1
Q

What is probability density?

A

Probability density describes the relative likelihood of a continuous random variable taking values near a given observation.

In probability theory, a probability density function, or density of a continuous random variable, is a function whose value at any given sample in the sample space can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.

The values of a probability density function are not themselves probabilities; they must first be integrated over a range to yield a probability.
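A minimal sketch (not from the card) using scipy.stats to illustrate the point: the pdf value at a point can exceed 1 and is only a relative likelihood, while integrating the density over an interval, here via the cdf, gives an actual probability.

from scipy.stats import norm

# Narrow Gaussian chosen arbitrarily for illustration
dist = norm(loc=0.0, scale=0.25)

density_at_zero = dist.pdf(0.0)                    # about 1.6: a density value, not a probability
prob_near_zero = dist.cdf(0.1) - dist.cdf(-0.1)    # P(-0.1 <= X <= 0.1), a true probability
print(density_at_zero, prob_near_zero)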

2
Q

What is Kernel Density Estimation?

A

It is a technique for estimating the unknown probability distribution of a random variable from a sample of points drawn from that distribution. The result is usually a smooth curve whose smoothness depends on the bandwidth parameter h.
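A minimal sketch of KDE with scipy's gaussian_kde, where the bw_method argument plays the role of the smoothing parameter h; the sample and bandwidth value are made up.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.5, size=200)   # sample from an "unknown" distribution

kde = gaussian_kde(sample, bw_method=0.3)           # smaller bandwidth -> bumpier curve
grid = np.linspace(0, 10, 101)
estimated_density = kde(grid)                       # smooth estimate of the density over the grid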

3
Q

One-hot encoding converts a nominal attribute with D levels into how many boolean attributes?

A

In one-hot encoding, a nominal attribute with D levels is converted into a vector of length D, i.e. D boolean attributes.
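A minimal sketch with pandas, assuming a made-up attribute "colour" with D = 3 levels; pd.get_dummies produces the D boolean columns.

import pandas as pd

df = pd.DataFrame({"colour": ["red", "green", "blue", "green"]})
encoded = pd.get_dummies(df, columns=["colour"])
# encoded now has the columns colour_blue, colour_green, colour_red,
# with exactly one of them true in each row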

4
Q

Which of these discretisation methods does not require you to choose the number of bins?

A

Most discretisation methods require the user to choose the number of bins. However, supervised discretisation places boundaries between bins wherever there is a change in class label, so it does not require the user to specify the number of bins.

5
Q

What must be assumed when doing kernel density estimation?

A

The bandwidth of the kernel (= standard deviation, for a Gaussian KDE) must be assumed, and different values can produce different results.
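A minimal sketch (made-up bimodal sample) showing how different assumed bandwidths give different density estimates for the same data.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(0, 1, 100), rng.normal(6, 1, 100)])  # two modes

grid = np.linspace(-4, 10, 141)
narrow = gaussian_kde(sample, bw_method=0.1)(grid)   # bumpy, follows the sample closely
wide = gaussian_kde(sample, bw_method=1.0)(grid)     # smooth, can blur the two modes together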

6
Q

Which naive Bayes methods are designed for continuous attributes? (Select all that are correct)

A

Two options for continuous data are Gaussian and kernel density estimation (KDE) naive Bayes.

Gaussian naive Bayes assumes that continuous attributes have a Gaussian distribution.

KDE learns the distribution from the data. The other types of naive Bayes assume nominal variables.
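A minimal sketch of Gaussian naive Bayes with scikit-learn on synthetic continuous data (scikit-learn does not ship a KDE naive Bayes, so only the Gaussian variant is shown here).

import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])  # two continuous attributes
y = np.array([0] * 50 + [1] * 50)

model = GaussianNB().fit(X, y)          # fits a Gaussian per class and attribute
print(model.predict([[2.5, 2.5]]))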

7
Q

What is Discretisation, and where might it be used?

A

We have a (continuous) numeric attribute, but we wish to have a nominal (or ordinal) attribute. Some learners (like decision trees) generally work better with nominal attributes; some datasets inherently have groupings of values, where treating them as equivalent might make it easier to discern underlying patterns.

8
Q

Summarise some approaches to supervised discretisation?

A

The general idea is to sort the possible values and create a nominal value for each region where most of the instances have the same label, as sketched below.
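A minimal sketch of that idea, assuming the simple rule of cutting wherever the class label changes along the sorted values; the function name and data are made up.

def class_change_boundaries(values, labels):
    # Sort values together with their labels, then place a boundary at the
    # midpoint between every pair of neighbours whose labels differ.
    pairs = sorted(zip(values, labels))
    boundaries = []
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 != l2:
            boundaries.append((v1 + v2) / 2)
    return boundaries

print(class_change_boundaries([1.0, 2.0, 3.0, 7.0, 8.0], ["a", "a", "a", "b", "b"]))  # [5.0]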

9
Q

How does equal width binning work?

A

The algorithm divides the range of the data into k intervals of equal width. The width of each interval is:

w = (max-min)/k

And the interval boundaries are:

min+w, min+2w, … , min+(k-1)w
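A minimal sketch with numpy using made-up values: the boundaries are computed exactly as above and np.digitize assigns each value to a bin.

import numpy as np

data = np.array([2.0, 3.5, 5.0, 7.5, 9.0, 10.0])   # made-up values
k = 4
w = (data.max() - data.min()) / k                   # w = (max - min) / k = 2.0
boundaries = data.min() + w * np.arange(1, k)       # min+w, min+2w, min+3w -> [4, 6, 8]
bins = np.digitize(data, boundaries)                # bin index (0..k-1) for each value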

10
Q

How does equal frequency binning work?

A

The algorithm divides the data into k groups such that each group contains approximately the same number of values.
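A minimal sketch with pandas, using made-up skewed values: qcut places roughly the same number of values in each of the k groups.

import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 100])     # the skewed value does not change group sizes
bins = pd.qcut(data, q=3, labels=["low", "mid", "high"])
print(bins.value_counts())                           # three values in each group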
