Audio Analysis and Assessment Flashcards

(298 cards)

1
Q

Audio represents…

A

sound pressure changes over time

2
Q

Sound is converted into what, by a transducer?

A

Voltage

3
Q

Audio can be … or …

A

Continuous or Discrete

4
Q

Continuous signal represents…

A

Real world sound pressure variations

5
Q

An example of continuous signal equipment is…

A

Analogue Equipment

6
Q

Discrete signal represents…

A

sound as a series of ones and zeros

7
Q

ADC stands for…

A

Analogue to Digital Converter

8
Q

Sampling frequency is found on what axis?

A

X-axis

9
Q

Sampling frequency is…

A

Number of samples taken per second

10
Q

Sampling frequency is measured in…

A

Hertz

11
Q

Amplitude quantisation is found on what axis?

A

Y-axis

12
Q

Amplitude quantisation is a…

A

Binary Encoding Scheme

13
Q

In amplitude quantisation, the number of bits dictates…

A

The number of levels we can represent

14
Q

What is the amplitude quantisation bit equation?

A

2^n
n = the number of bits

15
Q

What are three examples of digital audio files?

A

WAV, AIFF, AU

16
Q

In digital audio files, the Y-axis could represent? (4)

A
  1. Normalised (-1 - 1)
  2. Sample Value
  3. dB
  4. Percentage
17
Q

In digital audio files, the X-axis could represent? (2)

A
  1. Time
  2. Samples
18
Q

What does PCM stand for?

A

Pulse Code Modulation

19
Q

Aspects of the PCM encoding process directly affect…

A

Signal quality

20
Q

Digital data can only represent…

A

A finite set of values

21
Q

Digital data’s finite set of values is set by…

A

The number of bits

22
Q

What is digital value?

A

The nearest approximation of the analogue signal

23
Q

Approximation introduces what to each sample?

A

Quantisation error

24
Q

What is quantisation error?

A

It is the difference between the analogue input signal and the quantised level assigned by the encoder

25
Quantisation =
Approximation
26
When is the maximum quantisation error reached?
At a half step
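A minimal sketch of the half-step idea, assuming a uniform quantiser over a normalised range of -1 to 1 (the function names and the clamping behaviour are illustrative, not taken from the cards):

```python
def quantise(x, bits):
    """Round x (normalised, -1.0 to 1.0) to the nearest of 2**bits uniform levels."""
    levels = 2 ** bits
    step = 2.0 / levels                      # width of one quantisation step
    index = round(x / step)                  # nearest level index
    # Clamp to the representable range (assumption: two's-complement style range)
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return index * step


def quantisation_error(x, bits):
    """Difference between the input value and the level assigned to it."""
    return x - quantise(x, bits)


# With 3 bits the step is 0.25, so the error peaks at 0.125 (a half step)
print(quantise(0.3, 3), quantisation_error(0.125, 3))
```

Inside the representable range the error never exceeds half a step; values above the top level are clipped instead.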
27
What does quantisation error create?
Quantisation Noise
28
What does SNR stand for?
Signal to Noise Ratio
29
As SNR increases, noise...
Decreases
30
As SNR decreases, the distance between signal and noise...
Decreases
31
What does SQNR stand for?
Signal to Quantisation Noise Ratio
32
What two factors does SQNR have?
1. Number of bits encoding audio 2. Input signal amplitude
33
What is the SQNR equation?
SQNR = (6.02*B)+1.76dB, where B = number of bits (16, 24, etc.)
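As a quick worked example of the SQNR equation above, the sketch below evaluates it for common bit depths, alongside the 2^n level count from the earlier card. The function names are illustrative.

```python
def quantisation_levels(bits):
    """Number of levels that n bits can represent (2^n)."""
    return 2 ** bits


def sqnr_db(bits):
    """Theoretical SQNR in dB for a full-scale input: 6.02 * B + 1.76."""
    return 6.02 * bits + 1.76


for b in (8, 16, 24):
    print(b, quantisation_levels(b), round(sqnr_db(b), 2))
```

Each extra bit adds about 6 dB, which matches the bit-depth cards that follow.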
34
Under what two conditions of the input signal is quantisation noise similar to white noise?
When the signal has large amplitudes or when the signal has a wide bandwidth
35
What two problems occur when input signal has a low amplitude?
1. Relative magnitude of distortion increases (SQNR decreases) 2. Quantisation noise is correlated with the input signal
36
What's the difference between quantisation distortion and white noise?
Distortion is more annoying due to its unpredictability
37
What are the two ways to reduce quantisation noise?
1. Increasing bit depth 2. Dither
38
How does increasing bit depth reduce quantisation noise?
Each additional bit increases SQNR by 6dB (halving QN)
39
Increasing bit depth causes what issue?
Increasing bit depth increases processing burden
40
What is dithering?
Adding noise to signal before sampling to reduce the audible effect of quantisation error
41
As well as reducing the audible effects of quantisation error, dithering does what at low amplitudes?
Randomises quantisation error
42
Why does dither work even though quantisation error can still be audible?
Noise is easier to listen to than distortion so dither helps make audio less annoying
43
Most audio we hear is... (hint - digital files, streaming)
Compressed
44
Noise is created when quantisation depth is manipulated by...
Compression
45
Nyquist frequency is...
Half of sampling rate
46
Signals sampled at discrete intervals have...
An upper limit to frequencies
47
When above Nyquist frequency, there is too long a period between...
Samples to reproduce the input signal correctly
48
What is aliasing?
When frequencies greater than Nyquist frequency appear as lower frequencies within the spectrum
49
What happens when sampling at twice the highest frequency in the spectrum?
A correct representation of the whole frequency spectrum
50
Aliasing can be looked at from both...
A time and frequency domain perspective
51
Aliasing can be avoided by having at least how many samples per cycle of waveform?
Two
52
When does aliasing occur? (2)
1. When sample rate is too low 2. When a signal above half the sampling frequency is observed by the system
53
Aliasing introduces what to audio?
Unwanted frequencies
54
What is the aliasing equation?
Af = Fs - F, where Fs = sampling frequency and F = input frequency
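A small sketch of the aliasing equation above. It assumes the formula is meant for inputs between the Nyquist frequency and the sampling frequency; inputs at or above Fs are rejected here rather than guessed at. The function name is illustrative.

```python
def alias_frequency(f_in, fs):
    """Frequency at which f_in appears after sampling at fs (Af = Fs - F)."""
    nyquist = fs / 2
    if f_in <= nyquist:
        return f_in                      # below Nyquist: no aliasing
    if f_in >= fs:
        # Assumption: the card's formula only covers Nyquist < F < Fs
        raise ValueError("formula only applies between Nyquist and Fs")
    return fs - f_in


# A 30 kHz tone sampled at 44.1 kHz shows up at 14.1 kHz
print(alias_frequency(30000, 44100))
```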
55
Aliasing affects what frequencies?
All frequencies above Nyquist frequency
56
Sampling process is called...
Pulse Code Modulation
57
What occurs around the carrier frequency?
Sidebands
58
Sidebands occur around the carrier if bands aren't...
Limited
59
Sidebands make output spectrum...
Complex
60
What is the sideband equation?
(n * Fc) +/- Fm
61
In terms of sidebands, what component of audio is the carrier and what component is the modulator?
Audio signal = Modulator, Sampling frequency = Carrier
62
The input signal spectrum forms sidebands around...
Integer multiples of the sampling frequency
63
When do sidebands move closer together (overlap)?
When sampling frequency is less than twice the highest frequency
64
When do sidebands increase in width (overlap)?
When audio signal is greater than Nyquist frequency
65
Anti-aliasing filters remove...
Frequencies above Nyquist frequency
66
1. abs() function is used for measuring... 2. Why?
1. Peak on bipolar waves 2. abs() ignores negative values
67
1. How do we measure dB? 2. If amplitude decreases by half, what is the change in dB?
1. 20log10(a/b) 2. -6dB
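A short check of the dB card above, assuming the logarithm is base 10 (the usual convention for amplitude ratios). The function name is illustrative.

```python
import math


def amplitude_ratio_db(a, b):
    """Level of amplitude a relative to amplitude b, in dB."""
    return 20 * math.log10(a / b)


print(round(amplitude_ratio_db(0.5, 1.0), 2))   # halving the amplitude
print(round(amplitude_ratio_db(2.0, 1.0), 2))   # doubling the amplitude
```

Halving comes out at roughly -6.02 dB, which the card rounds to -6dB.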
68
What is the dB change for every bit increased?
6dB
69
What does RMS stand for?
Root Mean Square
70
What does RMS represent?
Distribution of sample values
71
What info does RMS give us?
Average energy/power
72
RMS can be affected by...
Compression
73
What is the crest equation?
Crest = 20log(peak amplitude / RMS)
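The peak, RMS and crest cards can be tied together in a few lines. This is a sketch on synthetic test signals, assuming normalised samples and a base-10 logarithm; the function names are illustrative.

```python
import math


def peak_amplitude(samples):
    """Peak of a bipolar wave: abs() ignores the sign of each sample."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root of the mean of the squared sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def crest_db(samples):
    """Crest = 20 * log10(peak amplitude / RMS)."""
    return 20 * math.log10(peak_amplitude(samples) / rms(samples))


sine = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
square = [1.0 if k < 500 else -1.0 for k in range(1000)]
print(round(crest_db(sine), 2), round(crest_db(square), 2))
```

A sine wave has a crest of about 3 dB, while a square wave sits at 0 dB because its peak and RMS are equal.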
74
The ratio between peak amplitude and RMS is called...
Crest
75
Crest controls...
Relationship between average energy and peak values
76
What are the equations for frequency and period?
f = 1 / T and T = 1 / f
77
Why do audio signals change dynamically over time?
Because amplitude and frequency change
78
What is based on frequency, amplitude and time parameters?
Human hearing response
79
1. Distinguishing separate frequencies throughout audible frequency range isn't... 2. What is the term for the above?
1. Constant 2. Discrimination
80
As well as distinguishing separate frequencies throughout frequency range not being constant, what else is not constant?
Sensitivity
81
Amplitude response has a...
Very large dynamic range
82
What is the threshold of feeling in dB?
120dB
83
Give an example of a non-linear graph.
Fletcher Munson curve
84
The Fletcher Munson curve shows...
Non-linear sensitivity over frequency
85
As frequency increases, resolution...
Decreases
86
Humans find it harder to discriminate ... frequencies.
Higher
87
Log scales and constant Q reflect...
Human perception of frequency/pitch
88
What is constant Q?
Relation of bandwidth
89
As band centre frequency increases, frequency...
Increases
90
As bandwidth increases, frequency...
Increases
91
1. What is the equation for Q? 2. What is heavy cool about this?
1. Q = centre frequency / bandwidth 2. Q will always remain constant
92
What two things are crucial to audio processing operations?
1. Frequency 2. Amplitude
93
What does audio frequency analysis do?
Extract frequency from signal
94
Audio frequency analysis describes...
Frequency and amplitude over time
95
What is the most common approach to extract frequency information?
Fourier Analysis
96
Our boy, Fourier, stated - 'Any periodic function may be represented as...
An infinite series of harmonically related sinusoids'
97
In terms of Fourier, an input signal is a combination of...
Harmonically related sinusoids
98
Why do we want good frequency resolution?
To see down to the individual frequencies
99
Why do we want good time resolution?
To see down to a few milliseconds
100
We can think of Fourier analysis frequency resolution as...
A series of frequency bands or filters
101
1. In Fourier analysis frequency resolution, bands are... 2. Unlike...
1. Spaced linearly 2. Human hearing system
102
Analysis bins refer to...
Bands
103
Frequency resolution is determined by...
The number of samples of the input signal
104
Close spaced frequencies separate when...
Filters narrow
105
To increase accuracy, we can increase what three things?
1. Transform 2. Samples 3. Frequency Resolution
106
What is the bin bandwidth equation?
Bin bandwidth = Fs / length of transform (in samples)
107
What is the bin centre frequency equation?
Bin centre frequency = n * bin bandwidth
108
What is the length of transform equation?
Length of transform = Fs * t (seconds)
109
What is the window duration equation?
Window duration = number of samples * sample period
110
What is the sample period equation?
Sample period = 1 / Fs
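The five equations above (bin bandwidth, bin centre frequency, transform length, window duration, sample period) fit together as follows. This is a sketch with illustrative function names; rounding the transform length to a whole number of samples is an assumption.

```python
def sample_period(fs):
    """Sample period = 1 / Fs (seconds)."""
    return 1 / fs


def transform_length_samples(fs, seconds):
    """Length of transform = Fs * t, rounded to a whole number of samples."""
    return round(fs * seconds)


def bin_bandwidth(fs, n_samples):
    """Bin bandwidth = Fs / length of transform (Hz)."""
    return fs / n_samples


def bin_centre_frequency(n, fs, n_samples):
    """Bin centre frequency = n * bin bandwidth (Hz)."""
    return n * bin_bandwidth(fs, n_samples)


def window_duration(n_samples, fs):
    """Window duration = number of samples * sample period (seconds)."""
    return n_samples * sample_period(fs)


fs = 44100
print(bin_bandwidth(fs, 1024), bin_centre_frequency(10, fs, 1024))
print(window_duration(1024, fs), transform_length_samples(fs, 0.1))
```

A 1024-sample window at 44.1 kHz gives bins about 43 Hz wide and lasts about 23 ms, which shows the time and frequency trade-off in the next cards.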
111
What problem arises with frequency and time resolution?
1. Good frequency resolution results in bad time resolution 2. Good time resolution results in bad frequency resolution
112
If we analyse a whole track (3 mins), would we have good frequency or good time resolution?
Good frequency resolution
113
If we analyse a short segment (0.1 seconds), would we have good frequency or good time resolution?
Good time resolution
114
Does time resolution or frequency resolution have a smaller computational expense?
Time resolution
115
Fourier analysis is ... on the computer
Strenuous
116
What method is faster than Fourier analysis?
Fast Fourier Transform (FFT)
117
FFT requires transform length to be...
A power of two (256, 1024, 2048 samples)
118
FFT requires what to be a power of two?
Transform length
119
A window size that is a power of two will result in...
Faster processing
120
What is windowing?
A series of short analytical snippets throughout duration of signal
121
Windowing describes...
The evolution of frequency over time
122
Window still has a problem. What is it?
Frequency and time resolution trade off
123
Time resolution can be increased by overlapping...
Windows
124
What do spectrograms plot?
Analytical window over time
125
What do a spectrogram's X and Y axes show?
X = Time, Y = Frequency
126
What does colour on a spectrogram represent?
Magnitude (Amplitude)
127
What problems does frequency analysis have? (4)
1. Results are estimates 2. Computationally expensive 3. Windowing can confuse frequency readings 4. Doesn't reflect human hearing
128
In terms of windowing, instead of reading signal spectrum, we get...
A combination of signal and window spectrum
129
What is 'SpEcTrAl LeAkAgE'?
Unwanted Artefacts
130
Spurious Components are referred to as...
Side lobes
131
How can we reduce unwanted artefacts?
Use different window shapes
132
FFT has a good frequency response at...
Low frequencies
133
As window decreases, frequency resolution...
Decreases
134
FFT has good time resolution...
Throughout whole spectrum
135
As window decreases, time resolution...
Increases
136
Time and frequency resolution trade off can be resolved by...
Using adaptive window sizes
137
In terms of multi-resolution analysis, smaller windows would be used for...
Higher frequencies
138
In terms of multi-resolution analysis, we aim to have good frequency resolution at...
Lower frequencies
139
In terms of multi-resolution analysis, we aim to have good time resolution at...
Higher frequencies
140
In terms of multi-resolution analysis, window size varies with...
Frequency
141
What are the benefits of multi-resolution analysis? (2)
1. Resolves trade off 2. Can increase time and/or frequency resolution where it matters
142
Two key parameters of PCM are...
1. Sample rate 2. Bit depth
143
What is the formula for data per second using values from the following - 1 second of stereo PCM audio at 44.1kHz, 16 bit?
44,100 * 2 (bytes) * 2 (stereo) = 176.4kBps
144
What is the formula for bits per second using 176.4kB?
176.4kB * 8 = 1.4Mbps
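The two data-rate cards above can be checked with a short calculation. The function names are illustrative, and the bit depth is assumed to be a whole number of bytes.

```python
def pcm_bytes_per_second(sample_rate, bit_depth, channels):
    """Sample rate * bytes per sample * number of channels."""
    return sample_rate * (bit_depth // 8) * channels


def pcm_bits_per_second(sample_rate, bit_depth, channels):
    """Same rate expressed in bits: bytes per second * 8."""
    return pcm_bytes_per_second(sample_rate, bit_depth, channels) * 8


# Stereo, 44.1 kHz, 16 bit
print(pcm_bytes_per_second(44100, 16, 2))   # bytes per second
print(pcm_bits_per_second(44100, 16, 2))    # bits per second
```

The results are 176,400 bytes per second and 1,411,200 bits per second, which the cards round to 176.4kBps and 1.4Mbps.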
145
What does perceptual audio aim to do?
Reduce data required to represent audio
146
What do we call the process of cochlea hairs responding to strongest stimuli and ignoring anything weaker?
Masking
147
Stimuli temporarily raises...
Threshold of hearing
148
What are critical bands? (The Beatles aren't one of them)
Areas influenced by the temporary change in threshold of hearing
149
Critical bands are ... at lower frequencies.
Narrower
150
What pattern appears across hearing range?
Constant Q pattern
151
What is the CB bandwidth equation?
CB bandwidth = 94 + ( 71 * f^(3/2) ), where f = frequency in kHz
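A sketch evaluating the CB bandwidth equation above. It assumes the exponent 3/2 applies to f alone and that the result is in Hz; the card states neither explicitly.

```python
def critical_bandwidth(f_khz):
    """CB bandwidth = 94 + 71 * f^(3/2), with f in kHz (result assumed to be in Hz)."""
    return 94 + 71 * f_khz ** 1.5


for f in (0.25, 1.0, 4.0):
    print(f, round(critical_bandwidth(f), 1))
```

The bandwidth grows with frequency, so critical bands are narrower at lower frequencies.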
152
CB bandwidth is not ... at frequencies.
Fixed
153
CB depends on what two components of stimuli?
Intensity and frequency
154
What does critical band response aid? (5)
1. Frequency discrimination 2. Perceived loudness 3. Dissonance/Consonance 4. Clarity of speech 5. Masking
155
Scales representing spectral energy in ... ... help measure human perception.
Critical bands
156
Two common scales of CB response are...
1. Bark 2. Mel
157
What does Bark scale aim to measure?
Loudness
158
One critical band has the bandwidth of how many barks?
One
159
What does Mel scale aim to measure?
Perceived pitch
160
What do Bark and Mel scales help us to establish?
Sounds both audible and inaudible in signal
161
Bark and Mel scales underpin...
Masking
162
Where does masking occur in terms of frequency?
Specific range in frequency around tone (critical band)
163
Masking means that frequency in same range might be...
Inaudible
164
In terms of masking, what is the 'Masker'?
Louder tone
165
In terms of masking, what is the 'Maskee'?
Quieter tone
166
Masking is better as frequency...
Increases
167
As masker amplitude increases, masking curve becomes...
Broader
168
In terms of masking, temporary threshold increase...
Holds over given time
169
Masking threshold increase lasts longer when... (4)
1. Masker is louder 2. Masker and maskee are closer in frequency 3. Masker has lower frequency than maskee 4. Time between tones is shorter
170
What is backwards masking?
Sounds can be masked by tone which occurs after maskee
171
Backwards masking suggests that humans hear in...
Time frames
172
Backwards masking only occurs when both tones are in...
The same time block
173
What is the bits per sample equation?
bits per sample = bit rate / Fs
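For a sense of scale, the sketch below applies the bits per sample equation to a 128 kbps stream at 44.1 kHz. The bit rate is an example value chosen here, not one from the cards.

```python
def bits_per_sample(bit_rate, fs):
    """bits per sample = bit rate / Fs."""
    return bit_rate / fs


# Example values (assumed): 128 kbps compressed stream, 44.1 kHz sample rate
print(round(bits_per_sample(128_000, 44_100), 2))
```

That leaves an average of under 3 bits per sample against 16 for CD-quality PCM, which is why bit allocation matters so much.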
174
What is the key mechanism in perceptual codec?
Bit allocation
175
What is data reduction?
Dynamically altering number of bits used to represent signal to make less computationally demanding
176
As bits decrease, noise...
Increases
177
In adaptive allocation, loud tones get what to represent them?
More bits
178
In adaptive allocation, what aren't encoded?
Inaudible tones
179
In adaptive allocation, what happens to quantisation error noise? How?
1. It's masked 2. By keeping under the threshold
180
What does compressed audio look like to a computer?
Instructions on how to reconstruct the waveform
181
When a signal contains transients, input frames are split into how many segments? How many samples does each segment have?
1. Three 2. 384 samples
182
When a signal is static, input frames are split into how many segments? How many samples does each segment have?
Trick question 1. One frame 2. 1152 samples
183
As frame size decreases, noise...
Decreases
184
Is encoding process perfect?
No
185
In smaller frames, what issue occurs around transients?
Noise
186
What happens when noise occurs before transient?
Transient is smeared resulting in loss of definition
187
Input signal is split into sub-bands. How many, and what do the sub-bands have in common?
1. 32 2. Equal bandwidth
188
Sub bands result in...
32 separate band-limited time domain signals (really rolls off the tongue)
189
Do sub bands increase data?
Doesn't increase data due to 'polyphase sub-band filter'
190
What effect does the 'polyphase sub-band filter' have?
Down sampling effect
191
What does the 'polyphase sub-band filter' do? (2)
1. Reduces Fs 2. While splitting signal in sub bands
192
What sample frames are subject to frequency analysis?
All frames
193
What does frequency analysis do to sub band content?
Converts content into frequency domain data
194
What does MDCT stand for?
Modified Discrete Cosine Transform
195
How much data does MDCT need to reproduce data compared to FFT?
Half the data
196
In the frequency domain, the masking model level is calculated for...
Each sub-band
197
What does SMR stand for?
Signal to mask ratio
198
In the masking model, frequency domain data can be used to give us what ratio?
Signal to mask ratio
199
What are the stages of the masking model? (5)
1. Masking level calculated for each sub-band 2. Calculation for SMR 3. Bit allocation to sub-bands 4. No. of bits assigned to sub-bands dependent on SMR 5. Bit depth varies across sub-bands due to content
200
What is the encoding process order? (6)
1. Frames 2. Sub-bands 3. Down sampling 4. MDCT 5. Masking and bit allocation 6. Huffman coding
200
What is Huffman coding?
Statistical compression for further data reduction
200
What does Huffman coding represent?
Repeated sequences of data using shorter code eg 11010101 is stored as 01
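A compact sketch of Huffman coding on a toy string, to show how frequent symbols end up with shorter codes. It is a generic textbook construction using Python's heapq, not the specific tables a real audio codec uses; the function name is illustrative.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a {symbol: bit string} table built from symbol frequencies."""
    counts = Counter(data)
    if len(counts) == 1:                       # single-symbol edge case
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie breaker, {symbol: code so far})
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)      # two least frequent groups
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


codes = huffman_codes("aaaabbc")
encoded = "".join(codes[s] for s in "aaaabbc")
print(codes, len(encoded), "bits instead of", 8 * len("aaaabbc"))
```

The most common symbol gets a one-bit code, so the seven characters fit in 10 bits rather than 56.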
201
What does compressed audio contain? (5)
1. Instructions for decoder 2. Samples in MDCT domain at reduced bit depth 3. Bit allocation data 4. Scale factor for each sub-band 5. encoded using Huffman coding
202
What processes do frequency transforms have?
Inverse equivalent processes
203
What does the decoder apply to produce a time domain signal?
An inverse MDCT
204
What is simpler, decoder or encoder?
Decoder
205
Compression artefacts are...
Complex
206
What happens to sub-band data when decoded?
Data is combined
207
What two ways do compression artefacts vary?
1. vary systematically with audio input 2. Vary according to encoding
208
Artefacts increase as bit rate...
Decreases
209
What is technical quality?
Our understanding of good audio quality
210
Audio engineers might want to assess the output of what? (3)
1. Compression algorithms 2. Hardware systems 3. Network codecs
211
What is subjective audio quality assessment?
Listening tests taken by panel of listeners
212
What is objective audio quality assessment?
Analysis of audio signals - based on observational phenomena
213
What are the pros of subjective audio quality assessment? (1)
1. Most accurate results
214
What are the cons of subjective audio quality assessment? (4)
1. Expensive 2. Time consuming 3. Subjective 4. Complex planning
215
What are the pros of objective audio quality assessment? (4)
1. Lower cost 2. Lower complexity 3. Consistent (no listeners) 4. Less time required
216
What are the cons of objective audio quality assessment? (1)
1. It is an estimation of human response
217
In audio quality assessment, what two things are compared?
Original and processed signal
218
In terms of audio quality assessment, less change in signals means...
Better quality
219
Why aren't time domain comparisons helpful?
We aren't sensitive to phase changes
220
Comparing SNR, segmental SNR and total harmonic distortion don't resemble...
Human hearing response to these parameters
221
What does LSD stand for?
Log-squared spectral distance
222
What does LSD produce?
Large values for low power areas on spectrum
223
What's the negative of LSD? (apart from the comedown)
It is too sensitive to spectral changes which are inaudible
224
What are meaningful differences?
Spectral features which characterise signals
225
What are formant peaks?
Cluster of energy around certain points in frequency
226
Formant can help us differentiate what two things?
1. Speech 2. Musical instruments
227
The human hearing range is sensitive to...
Formant peaks
228
What is the minimum change in frequency (%) humans can hear a difference in pitch? (worded that one badly, soz)
3-5%
229
Humans can hear a difference when bandwidth shifts ...-...%
20-40%
230
What does SKL stand for?
Symmetrical Kullback-Leibler Distance
231
What sort of coding does SKL use for a smooth formant based spectrum?
Linear prediction coding
232
SKL uses linear prediction coding to achieve a...
smooth formant based spectrum
233
What does SKL assume?
Formant changes will be perceivable
234
SKL emphasises differences in what two parameters?
1. High magnitudes 2. Low frequencies
235
SKL is less sensitive to...
High frequency shifts
236
What does MFCC stand for?
Mel Frequency Cepstral Coefficients
237
MFCC is a subjective spectrum which reflects...
How we hear sounds
238
What does MFCC use to reflect how we hear sounds?
Psychoacoustical phenomena
239
Cepstrum is equal to...
Inverse FFT of the log FFT of a signal (duh)
240
What is inverse FFT of the log FFT of a signal equal to?
Cepstrum
241
What does cepstrum emphasise?
Pitch content
242
What does MFCC combine? (2)
1. Cepstrum 2. Mel
243
Changes in MFCC are...
Perceivable
244
What is the auditory transform stage?
MFCC
245
What is the gear meshing equation?
Fm = nt * Frg, where nt = no. of teeth and Frg = speed of gear
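A one-line worked example of the gear meshing equation above. It assumes the gear speed is given in revolutions per second, so the result is in Hz; the card does not state the unit.

```python
def gear_mesh_frequency(num_teeth, gear_speed_hz):
    """Fm = nt * Frg (gear speed assumed to be in revolutions per second)."""
    return num_teeth * gear_speed_hz


# A 20-tooth gear turning 25 times per second meshes at 500 Hz
print(gear_mesh_frequency(20, 25))
```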
246
Periodograms help emphasise...
Pitch
247
What are the three stages of the PSD process?
1. Compare signal with itself 2. Take FFT of results 3. Peaks will be produced at frequency of periodic elements
248
What is acoustic ecology?
Environmental sound
249
What does NVH stand for?
Noise, Vibration, Harshness
250
What is an example of active sound design?
Ford Mustang mics up the engine and gives the user the option to switch between sports and comfort mode (changing the volume of the 'engine')
251
Product sound impacts... (3)
1. Perceived quality 2. Purchase 3. Design and manufacture
252
What is cross modal perception? An example?
1. When perception is affected by two or more senses 2. Louder = more powerful
253
Perception is measured by... (3)
1. Loudness 2. Roughness 3. Sharpness
254
What does loudness measure and what is its unit?
Measure of energy across critical bands (Sone)
255
What does roughness measure and what is its unit?
Rapid amplitude fluctuations by interacting sounds (Asper)
256
What does sharpness measure and what is its unit?
Weighting/shape of spectrum (Acum)
257
What two parameters are in response of critical bands?
1. Roughness 2. Sharpness
258
As frequency energy increases, sharpness...
Increases
259
Where does sharpness occur?
In one critical band with concentrated high frequency energy
260
What is the term for low frequency sharpness?
Booming
261
What does CSA stand for?
Category Scaling of Annoyance
262
What is CSA used for?
Measuring annoyance of sound
263
What is the CSA equation?
CSA = 8.07 + ( 0.563 * N5 ) + ( 3.022 * S50 ) + ( 2.175 * R ), where N = Loudness, S = Sharpness, R = Roughness
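A sketch that plugs numbers into the CSA equation above. The input values are made up for illustration, and the units (sone, acum, asper) are assumed from the loudness, sharpness and roughness cards.

```python
def category_scaling_of_annoyance(n5, s50, r):
    """CSA = 8.07 + 0.563 * N5 + 3.022 * S50 + 2.175 * R."""
    return 8.07 + 0.563 * n5 + 3.022 * s50 + 2.175 * r


# Hypothetical values: N5 = 10 sone, S50 = 1 acum, R = 0.5 asper
print(round(category_scaling_of_annoyance(10, 1, 0.5), 4))
```

With these weights, one unit of sharpness moves the score more than one unit of loudness or roughness.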
264
What does MIR stand for?
Music Information Retrieval
265
What is an example of tech that uses MIR?
Melodyne
266
What is the issue with query by humming?
Variation of time and pitch in humming might not be recognised
267
What is the solution to the 'query by humming' issue?
Parsons code
268
What is parsons code?
Codes note changes so that the system recognises C, C#, D as tonic, up, up (sorry if that one's confusing)
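A small sketch of the Parsons code idea, assuming notes arrive as MIDI note numbers and using the usual notation: * for the first note, then u (up), d (down) or r (repeat). The function name is illustrative.

```python
def parsons_code(midi_notes):
    """Encode a melody by its contour only: * then u/d/r per note change."""
    if not midi_notes:
        return ""
    code = "*"                                  # first note is the reference
    for previous, current in zip(midi_notes, midi_notes[1:]):
        if current > previous:
            code += "u"
        elif current < previous:
            code += "d"
        else:
            code += "r"
    return code


# C, C#, D (MIDI 60, 61, 62) becomes tonic, up, up
print(parsons_code([60, 61, 62]))
```

Because only the direction of each change is kept, a hummed query that is out of tune or at the wrong tempo still produces the same code.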
269
What is 'query by example'? Give an example of tech that uses it.
1. Looks for closest match by extracting compact and descriptive set of acoustic features 2. Shazam
270
What are the challenges of 'query by example'? (3)
1. Database has millions of files so data must be compact 2. Fingerprints must be robust enough to ignore noise 3. Process must be efficient
271
What do constellation maps do? (2)
1. Finds local maxima (peaks) 2. Encodes peaks as time and frequency coordinates
272
What would you use if peaks overlap in constellation maps?
Use hashing process
273
What does the hashing process do? (2)
1. Helps identify spectral features unique to music track 2. Speeds up process
274
What three forms of analysis can be used for classification?
1. Audio 2. Metadata 3. Symbolic Data
275
In terms of classification, what are the two benefits of using audio data?
1. Easy to get ahold of 2. Can extract timbre and acoustics easily
276
In terms of classification, what is the con of using audio data?
Hard to precisely identify some features
277
In terms of classification, what is the benefit of using symbolic data?
More detailed
278
In terms of classification, what are the cons of using symbolic data? (2)
1. No acoustic/timbre data 2. Difficult to represent whole song in MIDI
279
What information can be gathered from spectrograms? (4)
1. Timbre 2. Frequency 3. Intensity 4. Rhythmic features
280
What are the two approaches of audio content analysis?
1. Spectrogram 2. Frame-based approach
281
Spectral shape gives us what four parameters? (4)
1. Brightness 2. Centroid 3. Flatness 4. Skewness
282
What is spectral flux?
Change of spectra over time
283
What would you use to identify chords in audio? (2)
1. Spectrogram 2. Pitch histograms
284
How would you identify chords in audio?
Calculate average energy for each note across spectrum
285
What does 'classify by content' do?
Classifies high level content using low level parameters
286
What is the 'classify by content' process? (4)
1. Get audio 2. Group 3. Find ground truths 4. Classify using ground truths
287
When gathering audio for classification, what should the audio be?
Typical to the genre
288
What are three methods of pattern machine learning?
1. KNN 2. GMM 3. SVM
289
What does KNN stand for?
K Nearest Neighbour
290
What is the KNN equation?
Distance = square root of the sum of ( A - B )^2
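A minimal KNN sketch, reading the equation above as the Euclidean distance between two feature vectors A and B, summed over all features. The feature values and genre labels are invented for illustration.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Square root of the sum of squared feature differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, labelled_tracks, k=3):
    """Pick the k smallest-distance tracks and return the majority label."""
    nearest = sorted(
        labelled_tracks, key=lambda item: euclidean_distance(query, item[0])
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature tracks (e.g. brightness, flatness) with genre labels
tracks = [
    ((0.10, 0.20), "rock"), ((0.15, 0.25), "rock"), ((0.20, 0.10), "rock"),
    ((0.90, 0.80), "jazz"), ((0.85, 0.90), "jazz"),
]
print(knn_classify((0.12, 0.22), tracks, k=3))
```

With K = 3, the three closest tracks are chosen and the most common label among them wins.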
291
In terms of KNN, if K = 3, how many smallest distance tracks would you choose?
Three
292
In terms of KNN, as K increases, neighbours should...
Increase
293
In terms of KNN, fewer neighbours can produce...
Clearer boundaries
294
In terms of KNN, the more neighbours there are...
The better the class represents
295
What is the content based problem?
Acoustical properties aren't taken into account so there might be similarities in acoustics rather than music
296
What is the content based problem called?
Glass Ceiling