Waveform Processing Flashcards

(17 cards)

1
Q

Significance of Frame Size

A
  • Frames must be short enough (so there are enough of them) to capture the non-stationary properties of speech rather than smoothing over them
  • BUT cutting a frame out of a signal can introduce information (e.g. abrupt edges) that was never in the original signal
  • SO we must use overlapping frames
2
Q

Overlapping Frames

A
  • frame size N: number of samples per frame (duration NT seconds)
  • frame shift R: number of samples between the starts of successive frames
  • frame rate 1/(RT): frames per second (fps)
  • T = 1/fs is the sample period
  • fs = sample rate
  • overlap = N - R samples

fs = 8000 Hz means 8000 samples for 1 second of speech
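
As a worked example of these relationships, the sketch below computes N, R, the overlap and the frame rate. The 25 ms frame duration and 10 ms frame shift are assumed values chosen for illustration, not figures from the cards, and `frame_parameters` is a hypothetical helper name.

```python
def frame_parameters(fs, frame_duration_s, frame_shift_s):
    """Return (N, R, overlap, frame_rate) for a given sample rate fs in Hz."""
    n = round(frame_duration_s * fs)   # N: samples per frame
    r = round(frame_shift_s * fs)      # R: samples between frame starts
    overlap = n - r                    # samples shared by successive frames
    frame_rate = fs / r                # same as 1/(R*T), because T = 1/fs
    return n, r, overlap, frame_rate


# Assumed example values: 8000 Hz sample rate, 25 ms frames, 10 ms shift
N, R, overlap, fps = frame_parameters(8000, 0.025, 0.010)
print(N, R, overlap, fps)
```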

3
Q

Sampling Rate

A

Number of samples per second in Hz

4
Q

Sample Period

A

1/sampling rate

Seconds per sample

5
Q

Frame Length

A
  • No. of samples per frame
  • N = (frame duration in seconds) / T = frame duration x fs
6
Q

Frame Shift

A
  • No. of samples between starts of successive frames (R)
  • R = 1/(fr x T) = fs/fr, where fr is the frame rate
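
To make frame size and frame shift concrete, here is a minimal sketch that cuts a signal into overlapping frames. The name `split_frames` is hypothetical, and dropping any trailing samples that do not fill a whole frame is an assumption (padding is another common choice).

```python
def split_frames(signal, n, r):
    """Cut `signal` into frames of n samples whose starts are r samples apart.

    Assumption: trailing samples that cannot fill a complete frame are dropped.
    """
    return [signal[start:start + n]
            for start in range(0, len(signal) - n + 1, r)]


# Toy signal of 10 samples, frame size N = 4, frame shift R = 2 (overlap of 2)
frames = split_frames(list(range(10)), 4, 2)
for frame in frames:
    print(frame)
```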
7
Q

Windowing Purpose

A
  • Reduce edge discontinuities from framing, ensuring smooth transitions between frames
8
Q

Window Functions

A
  • Multiply each signal frame with a window function
  • Most common are raised cosine window functions, e.g. Hamming and Hann (often called Hanning)
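
A minimal sketch of windowing one frame, using the standard symmetric Hamming definition w[n] = 0.54 - 0.46 cos(2 pi n / (N - 1)). The helper names `hamming` and `apply_window` are hypothetical.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(frame):
    """Multiply each sample of the frame by the matching window value."""
    window = hamming(len(frame))
    return [s * w for s, w in zip(frame, window)]


# A constant frame shows the taper: the edges shrink, the centre is unchanged
windowed = apply_window([1.0] * 5)
print(windowed)
```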
9
Q

Energy

A
  • Large in voiced speech e.g. open vowel sounds
  • Short-term energy is computed per frame (block)
  • E = sum of s[i]^2 over the samples in the frame (squared sample values)
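
The energy formula can be sketched in a couple of lines; `short_term_energy` is a hypothetical name and the frame values are made up for illustration.

```python
def short_term_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(s * s for s in frame)


energy = short_term_energy([1, 2, 3])   # 1 + 4 + 9
print(energy)
```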
10
Q

Normalised Energy

A
  • E = (1/N) x sum of squared sample values

Removes the dependence on the number of samples in the analysis frame
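
A small sketch of why the 1/N factor matters: two frames of the same constant amplitude but different lengths have different raw energies and the same normalised energy. The frame lengths and amplitude are assumed values, and `normalised_energy` is a hypothetical name.

```python
def normalised_energy(frame):
    """Mean of squared sample values: (1/N) * sum(s[i]^2)."""
    return sum(s * s for s in frame) / len(frame)


short_frame = [0.5] * 100
long_frame = [0.5] * 400

raw_short = sum(s * s for s in short_frame)   # grows with frame length
raw_long = sum(s * s for s in long_frame)
print(raw_short, raw_long)
print(normalised_energy(short_frame), normalised_energy(long_frame))
```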

11
Q

Zero-Crossing Rate

A
  • Large in unvoiced sounds e.g. p, ch
  • Number of times the signal crosses the x-axis (changes sign) within a frame
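
Counting zero crossings can be sketched as counting sign changes between neighbouring samples. Treating a sample of exactly zero as positive is an assumption of this sketch, and `zero_crossings` is a hypothetical name.

```python
def zero_crossings(frame):
    """Count sign changes between successive samples (0 is treated as positive)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


print(zero_crossings([1, -1, 1, -1]))   # rapidly alternating, noise-like
print(zero_crossings([1, 2, 3, 2]))     # never changes sign
```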
12
Q

Speech/Non-speech detection

A

Simple version can be constructed using short-time energy (high in voiced speech) and zero-crossing rate (high in unvoiced speech)
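
A minimal sketch of such a detector follows. The three thresholds are hypothetical values picked for the synthetic frames below, not tuned figures, and `classify_frame` is an illustrative name; a real detector would need thresholds set from data.

```python
import math


def classify_frame(frame, energy_thresh=0.01, zcr_thresh=0.3, energy_floor=1e-4):
    """Label a frame using normalised energy and zero-crossing rate.

    All thresholds are assumed values for illustration only.
    """
    n = len(frame)
    energy = sum(s * s for s in frame) / n
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1)            # crossings per sample interval
    if energy > energy_thresh:
        return "voiced"                  # high energy
    if zcr > zcr_thresh and energy > energy_floor:
        return "unvoiced"                # low energy but many crossings
    return "non-speech"


fs = 8000
vowel_like = [0.5 * math.sin(2 * math.pi * 100 * i / fs) for i in range(200)]
fricative_like = [0.05 if i % 2 == 0 else -0.05 for i in range(200)]
silence = [0.0] * 200
print(classify_frame(vowel_like), classify_frame(fricative_like), classify_frame(silence))
```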

13
Q

Auto-Correlation

A
  • Correlation with itself
  • Emphasises periodicity; pitch is assumed constant within the analysis frame
  • Finds repeating patterns
  • Basis for pitch detectors
  • Expensive to compute directly, as an inner loop over the samples runs for every lag
14
Q

Auto-Correlation Function

A

r[k] = (1/N) x sum over i = 0 .. N-1-k of (s[i] * s[i+k]), where k is the lag in samples
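
The formula can be sketched directly, together with a simple pitch estimate that picks the lag with the largest r[k]. The 50 to 400 Hz search range, the 200 Hz test tone and the names `autocorrelation` and `autocorr_pitch` are assumptions for illustration.

```python
import math


def autocorrelation(frame, k):
    """r[k] = (1/N) * sum of s[i] * s[i+k] over the samples where both exist."""
    n = len(frame)
    return sum(frame[i] * frame[i + k] for i in range(n - k)) / n


def autocorr_pitch(frame, fs, f_min=50.0, f_max=400.0):
    """Return the pitch (Hz) whose lag maximises r[k] in the assumed range."""
    lag_lo = int(fs / f_max)             # shortest period considered
    lag_hi = int(fs / f_min)             # longest period considered
    best_lag = max(range(lag_lo, lag_hi + 1),
                   key=lambda k: autocorrelation(frame, k))
    return fs / best_lag


fs = 8000
frame = [math.sin(2 * math.pi * 200 * i / fs) for i in range(240)]
pitch = autocorr_pitch(frame, fs)
print(pitch)
```

The loop over i inside `autocorrelation` runs once for every candidate lag, which is the cost the previous card refers to.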

15
Q

Cepstral Analysis

A
  • For separating vocal tract filter response from excitation spectrum (source)
  • Frequency spectrum = vocal tract frequency response x excitation spectrum
  • Taking the log turns this product into a sum, so the two components can be separated and used individually
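
The separation can be written out as a short derivation. The symbols S, H and E (speech spectrum, vocal tract response, excitation spectrum) and c[n] for the cepstrum are notation assumed here, not taken from the cards.

```latex
\begin{align}
  S(\omega) &= H(\omega)\,E(\omega) \\
  \log\lvert S(\omega)\rvert &= \log\lvert H(\omega)\rvert + \log\lvert E(\omega)\rvert \\
  c[n] &= \mathcal{F}^{-1}\bigl\{\log\lvert S(\omega)\rvert\bigr\}
\end{align}
```

Because the inverse transform is linear, the slowly varying vocal tract term ends up in the low-quefrency part of c[n] and the periodic excitation term in the high-quefrency part.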
16
Q

Pitch Estimation with Cepstrum

A

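Cepstrum: take the log of the magnitude spectrum, then apply an inverse Fourier transform

The result lies in the quefrency domain, a pseudo-time axis

Quefrency graph: find the peak cepstral value among the high-quefrency components; its position gives the pitch period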
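
These steps can be sketched with a direct (slow) DFT so that no external library is needed. The synthetic frame, the 125 to 400 Hz search range, the small constant `eps` inside the log, and the names `real_cepstrum` and `cepstral_pitch` are all assumptions for illustration; the search range is deliberately narrow so that only the first cepstral peak falls inside it.

```python
import cmath
import math


def real_cepstrum(frame, eps=1e-10):
    """Inverse DFT of the log magnitude spectrum (direct O(N^2) computation)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(frame))
        log_mag.append(math.log(abs(x_k) + eps))   # eps avoids log(0)
    # Real part of the inverse DFT: cosine terms only
    return [sum(v * math.cos(2 * math.pi * k * q / n)
                for k, v in enumerate(log_mag)) / n
            for q in range(n)]


def cepstral_pitch(frame, fs, f_min=125.0, f_max=400.0):
    """Pitch (Hz) from the largest cepstral value in the assumed quefrency range."""
    ceps = real_cepstrum(frame)
    q_lo = int(fs / f_max)
    q_hi = int(fs / f_min)
    peak_q = max(range(q_lo, q_hi + 1), key=lambda q: ceps[q])
    return fs / peak_q


fs = 8000
# Synthetic voiced-like frame: a decaying pulse repeated every 40 samples (200 Hz)
frame = [0.9 ** (i % 40) for i in range(240)]
pitch = cepstral_pitch(frame, fs)
print(pitch)
```

In practice an FFT would replace the direct DFT, and a wider pitch range would need some handling of the repeated peaks at multiples of the pitch period.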

17
Q

Frame Rate