gnes.preprocessor.audio.vggish_example module
-
class gnes.preprocessor.audio.vggish_example.VggishPreprocessor(num_frames=96, num_bands=64, sample_rate=16000, log_offset=0.01, example_window_seconds=0.96, example_hop_seconds=0.96, stft_window_length_seconds=0.025, stft_hop_length_seconds=0.01, mel_min_hz=125, mel_max_hz=7500, *args, **kwargs)
Bases: gnes.preprocessor.base.BaseAudioPreprocessor
-
train(*args, **kwargs)
Train the model; needs to be overridden.
-
waveform_to_examples(data, sample_rate)
Converts an audio waveform into an array of examples for VGGish.
- Args:
- data: np.array of either one dimension (mono) or two dimensions (multi-channel, with the outer dimension representing channels). Each sample is generally expected to lie in the range [-1.0, +1.0], although this is not required.
- sample_rate: Sample rate of data.
- Returns:
- 3-D np.array of shape [num_examples, num_frames, num_bands] which represents a sequence of examples. Each example contains a patch of log mel spectrogram covering num_frames frames of audio and num_bands mel frequency bands, where the frame length is vggish_params.STFT_HOP_LENGTH_SECONDS (0.01 s with the default stft_hop_length_seconds).
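To make the returned shape concrete, the following is a minimal sketch that estimates [num_examples, num_frames, num_bands] from the constructor defaults listed above. The helper names `count_frames` and `expected_examples_shape` are hypothetical and not part of GNES. The sketch assumes the framing rule of the reference VGGish code (no padding, incomplete trailing frames dropped) and that the waveform has already been resampled to sample_rate; the actual implementation may differ.

```python
def count_frames(num_items, window, hop):
    """Count complete windows of size `window` taken every `hop` items.

    Assumption: no padding, so a trailing partial window is dropped.
    """
    if num_items < window:
        return 0
    return 1 + (num_items - window) // hop


def expected_examples_shape(num_samples,
                            sample_rate=16000,
                            stft_window_length_seconds=0.025,
                            stft_hop_length_seconds=0.01,
                            example_window_seconds=0.96,
                            example_hop_seconds=0.96,
                            num_bands=64):
    """Estimate (num_examples, num_frames, num_bands) for a mono waveform.

    Hypothetical helper: it only mirrors the arithmetic implied by the
    documented defaults and does not call into GNES.
    """
    # STFT window and hop, converted from seconds to samples.
    stft_window = int(round(sample_rate * stft_window_length_seconds))
    stft_hop = int(round(sample_rate * stft_hop_length_seconds))
    spectrogram_frames = count_frames(num_samples, stft_window, stft_hop)

    # Spectrogram frames per second, then example window and hop in frames.
    features_rate = 1.0 / stft_hop_length_seconds
    example_window = int(round(example_window_seconds * features_rate))
    example_hop = int(round(example_hop_seconds * features_rate))
    num_examples = count_frames(spectrogram_frames, example_window, example_hop)

    return (num_examples, example_window, num_bands)


if __name__ == "__main__":
    # Ten seconds of 16 kHz audio under the default parameters.
    print(expected_examples_shape(10 * 16000))
```

Under these assumptions, a 0.96 s example window at 100 spectrogram frames per second gives the default num_frames=96, and because the example hop equals the example window, consecutive examples do not overlap.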
-