Is it possible to limit the input audio frequency range for MFCC and NN model?

I have audio samples that have a lot of noise on the low end:

The above shows an example of a footstep audio clip. I would like to run the MFCC process excluding the lower frequencies - for example, with the above audio sample, since the noise occurs up to around 300 Hz, I'd want to exclude audio at 300 Hz and below.

Ideally, the model would then be trained on the results of MFCC from those frequencies, and inference would only run on audio in those frequencies.

Is it possible with Edge Impulse to specify which audio frequencies to include/exclude?

Hi @ovedan, good idea actually. We naturally have these options already but don’t expose them at the moment (see here in the inferencing SDK for example). Let me see if we can add them to the UI without too much work.
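For context, the knobs in question are the low/high band edges of the Mel filterbank. Here's a minimal sketch of the same idea using librosa rather than the Edge Impulse SDK itself (the filename is a placeholder; `fmin`/`fmax` stand in for the band-edge options mentioned above):

```python
import librosa

# Load a mono clip at 16 kHz (path is a placeholder).
y, sr = librosa.load("footstep.wav", sr=16000, mono=True)

# fmin/fmax bound the Mel filterbank: energy below fmin or above fmax
# never enters the filters, so it cannot reach the MFCCs.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            fmin=300.0,   # drop the noisy low end
                            fmax=sr / 2)  # up to Nyquist (8 kHz here)
print(mfcc.shape)  # (13, n_frames)
```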

edit: It’s actually not entirely clear whether this would solve your problem or whether you’d rather have a bandpass filter on the raw data. We’re releasing custom DSP blocks to the world sometime this month, which would allow you to plug arbitrary filters in beforehand as well.
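A bandpass filter on the raw signal, as a custom pre-processing step, could look something like this (a scipy sketch assuming 16 kHz mono audio; the cutoffs are illustrative):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(signal, sr, low_hz, high_hz, order=4):
    """Zero-phase Butterworth bandpass over [low_hz, high_hz]."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=sr, output="sos")
    return sosfiltfilt(sos, signal)

# Example: keep roughly 300 Hz - 8 kHz, removing the low-frequency noise.
# (High cutoff kept just under Nyquist to stay valid for the filter design.)
sr = 16000
t = np.arange(sr) / sr
noisy = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
clean = bandpass(noisy, sr, low_hz=300, high_hz=7900)
```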

OK, that was easier than expected. Hope to release this sometime this week:

FYI, digging through the code, we actually already filter out the bottom 300 Hz, so you should be able to proceed without any changes:

Will push the changes regardless, so these limits will be flexible.

Yes, a bandpass filter would be great, and it's awesome you already got something like that working! It would also help if the MFCC page had some documentation on what each parameter does.

Are you saying that it already filters out things lower than 300 Hz, or is that something you'll do for the next release?

Also, regarding using the filters in the inference SDK - does this mean filters are only applied at inference time? What happens then if the model is trained on the entire spectrum?

Are you saying that it already filters out things lower than 300 Hz, or is that something you'll do for the next release?

The current behavior is that the band edges for the Mel filters are 300 Hz to sample rate / 2 (so 8,000 Hz if the sample rate is 16 kHz), so my feeling is that noise under 300 Hz won't show up. This is the default behavior both in the studio and in the inferencing SDK. In the next release (update: released now) we make these edges configurable in the UI.
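To make those defaults concrete: with the lower edge at 300 Hz, the Mel filterbank assigns zero weight to FFT bins below 300 Hz, which is why the low-end noise should already be suppressed. A small librosa check, assuming a 16 kHz sample rate (not the studio's exact implementation, just the same construction):

```python
import librosa

sr, n_fft = 16000, 512
fmax = sr / 2  # Nyquist = 8,000 Hz

# Build a filterbank with the default edges discussed above.
fb = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=40,
                         fmin=300.0, fmax=fmax)

# FFT bin centre frequencies, and the total filter weight per bin.
freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
below_300 = freqs < 300.0
print(fb[:, below_300].sum())  # 0.0: bins under 300 Hz carry no weight
```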

Also, regarding using the filters in the inference SDK - does this mean filters are only applied at inference time? What happens then if the model is trained on the entire spectrum?

Unfortunate word choice on my end! The inferencing SDK and the studio behave exactly the same, so every window that we process in the studio has the same filter applied.

I see this feature has been deployed. Thank you!

I find this feature useful. It would be great to be able to hear what the sound is like with the bandpass filter applied.
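In the meantime, one way to audition the effect offline is to apply an equivalent filter yourself and write the result to a file for listening (a sketch using scipy and soundfile; filenames are placeholders, and the cutoffs assume a 16 kHz file):

```python
import soundfile as sf
from scipy.signal import butter, sosfiltfilt

# Read, bandpass to the band of interest, and write out for listening.
signal, sr = sf.read("sample.wav")
sos = butter(4, [300, 7900], btype="bandpass", fs=sr, output="sos")
filtered = sosfiltfilt(sos, signal, axis=0)  # filter along the time axis
sf.write("sample_bandpassed.wav", filtered, sr)
```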

Maybe you can help debug my issue. In the spectrogram of my DAW, the audio signal of the sound is clear:

You can see from the picture that the sound occurs between approximately 3-3.5 kHz.

I apply this as the bandpass filter in the MFCC step:

Yet that audio signal is not visible at all in the resulting image.

What am I doing wrong? Do I need to amplify the signal since it is weak? I'm basically trying to classify a distant bird-chirping sound.

It’s not really a bandpass filter there, but rather the lower and upper edges of the MFCC filter banks. I think these need more of the signal to work with than 500 Hz. We’re releasing custom blocks sometime this week, which would allow you to plug in a proper bandpass filter quickly.
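As a rough offline experiment for the bird-chirp case: give the Mel filters more bandwidth than the chirp itself rather than clipping tightly to 3-3.5 kHz, and normalize the weak signal before feature extraction (a librosa sketch; the 2-6 kHz band and the peak normalization are illustrative choices, not Edge Impulse defaults):

```python
import librosa
import numpy as np

y, sr = librosa.load("distant_chirp.wav", sr=16000, mono=True)

# Peak-normalize so a quiet, distant signal uses the full dynamic range.
y = y / (np.max(np.abs(y)) + 1e-9)

# A 2-6 kHz band instead of a tight 3-3.5 kHz one, so several Mel
# filters carry energy from the chirp rather than just one or two.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, n_mels=32,
                            fmin=2000.0, fmax=6000.0)
```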