Speech Recognition on ESP32 with I2S microphone (INMP441)

Check the pin configuration on your end; you need to edit that part to match your wiring.

Also, the ESP32 has memory allocation limits, so keep that in mind when training models in Edge Impulse.

If I remember correctly, the limit is around 52920 raw samples. Don't go beyond that.
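
To put that number in perspective, here is a rough sketch of the arithmetic, assuming the samples are stored as 16-bit PCM. The 52920 figure is the one quoted from memory above, so treat it as approximate; the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sample budget quoted from memory in the post above; treat it as approximate.
const std::size_t kMaxRawSamples = 52920;

// Bytes needed to hold `samples` raw values as 16-bit PCM (2 bytes each).
inline std::size_t bufferBytes(std::size_t samples) {
    return samples * sizeof(std::int16_t);
}

// Longest window, in whole milliseconds, that stays within the sample budget.
inline std::size_t maxWindowMs(std::size_t sampleRateHz) {
    assert(sampleRateHz > 0);
    return kMaxRawSamples * 1000 / sampleRateHz;
}
```

With these numbers the full budget is 105840 bytes as int16, which works out to a window of at most 6615 ms at 8 kHz or 3307 ms at 16 kHz.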

The accuracy of the model in Edge Impulse and in real-world testing is really different.

You need to test, test, and test again to get the result you want here. There's no need to modify much in the code; you need to keep retraining the model to get the result you want in real-world applications.


With 5 kHz, I was able to get 95% accuracy in model testing, but the real-time classification was way off. I think my code was not handling the buffer correctly.
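
One place buffer handling often goes wrong with the INMP441 is the sample format: the mic sends 24-bit samples inside 32-bit I2S slots, while the classifier typically expects 16-bit PCM. Below is a minimal sketch of that conversion, assuming the I2S driver is configured for 32-bit words with the sample left-aligned. The function name is hypothetical and this may not be the actual bug in the code discussed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convert raw 32-bit I2S words to 16-bit PCM.
// Assumption: each word carries a 24-bit sample left-aligned (top bits valid),
// so keeping the upper 16 bits preserves the most significant part.
inline void i2sWordsToPcm16(const std::int32_t* in, std::int16_t* out, std::size_t count) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        // Right shift of a signed value is arithmetic on GCC/Clang
        // (guaranteed from C++20), so the sign is kept and the result fits.
        out[i] = static_cast<std::int16_t>(in[i] >> 16);
    }
}
```

If the words are instead copied straight into an int16 buffer, every other value is the low half of a word, and the classifier sees mostly noise even though the model itself is fine.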

Use two processing blocks: MFE and MFCC at the same time. This increased real-world accuracy for me.

I did use 2 processing blocks. Again, I’ll repeat what I’ve said earlier… I have a great impulse model. When I test it with my phone’s speaker it works flawlessly. It works like crap on my ESP32 with its microphone. I seem to be having a hard time making that point.

Hi @se732525

Can you share the DSP configuration from the Signal Window page and your create impulse page?

Downsample Training Data

  • Train your model on 5 kHz or 8 kHz downsampled versions of your data (match what the ESP32 can actually capture).
  • This helps the model learn in the same spectral context the device will use.

e.g.
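
A minimal sketch of integer-factor downsampling, assuming 16-bit mono samples recorded at 16 kHz and a factor of 2 to reach 8 kHz. The source rate and the function name are assumptions for illustration. The block averaging is only a crude low-pass filter, and 5 kHz is not an integer division of 16 kHz, so a proper resampler is the better choice for real training data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Downsample by an integer factor, averaging each group of `factor` samples.
// The averaging acts as a crude low-pass filter before dropping samples.
inline std::vector<std::int16_t> decimate(const std::vector<std::int16_t>& in,
                                          std::size_t factor) {
    assert(factor > 0);
    std::vector<std::int16_t> out;
    out.reserve(in.size() / factor);
    // Trailing samples that do not fill a whole group are dropped.
    for (std::size_t i = 0; i + factor <= in.size(); i += factor) {
        std::int32_t sum = 0;
        for (std::size_t j = 0; j < factor; ++j) sum += in[i + j];
        out.push_back(static_cast<std::int16_t>(sum / static_cast<std::int32_t>(factor)));
    }
    return out;
}
```

Running the same routine on the training clips and on the device keeps both in the same spectral context.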


Did you record a sample with the ESP32 mic to confirm the WAV is collected as expected? Try capturing a raw sample to test with.
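
To listen to a raw capture on a PC, the samples need a WAV header in front of them. Here is a rough sketch that builds the standard 44-byte header for mono 16-bit PCM. The helper names are hypothetical, and how the bytes get off the device (serial, SD card, etc.) is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a 4-character chunk tag such as "RIFF".
static void putTag(std::vector<std::uint8_t>& b, const char* tag) {
    b.insert(b.end(), tag, tag + 4);
}

// Append the low `bytes` bytes of `v` in little-endian order.
static void putLE(std::vector<std::uint8_t>& b, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i) b.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Build the canonical 44-byte WAV header for mono 16-bit PCM.
inline std::vector<std::uint8_t> wavHeader(std::uint32_t sampleRateHz,
                                           std::uint32_t numSamples) {
    const std::uint32_t dataBytes = numSamples * 2;  // 2 bytes per sample
    std::vector<std::uint8_t> h;
    putTag(h, "RIFF");
    putLE(h, 36 + dataBytes, 4);    // remaining file size after this field
    putTag(h, "WAVE");
    putTag(h, "fmt ");
    putLE(h, 16, 4);                // fmt chunk size
    putLE(h, 1, 2);                 // audio format: PCM
    putLE(h, 1, 2);                 // channels: mono
    putLE(h, sampleRateHz, 4);
    putLE(h, sampleRateHz * 2, 4);  // byte rate
    putLE(h, 2, 2);                 // block align
    putLE(h, 16, 2);                // bits per sample
    putTag(h, "data");
    putLE(h, dataBytes, 4);
    assert(h.size() == 44);
    return h;
}
```

Write the header, then the int16 samples, and open the file in any audio editor. If it sounds distorted or silent there, the problem is in the capture path rather than the model.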

Improve DSP Configuration

  • Use MFE or MFCC, not both, because there is not enough memory for both.
  • MFCC config to try:
    • Frame length: 0.02
    • Frame stride: 0.01
    • FFT length: 512
    • Number of coefficients: 13
    • Filter count: 32
    • Low freq: 300, High freq: 3800
    • Pre-emphasis: 0.95
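
A quick way to sanity-check these settings against a sample rate is sketched below. It uses the textbook framing formula, so the frame count Edge Impulse reports may differ slightly. The struct and function names are made up, and the 1-second window in the note after the code is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Values from the MFCC list above.
struct MfccConfig {
    double frameLengthS = 0.02;
    double frameStrideS = 0.01;
    std::size_t fftLength = 512;
    std::size_t numCoefficients = 13;
    double lowFreqHz = 300;
    double highFreqHz = 3800;
};

// Convert a duration in seconds to a whole number of samples.
inline std::size_t toSamples(double seconds, std::size_t sampleRateHz) {
    return static_cast<std::size_t>(std::lround(seconds * sampleRateHz));
}

// Textbook framing count: floor((N - frame) / stride) + 1.
inline std::size_t frameCount(const MfccConfig& c, std::size_t sampleRateHz,
                              std::size_t windowSamples) {
    const std::size_t frame = toSamples(c.frameLengthS, sampleRateHz);
    const std::size_t stride = toSamples(c.frameStrideS, sampleRateHz);
    assert(stride > 0);
    if (windowSamples < frame) return 0;
    return (windowSamples - frame) / stride + 1;
}

// The band must sit at or below Nyquist, and a frame should fit in the FFT.
inline bool isValid(const MfccConfig& c, std::size_t sampleRateHz) {
    return c.lowFreqHz < c.highFreqHz &&
           c.highFreqHz <= sampleRateHz / 2.0 &&
           toSamples(c.frameLengthS, sampleRateHz) <= c.fftLength;
}
```

At 8 kHz a frame is 160 samples and the stride is 80, so a 1-second window gives 99 frames, or 1287 features with 13 coefficients. At 5 kHz the check fails, because the 3800 Hz high cutoff is above the 2500 Hz Nyquist limit.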

Best

Eoin