Keyword Spotting on ESP32

There is a video on Shawn Hymel’s YouTube channel that demonstrates keyword spotting on the Arduino Nano 33 BLE Sense. I am doing a project where music is played through Bluetooth, so I can’t use the Arduino Nano 33 BLE Sense, as it only has BLE. I have decided to go with an ESP32 and an INMP441 MEMS microphone.
I figured out that the PDM.h library in the continuous_speech_recognition example code is for the built-in mic on the Arduino Nano 33 BLE Sense.
How can I replace those lines of code with the i2s.h library, since I am using an I2S microphone?
And what other lines should I add or replace to interface with the microphone? I am a beginner and really new to microcontrollers, so it would be really helpful for me to learn.
Thank you

Hi @aviator2710,

You can check this example project using the same microphone: GitHub - happychriss/edgeML_esp32_audio_sampling: Continuous speech recognition with ESP32 for numbers (0--9) using a 1D CNN / MFCC, ESP32 (Lolin D32 Pro + INMP441 I2S Microphone) using EdgeImpulse Framework
The project uses PlatformIO, but it should not be an issue to adapt it to the Arduino IDE.


Thanks for your reply. The GitHub project you linked uses Wi-Fi for speech recognition. I plan to do it locally on the ESP32 without Wi-Fi. I saw another post on this forum with the same problem as mine. Here’s the link.

But I don’t understand what changes that person made to the Arduino Nano code. They say that everything is the same except the callback, and I don’t know what that is.
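
For context, "callback" here most likely means a function that is handed to a library so the library can call it later. In the Edge Impulse audio examples there are typically two: one that the PDM library calls when new samples arrive (this is the part that would change for I2S), and a get_data function that the classifier calls to fetch samples as floats. Below is a minimal sketch of the second kind. It is only an illustration: AudioSignal is a stand-in defined here so the code compiles on its own, and sample_buffer, kSliceSize, microphone_get_data and make_signal are hypothetical names. The real example uses the SDK's own signal type and conversion helper, which may scale the values differently from the plain cast used here.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Stand-in for the SDK's signal type (assumption: the real type has
// a total length and a get_data callback of roughly this shape).
struct AudioSignal {
    size_t total_length;
    std::function<int(size_t offset, size_t length, float *out_ptr)> get_data;
};

// Hypothetical capture buffer. On the ESP32 this would be filled from I2S.
static const size_t kSliceSize = 8;
static int16_t sample_buffer[kSliceSize] = {0, 1, -1, 100, -100, 32767, -32768, 5};

// The callback: copy `length` samples starting at `offset` into out_ptr,
// converting int16 to float. Returns 0 on success, -1 if out of range.
static int microphone_get_data(size_t offset, size_t length, float *out_ptr) {
    if (offset + length > kSliceSize) {
        return -1;
    }
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = static_cast<float>(sample_buffer[offset + i]);
    }
    return 0;
}

// Wire the callback into the signal; the classifier would later call
// signal.get_data(...) whenever it needs a chunk of audio.
static AudioSignal make_signal() {
    AudioSignal s;
    s.total_length = kSliceSize;
    s.get_data = &microphone_get_data;
    return s;
}
```

The point is that the classifier never touches the microphone directly. It only calls get_data, so swapping PDM for I2S should mainly affect the code that fills the buffer, not this function.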

Hi @aviator2710,

Who is this Shawn Hymel character? :wink:

In my experience, getting PDM or I2S working is a pain (without the help of a library) and requires some fairly advanced knowledge about the processor you’re working with.

I recommend starting with a simple example (e.g. GitHub - atomic14/esp32-i2s-mic-test: The Simplest Test Code for an I2S Microphone on the ESP32 I can Imagine) to learn how to manipulate the registers in the ESP32 to get I2S working for microphones. Adjust that code as necessary to get it working with your particular microphone.
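
One adjustment that usually comes up with the INMP441: it sends 24-bit samples left-justified in 32-bit I2S words, while keyword-spotting models are typically fed 16-bit audio. A minimal sketch of that conversion follows, assuming the raw words have already been read into an int32_t array (for example with the ESP-IDF I2S driver's read call). The function names and the gain_shift parameter are hypothetical, and the right amount of gain depends on your setup.

```cpp
#include <cstddef>
#include <cstdint>

// Convert one raw 32-bit I2S word to a 16-bit sample.
// gain_shift (assumed 0..15) applies a gain of 2^gain_shift; the result
// is clamped so loud input saturates instead of wrapping around.
// Assumes >> on a negative value is an arithmetic shift (true for GCC,
// and guaranteed by the language since C++20).
static int16_t i2s_word_to_int16(int32_t raw, int gain_shift = 0) {
    int32_t v = raw >> (16 - gain_shift);
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    return static_cast<int16_t>(v);
}

// Convert a whole block of raw words into the int16 buffer
// that the classifier's sample buffer would be filled from.
static void convert_block(const int32_t *raw, int16_t *out, size_t n, int gain_shift = 0) {
    for (size_t i = 0; i < n; i++) {
        out[i] = i2s_word_to_int16(raw[i], gain_shift);
    }
}
```

With gain_shift = 0 this keeps the top 16 bits of each word. If the recorded audio turns out too quiet, a small gain_shift such as 2 is a common thing to try.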

Then, to do continuous keyword spotting, you’ll need to incorporate DMA, which will pipe data from your I2S peripheral to memory directly without CPU intervention. This video might help illustrate I2S and DMA working together on the ESP32: ESP32 Audio DMA Settings Explained - dma_buf_len and dma_buf_count - YouTube.
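
To get a feel for the two DMA settings named in that video title, it helps to work out how much memory and how much audio time a given choice represents. The sketch below is only arithmetic, assuming dma_buf_len is counted in samples per channel; the DmaConfig struct and helper names are hypothetical and not part of any driver.

```cpp
#include <cstdint>

// Hypothetical description of an I2S DMA setup.
struct DmaConfig {
    uint32_t sample_rate_hz;    // e.g. 16000 for keyword spotting
    uint32_t buf_count;         // number of DMA buffers (dma_buf_count)
    uint32_t buf_len;           // samples per buffer (dma_buf_len)
    uint32_t bytes_per_sample;  // 4 when reading 32-bit words
    uint32_t channels;          // 1 for a single mono microphone
};

// Bytes of RAM used by one DMA buffer.
constexpr uint32_t buffer_bytes(const DmaConfig &c) {
    return c.buf_len * c.bytes_per_sample * c.channels;
}

// Bytes of RAM used by all DMA buffers together.
constexpr uint32_t total_bytes(const DmaConfig &c) {
    return buffer_bytes(c) * c.buf_count;
}

// Microseconds of audio held by one buffer, i.e. how often a buffer fills.
constexpr uint64_t buffer_duration_us(const DmaConfig &c) {
    return static_cast<uint64_t>(c.buf_len) * 1000000ULL / c.sample_rate_hz;
}

// Microseconds of audio the buffers can hold before data is lost
// if the CPU stops reading.
constexpr uint64_t total_duration_us(const DmaConfig &c) {
    return buffer_duration_us(c) * c.buf_count;
}
```

For example, 4 buffers of 512 samples at 16 kHz with 32-bit mono words come to 8192 bytes and about 128 ms of slack: each buffer fills every 32 ms, and the code reading them can fall behind by a few buffers before samples are dropped.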

From there, you’ll need to use the run_classifier_continuous() function to perform your continuous keyword spotting. I have a few examples that do that here (ei-keyword-spotting/embedded-demos at master · ShawnHymel/ei-keyword-spotting · GitHub), but nothing for the ESP32. I have not worked with I2S or DMA in the ESP32, so my knowledge of getting them to work with the C++ SDK library on the ESP32 is quite limited.
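
Continuous classification generally works on short slices of audio: while one slice is being classified, the next one is already being recorded. A common way to arrange that is a pair of "ping-pong" buffers. The sketch below shows the idea in plain C++, with hypothetical names (PingPong, push, take). It is not taken from the linked demos, and on real hardware the producer and consumer would run in different tasks, so the shared fields would need proper synchronization.

```cpp
#include <cstddef>
#include <cstdint>

// Two slice-sized buffers: one is filled by the microphone side while
// the other is handed to the classifier.
template <size_t N>
struct PingPong {
    int16_t buffers[2][N];
    uint8_t write_index = 0;  // which buffer is currently being filled
    size_t count = 0;         // samples written into that buffer so far
    bool ready = false;       // true when a full slice is waiting

    // Producer side: add one sample. Returns false if a slice was
    // completed while the previous one had not been taken yet (overrun).
    bool push(int16_t sample) {
        buffers[write_index][count++] = sample;
        if (count < N) {
            return true;
        }
        bool overrun = ready;
        write_index ^= 1;  // start filling the other buffer
        count = 0;
        ready = true;
        return !overrun;
    }

    // Consumer side: returns the most recently completed slice,
    // or nullptr if none is waiting.
    const int16_t *take() {
        if (!ready) {
            return nullptr;
        }
        ready = false;
        return buffers[write_index ^ 1];
    }
};
```

In a real sketch, the main loop would wait until take() returns a slice, point the signal's get_data callback at it, and then call run_classifier_continuous(). An overrun means classification is taking longer than one slice of audio, so either the slice needs to be longer or the model smaller.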


I’m really happy and surprised to see you reply to my issue!!!
Thank you so much for taking the time to help me out. I will check the links you mentioned and get back to you if I face any issues.

This project mostly aligns with mine, but it uses Wi-Fi to process the captured data. Do you have any idea how I can remove that part of the code and do it locally on the ESP32?