Speech Recognition on ESP32 with I2S microphone (INMP441)

I cannot make my microphone work with my EdgeImpulse project. I have created my project and downloaded it using the Espressif ESP-EYE target. My board is an ESP-32 Dev Kit and the project runs fine with it (except for a few compiler warnings). I have tested the model by using features that I received from the MFCC raw data and it works great! I’m confident that the /edgeimpuse folder is in great shape.

Now on to the problem. Data from my I2S microphone (INMP441) gets collected, but the downloaded model either fails on it or produces false positives. I have tested the I2S connections by creating a simple program using the same settings as are used in my model. The i2s_pin_config_t and i2s_config_t settings check out and the microphone returns my voice just fine. I’m pretty confident this is not a pin connection or I2S configuration issue.

My question is very simple: has anyone found a solution that actually gets an ESP32 using an external I2S microphone to work? I have found a few 3-4 year old projects online that claim to do this with TensorFlow, but not with EdgeImpulse. I have gone to the extreme of taking the microphone code from these projects and using it in my project, but no luck. Why is this last step in the process such a huge problem?

Thank you for your thoughts in advance.

Hello, I have successfully run an audio model from Edge Impulse on my ESP32 TTGO T-Display (1.14 inch) with no PSRAM.
Can you tell me more about your problem? Maybe I can help.
Can you show the error messages displayed, if there are any?

My board is an ESP32 TTGO T-Display with an INMP441 MEMS mic (omnidirectional).

Thank you for responding to my post. I have an ESP32-WROOM board (esp32dev) and have been working in PlatformIO using the Arduino framework. I have also used the espidf framework. My audio impulse works great when I test it with raw data in the features variable.
What I could really use is code that collects data from the inmp441 and successfully applies it to the edgeimpulse model I have.
Thanks

Once you have downloaded the model, add it to your Arduino IDE as a library.
Use the example that ships with the model:
filename: model-name/examples/esp32/esp32_microphone/esp32_microphone.ino

Modify the part that configures the pins so it matches the pins you used for your INMP441 mic,

and upload it to your board. If you get any errors, paste them here so I can help you fix them.

On my end,

  i2s_config_t i2s_config = {
      .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
      .sample_rate = sampling_rate,
      .bits_per_sample = (i2s_bits_per_sample_t)16,
      .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
      .communication_format = I2S_COMM_FORMAT_I2S,
      .intr_alloc_flags = 0,
      .dma_buf_count = 8,
      .dma_buf_len = 512,
      .use_apll = false,
      .tx_desc_auto_clear = false,
      .fixed_mclk = -1,
  };
  i2s_pin_config_t pin_config = {
      .bck_io_num = 27,    // IIS_SCLK
      .ws_io_num = 26,     // IIS_LCLK
      .data_out_num = -1,  // IIS_DSIN
      .data_in_num = 25,   // IIS_DOUT
  };

This is the only part I needed to modify to test whether the code works. Make sure that your pin connections are accurate.
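
As a rough sanity check on the DMA settings in the config above, here is a minimal sketch of how much RAM the DMA buffers take and how much audio they can hold. It assumes dma_buf_len counts frames, that ONLY_LEFT means one channel of 16-bit samples, and a sample rate of 16000 Hz (the thread does not state the real value of sampling_rate). The struct and function names are made up for illustration and are not part of the ESP32 I2S driver.

```cpp
#include <cstddef>
#include <cstdint>

// Mirrors the buffer-related fields of the i2s_config_t above.
// Hypothetical helper type, not an ESP-IDF structure.
struct DmaSettings {
    std::size_t buf_count;         // .dma_buf_count
    std::size_t buf_len_frames;    // .dma_buf_len, assumed to be counted in frames
    std::size_t bytes_per_sample;  // 16-bit samples -> 2
    std::size_t channels;          // I2S_CHANNEL_FMT_ONLY_LEFT -> assumed 1
};

// Total RAM taken by the DMA buffers, in bytes.
constexpr std::size_t dma_total_bytes(DmaSettings s) {
    return s.buf_count * s.buf_len_frames * s.bytes_per_sample * s.channels;
}

// Milliseconds of audio the DMA buffers can hold if nothing reads them.
// Returns 0 for a zero sample rate.
constexpr std::uint64_t dma_capacity_ms(DmaSettings s, std::uint32_t sample_rate_hz) {
    return sample_rate_hz == 0
        ? 0
        : static_cast<std::uint64_t>(s.buf_count) * s.buf_len_frames * 1000 / sample_rate_hz;
}

// 8 and 512 come from the config above; the rest are the assumptions listed.
constexpr DmaSettings kSettings{8, 512, 2, 1};
```

Under those assumptions, dma_total_bytes(kSettings) is 8192 bytes and dma_capacity_ms(kSettings, 16000) is 256 ms, so the code reading the microphone has to come back for data at least that often.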

I apologize for being dense but where is the ino file you mention located?

ok I think I found it, you are referring to the samples folder in the zip file downloaded from EdgeImpulse. I’ll explore it. I think I did before but I’ll give it another go because it seems to work for you.

Can you tell me your sample rate? Mine is 96000, and the esp32_microphone example needs sizeof(int16) times that in a malloc call. My ESP32 does not have enough memory for that. I have tried esp_32_microphone_continuous, which only needs a short instead of the int16, and it builds and runs fine, but the INMP441 still seems to be giving it garbage.
Thanks
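
The memory problem described above comes down to simple arithmetic, sketched below. The 96000 figure is the one quoted in the post; the size of the largest free heap block is a made-up placeholder, since it depends on the board and on what else the sketch allocates. On a real ESP32 that number could be read with heap_caps_get_largest_free_block(MALLOC_CAP_8BIT) from esp_heap_caps.h.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed to hold sample_count 16-bit PCM samples in one buffer.
constexpr std::size_t pcm16_buffer_bytes(std::size_t sample_count) {
    return sample_count * sizeof(std::int16_t);
}

// malloc needs one contiguous block, so compare against the largest free
// block rather than the total free heap.
constexpr bool fits_in_one_block(std::size_t needed_bytes,
                                 std::size_t largest_free_block_bytes) {
    return needed_bytes <= largest_free_block_bytes;
}

// 96000 is the figure quoted in the post above.
constexpr std::size_t kNeededBytes = pcm16_buffer_bytes(96000);

// Hypothetical value for illustration only; measure it on your own board.
constexpr std::size_t kLargestFreeBlockBytes = 110 * 1024;
```

Here kNeededBytes works out to 192000 bytes, which does not fit in the assumed 110 KB block, so a single malloc of that size would fail on such a board.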

My board can only work at a maximum sample rate of 52,920,
with window size 1200 ms and window increase 200 ms.
esp32_microphone.ino is also the only example I use. The continuous example always overflows and can't run on my board.

(Don’t exceed this setup.)
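
To see what the window size and window increase above mean for the training data, here is a small sketch that counts how many windows a sliding window cuts out of one recording. It assumes clips shorter than the window are skipped rather than padded; the function name and the 3000 ms clip length in the note below are illustrative only.

```cpp
#include <cstddef>
#include <cstdint>

// Number of windows of length window_ms that fit in a clip of clip_ms when
// the window start advances by increase_ms each step.
// Returns 0 if the clip is shorter than the window or the step is zero.
constexpr std::size_t window_count(std::uint32_t clip_ms,
                                   std::uint32_t window_ms,
                                   std::uint32_t increase_ms) {
    return (increase_ms == 0 || clip_ms < window_ms)
        ? 0
        : static_cast<std::size_t>((clip_ms - window_ms) / increase_ms) + 1;
}
```

With the 1200 ms window and 200 ms increase from this post, a hypothetical 3000 ms recording gives window_count(3000, 1200, 200) == 10 windows, while a recording of exactly 1200 ms gives one.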

Also, before you upload the code,
put a delay of at least 2 seconds at the end of the loop (the void loop function),
so that the board has enough time to process the audio without overflowing.

Once you have successfully run the model, you can play with the parameters in Edge Impulse.
You can also use 2 processing blocks and 1 learning block in a single Edge Impulse model.

Note that to build a model with 2 or more processing blocks you will need to use the free trial offered by Edge Impulse, since building a model in free mode has a maximum job time of around 20 minutes.

If you need help with MFCC parameters that work, I can give you mine so you can run the model on your board.

For the MFCC parameters you can use this as a guide:

Number of coefficients: 13
Frame length: 0.025
Frame stride: 0.02
Filter number: 32
FFT length: 2048 (halve this if it is too much)
Normalization window size: 151
Low frequency: 400
High frequency: Not set
Pre-emphasis coefficient: 0.98
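
To give a feel for what two of the parameters above do, here is a minimal sketch of the textbook frame-count formula and a textbook pre-emphasis filter. It is not Edge Impulse's implementation, which may pad the signal or handle the first sample differently. The function names are made up, and the 16000 Hz sample rate used in the note below is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of full frames in a signal, with no padding. All lengths are in
// samples. Returns 0 if the signal is shorter than one frame.
constexpr std::size_t frame_count(std::size_t signal_len,
                                  std::size_t frame_len,
                                  std::size_t stride) {
    return (stride == 0 || signal_len < frame_len)
        ? 0
        : (signal_len - frame_len) / stride + 1;
}

// Textbook pre-emphasis: y[0] = x[0], y[n] = x[n] - coeff * x[n-1].
// It boosts high frequencies relative to low ones before the FFT.
std::vector<float> pre_emphasis(const std::vector<float>& x, float coeff) {
    assert(coeff >= 0.0f && coeff <= 1.0f);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = (i == 0) ? x[0] : x[i] - coeff * x[i - 1];
    }
    return y;
}
```

At an assumed 16000 Hz, a 1200 ms window is 19200 samples, a 0.025 s frame is 400 samples, and a 0.02 s stride is 320 samples, so frame_count(19200, 400, 320) is 59. With 13 coefficients per frame that would be 59 * 13 = 767 features per window under these assumptions. A constant input such as {1, 1, 1} is pushed close to zero after the first sample by pre_emphasis with a coefficient of 0.98, which shows how the filter suppresses slowly changing content.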

These parameters may exceed the 20-minute job limit.
The Enterprise plan has no job time limit.
If you want access to the Edge Impulse Enterprise plan, you can do this:

Use an email address on a temporary domain. You can get one at wix.com: create an account there, go through the site creation steps, and generate a website. After that, Wix provides a temporary email address, which you can use at Edge Impulse to get access to the Enterprise plan (14-day trial).

You can also refer to this thread I created where I experimented on my own at first while figuring things out with Edge Impulse.

“ESP32 INMP441 Stuck at microphone_inference_record(); – No Error, No Data Received”

Thank you so much for this information. I have successfully got the INMP441 working now. I find that the continuous example works better for me. Where I am at now is trying to figure out how to make my impulse better. Here is my current jam, and maybe you have an idea. I have a 14-word set I’m trying to deal with. 5 of those words are in the Google voice library, which has 3 thousand or more wave files for each. These words are working pretty well for my application. The other 9 I’ve had to create using my phone, with my wife and me saying them. I only have about 40 files for each. Needless to say, these are not being recognized very well at all. Do you have any suggestions? Is my problem just the file count, or is there some setting that I need to set when I make my impulse?
Again, thank you for your ideas. I’m glad you “forced” me to use an Arduino IDE library approach to my coding. I’ll get that transferred over to PlatformIO once I’m confident I’ve got a working solution.

Increasing the accuracy?
In my latest experiment on improving the Edge Impulse model's accuracy,
I tried using 2 processing blocks (MFE and MFCC) and 1 learning block (classifier).
After tinkering with each of their parameters, I increased the accuracy from 60% to 85% in actual testing. This method requires the Enterprise plan (I recommend the free trial) or the Plus (paid) plan for training, since free users are limited to jobs of 20 to 30 minutes on Edge Impulse.

For the datasets, I suggest an equal amount (in minutes) of audio for each sound, to decrease bias.
Also, in the classifier tab (learning block), increase the training cycles to 100 or 150 and decrease the learning rate to 0.001.

In my projects, the sounds I use also have as little background noise as possible.
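
The advice above about giving each sound an equal amount of audio can be checked with a quick calculation. This is a minimal sketch: the struct, the function name and the labels are made up, and it assumes every clip is about one second long (the thread does not give the clip lengths), so 3000 clips are roughly 3000 seconds and 40 clips roughly 40 seconds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Total recorded audio for one class. Hypothetical helper type.
struct ClassAudio {
    std::string label;
    double seconds;
};

// Ratio between the class with the most audio and the class with the least.
// 1.0 means perfectly balanced; larger values mean more bias toward the
// well-represented classes.
double imbalance_ratio(const std::vector<ClassAudio>& classes) {
    assert(!classes.empty());
    double lo = classes[0].seconds;
    double hi = classes[0].seconds;
    for (std::size_t i = 1; i < classes.size(); ++i) {
        if (classes[i].seconds < lo) lo = classes[i].seconds;
        if (classes[i].seconds > hi) hi = classes[i].seconds;
    }
    assert(lo > 0.0);  // every class needs at least some audio
    return hi / lo;
}
```

For the dataset described earlier in the thread, a library word with about 3000 seconds of audio against a self-recorded word with about 40 seconds gives a ratio of 75 under these assumptions, which is a long way from the balanced set suggested above.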