Ways to Increase Accuracy of EI model on ESP32 TTGO T-Display Board

I'm looking for ways to increase the accuracy of audio classification on the ESP32 TTGO T-Display board.

Project ID : 628947

Board limitations I have encountered so far:

-ESP32 TTGO T-Display

  • No PSRAM

  • Max EI Raw Sample Count so far: 52,920

  • Max EI Window size so far: 1200ms

  • Max EI Window Stride so far: 800ms

  • Window Slices per Model: 4

  • Classifier Interval (in ms): 0.02
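The figures in this list can be cross-checked with a little arithmetic, which also shows why the raw audio buffer is a tight fit on a board without PSRAM. The sketch below is only an estimate: the sample rate is inferred from the sample count and window size rather than stated in the post, 16-bit samples are an assumption, the even split of the window into slices is an assumption, and the 16 kHz rate is a hypothetical comparison point.

```python
# Figures quoted in the list above.
RAW_SAMPLE_COUNT = 52_920   # max EI raw sample count
WINDOW_MS = 1200            # max EI window size
SLICES_PER_WINDOW = 4       # window slices per model

# Inferred, not stated in the post: samples per second implied by the two figures.
sample_rate_hz = RAW_SAMPLE_COUNT * 1000 // WINDOW_MS

# Time between two samples; rounds to the 0.02 ms classifier interval listed above.
interval_ms = round(1000 / sample_rate_hz, 2)

# Samples per slice, assuming the window is split evenly into slices.
slice_samples = RAW_SAMPLE_COUNT // SLICES_PER_WINDOW


def window_buffer_bytes(rate_hz, window_ms, bytes_per_sample=2):
    """Bytes needed for one window of raw audio (assumes 16-bit samples by default)."""
    return rate_hz * window_ms // 1000 * bytes_per_sample


print(sample_rate_hz)                                   # 44100
print(interval_ms)                                      # 0.02
print(slice_samples)                                    # 13230
print(window_buffer_bytes(sample_rate_hz, WINDOW_MS))   # 105840
print(window_buffer_bytes(16_000, WINDOW_MS))           # 38400 (hypothetical 16 kHz rate)
```

Under these assumptions a single 1200 ms window of raw audio takes roughly 106 KB, before any feature buffers or model memory are counted.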

Components:

  1. ESP32 TTGO T-Display

  2. INMP441 MEMS microphone

Target sounds to be classified:

  1. Emergency Vehicles (Ambulance, Fire Truck, Police Car - anything with sirens)
  2. Gunshots
  3. Car Horns
  4. IDLE (added for background noise such as whispering, or any harmless sounds commonly found outdoors)

I need suggestions on the following:

  1. Processing Block Parameters
  2. Learning Block:
       • Training Settings
       • Advanced Training Settings
       • Audio Training Options
       • Neural Network Architecture

Also, do you have any suggestions on using Expert Mode?

Hi @dasigjp2017

If you are not already doing so, try using MFCC or MFE features, and experiment with window size (typically 500-1500 ms) and stride to balance temporal resolution against memory use.
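To see how window size and stride trade off, it can help to count how many training windows one recording yields. This is a minimal sketch of generic sliding-window arithmetic, assuming the tail of a clip is dropped rather than zero-padded; `count_windows` and the 5-second clip length are hypothetical names and values chosen for illustration, not part of the Edge Impulse API.

```python
def count_windows(clip_ms, window_ms, stride_ms):
    """Number of full windows a clip yields when the tail is not zero-padded."""
    if clip_ms < window_ms:
        return 0  # clip is shorter than a single window
    return (clip_ms - window_ms) // stride_ms + 1


# Hypothetical 5-second recording, with settings mentioned in this thread.
print(count_windows(5000, 1200, 800))  # 5
print(count_windows(5000, 1200, 200))  # 20
print(count_windows(1000, 1200, 200))  # 0
```

A smaller stride multiplies the number of (overlapping) training windows without changing the window size, so it does not by itself change the size of the buffer needed for one window.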

For more general information, please see our model performance guide or the documentation pages for the individual blocks: Increasing model performance | Edge Impulse Documentation

Best

Eoin

I tried MFCC with a window size of 1200 ms and a window increase of 200 ms. Beyond this, the board starts rebooting nonstop.
For now, I have started trying other processing blocks such as Spectrogram; next will be Spectral Analysis and Raw Data. I will train and test whichever model fits my board's limits and gives high accuracy in real-world scenarios.

I'm still having a hard time understanding the Neural Network Architecture configuration, and I'm still studying which sequences and blocks I can use to improve the model's ability to classify sound while still fitting the board I'm using.
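One rough way to compare candidate architectures before training is to count their weights, since an int8-quantized model stores about one byte per weight. The sketch below uses the standard parameter formulas for 1D convolution and dense layers. The layer sizes are purely hypothetical (13 input features per frame, two small convolutions, a flattened length of 16 x 14), with only the four output classes taken from this thread, and it ignores activation memory and runtime overhead.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a 1D convolution layer."""
    return kernel_size * in_channels * filters + filters


def dense_params(inputs, outputs):
    """Weights plus biases of a fully connected layer."""
    return inputs * outputs + outputs


# Hypothetical small architecture; every size here is an illustrative assumption.
layers = [
    conv1d_params(kernel_size=3, in_channels=13, filters=8),
    conv1d_params(kernel_size=3, in_channels=8, filters=16),
    dense_params(inputs=16 * 14, outputs=4),  # 4 classes: sirens, gunshots, horns, idle
]

total_params = sum(layers)
print(layers)        # [320, 400, 900]
print(total_params)  # 1620
```

Counting this way makes it easier to see which layer dominates the model size (here the final dense layer) when deciding what to shrink.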