Ways to Increase Accuracy of EI model on ESP32 TTGO T-Display Board

I'm looking for ways to increase the accuracy of audio classification on the ESP32 TTGO T-Display board.

Project ID : 628947

Board limitations I have encountered so far:

-ESP32 TTGO T-Display

  • No PSRAM

  • Max EI Raw Sample Count so far: 52,920

  • Max EI Window size so far: 1200ms

  • Max EI Window Stride so far: 800ms

  • Window Slice per Model : 4

  • Classifier Interval (in ms) : 0.02
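As a rough sanity check on the limits above, the raw sample count can be turned into a buffer size. This is only a sketch: the 44.1 kHz sampling rate is an assumption inferred from 52,920 samples / 1.2 s, the 16-bit sample width is also an assumption, and the 16 kHz figure is a hypothetical comparison point rather than a setting taken from this project.

```python
# Rough RAM estimate for the raw audio window (illustrative only).
SAMPLE_RATE_HZ = 44_100   # assumption: inferred from 52,920 samples / 1.2 s
BYTES_PER_SAMPLE = 2      # assumption: samples stored as 16-bit PCM


def raw_buffer(window_ms, sample_rate_hz=SAMPLE_RATE_HZ,
               bytes_per_sample=BYTES_PER_SAMPLE):
    """Return (sample_count, size_in_bytes) for one raw audio window."""
    samples = window_ms * sample_rate_hz // 1000
    return samples, samples * bytes_per_sample


# The 1200 ms window reported above, at the assumed 44.1 kHz rate.
samples, size = raw_buffer(1200)
print(samples, size)  # 52920 105840

# Hypothetical comparison: the same window at 16 kHz.
print(raw_buffer(1200, sample_rate_hz=16_000))  # (19200, 38400)
```

Under these assumptions the raw window alone takes about 103 KiB, which is a large share of the RAM on a board without PSRAM, before the DSP and model buffers are counted.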

Components:

  1. ESP32 TTGO T-Display

  2. INMP441 MEMS mic

Target Audios to be classified:

  1. Emergency Vehicles (Ambulance, Firetruck, Police Car - anything with sirens)
  2. Gun Shots
  3. Car Horns
  4. IDLE (added this for background noise such as whispering, or any harmless sounds commonly found outdoors)

I need suggestions on the following:

  1. Processing Block Parameters
  2. Learning Block:
     • Training settings
     • Advanced training settings
     • Audio training options
     • Neural network architecture

Also, any suggestions on using Expert Mode?

Hi @dasigjp2017

If you are not already doing so, try using MFCC or MFE features, and experiment with window size (typically 500-1500 ms) and stride to balance temporal resolution and memory.
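To make the window/stride trade-off concrete, here is a minimal sketch of the usual sliding-window arithmetic. The function name `count_windows` and the single continuous 10-minute recording are assumptions for illustration; a real dataset is normally split into many shorter samples, so actual counts will differ.

```python
def count_windows(clip_ms, window_ms, stride_ms):
    """Number of full windows that fit in a clip when sliding by stride_ms."""
    if clip_ms < window_ms:
        return 0
    return (clip_ms - window_ms) // stride_ms + 1


ten_minutes_ms = 10 * 60 * 1000  # hypothetical continuous recording

# Same 1200 ms window, two different strides.
print(count_windows(ten_minutes_ms, 1200, 800))  # 749
print(count_windows(ten_minutes_ms, 1200, 200))  # 2995
```

A smaller stride yields more, heavily overlapping training windows from the same audio, while the window size is what sets the length of the buffer each inference needs.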

Please see our model performance guide (Increasing model performance | Edge Impulse Documentation) for more general information, or the documentation pages for the individual blocks.

Best

Eoin

I tried MFCC with a window size of 1200 ms and a window increase of 200 ms. Beyond this, the board starts rebooting nonstop.
For now I have started trying other processing blocks such as Spectrogram; next will be Spectral Analysis and Raw Data. I will train and test whichever model fits my board's limits and has high accuracy in real-world scenarios.

I'm still having a hard time understanding the Neural Network Architecture’s configuration options, and I'm still studying which sequence of layers and blocks would improve the model's ability to classify sound while still fitting the board I'm using.
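One way to reason about whether an architecture might fit is to count its trainable parameters by hand. The sketch below assumes a hypothetical small stack (two 1D convolutions with "same" padding, each followed by max pooling with pool size 2, then a dense output layer) and a hypothetical input of 60 frames by 13 coefficients; none of these numbers come from the project itself.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a 1D convolution layer."""
    return kernel_size * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


def estimate_params(frames, coeffs, conv_filters=(8, 16),
                    kernel_size=3, classes=4):
    """Parameter count for: [Conv1D(same) -> MaxPool(2)]* -> Flatten -> Dense."""
    total = 0
    channels = coeffs
    for filters in conv_filters:
        total += conv1d_params(kernel_size, channels, filters)
        channels = filters
        frames //= 2  # max pooling with pool size 2 halves the time axis
    total += dense_params(frames * channels, classes)
    return total


# Hypothetical input: 60 frames x 13 coefficients, 4 output classes.
print(estimate_params(60, 13))                         # 1684
print(estimate_params(60, 13, conv_filters=(16, 32)))  # 4132
```

With int8 quantization each parameter takes roughly one byte, so the weights of a model this small are modest. Parameter count is only part of the memory picture, though: the audio buffer, the feature buffers, and the intermediate activations also need RAM.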

Hmm, you can also try another configuration of blocks, e.g. Spectrogram and Raw Audio, and see if that reduces the size a bit. I’m not sure about the TTGO board, though; it may have some portion reserved for the display?

Best

Eoin

These past few days I tried using two processing blocks, MFCC and MFE, matching their parameters as closely as possible.

Fortunately, the accuracy in real-world testing has increased from 40% to at least 85%.

I have 4 labels in my dataset, each with about 10 to 15 min of training data and 2 to 4 min of testing data.
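For reference, the share of data held out for testing can be worked out from those durations. This is a small sketch; the helper name `holdout_share` is made up for illustration, and the two calls simply take the extreme combinations of the ranges quoted above.

```python
def holdout_share(train_min, test_min):
    """Fraction of a label's total audio that is held out for testing."""
    return test_min / (train_min + test_min)


# Extremes of the quoted ranges: 10-15 min training, 2-4 min testing.
low = holdout_share(train_min=15, test_min=2)
high = holdout_share(train_min=10, test_min=4)
print(f"{low:.1%} to {high:.1%}")  # 11.8% to 28.6%
```

A commonly used rule of thumb is roughly an 80/20 train/test split, so checking each label against that range is a quick way to spot a class whose test set is too thin to trust.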