Using Keyword Spotting on DFR1154 AI Camera

JpEncausse · April 23, 2025, 8:47pm

Question/Issue:
I want to implement an Edge Impulse edge AI model on the DFRobot DFR1154 AI Camera. Here is a first demo with ChatGPT.

I want to trigger keyword spotting before capturing audio and sending it to ChatGPT. I followed the Edge Impulse tutorial and tour, but I am struggling with the last step, implementing it on the ESP32. ChatGPT hallucinates instead of providing the right answers, and the board reboots when I call run_classifier().

Project ID:
My project ID should be 678141. I followed all the steps to generate a “Sarah” keyword model with the default “Quantized” option.

Context/Use case:
I’m new to Edge Impulse and just followed the tutorial because I want to give AI on the edge a try, and it seems the best way to do it.

The goal is to perform keyword spotting, then capture audio and send it to ChatGPT. See the YouTube video demo.

Steps Taken:

I followed the tutorial and trained on the “Sarah” keyword
Downloaded the default zip (Quantized, I assume)
Put together the library and the code to make it work

I think (and ChatGPT is obsessed with this) the problem comes

from a memory or size issue caused by the model (which I don’t believe)
from memory management or a leak with malloc
from the use of 16-bit vs 8-bit samples, etc.
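
To put a number on the first suspicion, here is a minimal sketch that adds up the two static buffers declared in the code further down (16000 int8_t samples plus 16000 float samples). The helper name static_audio_buffer_bytes is hypothetical, and the figure covers only these two arrays, not the model's tensor arena or any scratch memory the SDK allocates at run time.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes used by the two static audio buffers
// (one int8_t array and one float array of the same length).
constexpr std::size_t static_audio_buffer_bytes(std::size_t samples) {
  return samples * sizeof(std::int8_t) + samples * sizeof(float);
}

// 16000 samples = 1 second at 16 kHz, as in EI_MAX_AUDIO_SAMPLES.
constexpr std::size_t kAudioBufferBytes = static_audio_buffer_bytes(16000);
```

With 4-byte floats this comes to 80,000 bytes of static RAM before the classifier allocates anything.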

Expected Outcome:
I expect to get a value corresponding to the keyword spotting result.
Then I’ll put the code into a task.
Then I’ll trigger the rest of my code, which sends audio to ChatGPT, takes a picture, …
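
The expected flow can be sketched as a simple threshold check on the classifier scores. This is only an illustration: LabelScore is a hypothetical stand-in for the label/value pairs in result.classification, and the 0.8 threshold used in the usage note is an arbitrary assumption, not a value from the project.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for one entry of result.classification.
struct LabelScore {
  const char* label;
  float value;
};

// Returns true when `keyword` is present and its score reaches `threshold`.
bool keyword_detected(const LabelScore* scores, std::size_t count,
                      const char* keyword, float threshold) {
  for (std::size_t i = 0; i < count; i++) {
    if (std::strcmp(scores[i].label, keyword) == 0) {
      return scores[i].value >= threshold;
    }
  }
  return false;  // keyword label not found in the results
}
```

A call such as keyword_detected(scores, count, "Sarah", 0.8f) would then gate the audio capture and the ChatGPT request.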

Actual Outcome:
The board reboots on run_classifier().

Reproducibility:

[x] Always

Environment:

Platform: Arduino v3 on DFR1154_ESP32_S3_AI_CAM
Build Environment Details: PlatformIO on VSCode
OS Version: Windows 11
Edge Impulse Version (Firmware): ? Major 1 / Minor 71 / Patch 29
Project Version: 2

Logs/Attachments:
No logs. It compiles correctly and fails right after Serial.println("Run classifier...");

Additional Information:

I’m using PlatformIO and had a hard time figuring out that I just have to declare the zip by itself in platformio.ini. It is configured to use Arduino v3, and therefore ESP_I2S.h.

[platformio]
src_dir = src

[env:esp32-s3-aicam]
platform = https://github.com/pioarduino/platform-espressif32/releases/download/stable/platform-espressif32.zip
board = esp32-s3-devkitc1-n16r8
framework = arduino

build_flags =
    -w
    -DBOARD_HAS_PSRAM
    -DARDUINO_USB_CDC_ON_BOOT=1
    -DCORE_DEBUG_LEVEL=1
    -DCONFIG_WIFI_ENABLED

lib_deps =
    mathertel/OneButton@^2.6.1
    gilmaimon/ArduinoWebsockets@^0.5.4
    bblanchon/ArduinoJson@^7.3.1
    ./lib/keyword-spotting-v2.zip

With the help of ChatGPT, I use #include "ESP_I2S.h":

#include "ESP_I2S.h"
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"
#include "model-parameters/model_metadata.h"
#include "tflite-model/tflite_learn_5_compiled.h"
#define EI_CLASSIFIER_USE_QUANTIZED 1
#define EI_MAX_AUDIO_SAMPLES 16000  // 1 second at 16 kHz
static int8_t audio_data_int8[EI_MAX_AUDIO_SAMPLES];
static float audio_data_float[EI_MAX_AUDIO_SAMPLES];

void runEdgeImpulse() {
  Serial.println("Record 1 second of audio...");
  size_t wav_size = 0;
  uint8_t* wav_buffer = i2s_rec.recordWAV(1, &wav_size);
  if (!wav_buffer) {
    Serial.println("Error: wav_buffer");
    return;
  }

  Serial.println("Prepare audio...");
  int16_t* audio_data_raw = (int16_t*)wav_buffer;
  size_t sample_count = wav_size / sizeof(int16_t);
  if (sample_count > EI_MAX_AUDIO_SAMPLES) sample_count = EI_MAX_AUDIO_SAMPLES;

  Serial.println("Convert int16_t to int8_t...");
  for (size_t i = 0; i < sample_count; i++) {
    int val = audio_data_raw[i] >> 8;
    if (val > 127) val = 127;
    if (val < -128) val = -128;
    audio_data_int8[i] = (int8_t)val;
  }

  Serial.println("Convert int8_t to float...");
  if (numpy::int8_to_float(audio_data_int8, audio_data_float, sample_count) != 0) {
    Serial.println("Error: int8_to_float");
    free(wav_buffer);
    return;
  }

  Serial.println("Build signal...");
  signal_t signal;
  if (numpy::signal_from_buffer(audio_data_float, sample_count, &signal) != 0) {
    Serial.println("Error: signal_from_buffer");
    free(wav_buffer);
    return;
  }

  Serial.println("Run classifier...");
  ei_impulse_result_t result = {};
  if (run_classifier(&signal, &result, false) != EI_IMPULSE_OK) {
    Serial.println("Error: run_classifier");
    free(wav_buffer);
    return;
  }

  Serial.println("Print results...");
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
    ei_printf("%s:\t%.5f\n", result.classification[ix].label, result.classification[ix].value);
  }

  free(wav_buffer);
}
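
On the 16-bit vs 8-bit question, here is a minimal sketch of an alternative conversion that goes straight from 16-bit PCM to float, with no int8_t step. It rests on assumptions not verified on this board: that the buffer from recordWAV() starts with a canonical 44-byte WAV header followed by little-endian 16-bit mono samples, and that the classifier wants float values in the int16 range and an exact number of samples. The function name, kWavHeaderBytes, and the zero padding are hypothetical choices for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: canonical PCM WAV header length; real buffers may differ.
constexpr std::size_t kWavHeaderBytes = 44;

// Copies 16-bit little-endian PCM samples from a WAV buffer into `out` as
// floats in the int16 range, then zero-pads the rest of `out` up to `out_len`.
// Returns the number of real (non-padding) samples written.
std::size_t wav_pcm16_to_float(const std::uint8_t* wav, std::size_t wav_size,
                               float* out, std::size_t out_len) {
  std::size_t n = 0;
  if (wav != nullptr && wav_size > kWavHeaderBytes) {
    const std::uint8_t* pcm = wav + kWavHeaderBytes;  // skip the header
    n = (wav_size - kWavHeaderBytes) / 2;             // 2 bytes per sample
    if (n > out_len) n = out_len;
    for (std::size_t i = 0; i < n; i++) {
      // Assemble each sample byte by byte to avoid unaligned int16_t reads.
      const std::uint16_t raw =
          static_cast<std::uint16_t>(pcm[2 * i] | (pcm[2 * i + 1] << 8));
      out[i] = static_cast<float>(static_cast<std::int16_t>(raw));
    }
  }
  for (std::size_t i = n; i < out_len; i++) {
    out[i] = 0.0f;  // pad with silence so the length is always out_len
  }
  return n;
}
```

The resulting float buffer could then be passed to numpy::signal_from_buffer() in place of audio_data_float.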

The required ei_shim.cpp contains:

extern "C" {
    #include <stdlib.h>
    #include <stdio.h>
    #include <stdarg.h>
    #include "esp_timer.h"

    void* ei_malloc(size_t size) { return malloc(size); }
    void* ei_calloc(size_t n, size_t size) { return calloc(n, size); }
    void ei_free(void* ptr) { free(ptr); }

    void ei_printf(const char *format, ...) {
        va_list args;
        va_start(args, format);
        vprintf(format, args);
        va_end(args);
    }

    void ei_printf_float(float f) { printf("%f\n", f); }

    uint64_t ei_read_timer_us() { return esp_timer_get_time(); }

    bool ei_run_impulse_check_canceled() { return false; }
}

JpEncausse · June 4, 2025, 12:12pm

Does anyone have any clues or pointers?