I'm developing an audio classification project using the Edge Impulse SDK on an ESP32-S3 with ESP-IDF. I am encountering a severe inference issue after replacing the default TFLite file with a custom model.
The issue is that my custom-trained model, despite matching the input/output dimensions, consistently returns a near-constant, biased prediction for one class, and performance is extremely poor.
Question/Issue:
I replaced the original TFLite model (likely a MobilenetV2) in the SDK example with my own custom EfficientNet Lite model. The generated SDK files show the original model was using 8-bit integer quantization (Int8) for the learning block and classification.
My custom TFLite model was trained externally in TensorFlow. Even after converting it to TFLite (I have tested both Float32 and an equivalent Int8 quantized version), the inference on the ESP32-S3 using the Edge Impulse SDK runtime returns an almost constant, high prediction for class 1 (approx. 0.99973), regardless of the audio input.
The logs show:
```
I (327196) EI_TASK: Timing: DSP 78 ms, Inference 232 ms, Anomaly 0 ms
I (327196) EI_TASK: Predictions:
I (327196) EI_TASK:     classe 1: 0.99973
I (327196) EI_TASK:     classe 2: 0.00027
```
(The 9489 ms inference time is also unacceptably high for this platform.)
Project ID:
840129 (Extracted from ei_classifier_model_variables.h)
Context/Use case:
Real-time audio classification (distinguishing "dabi" vs. "oi_dabi"). I need to deploy an efficient, custom-trained model (EfficientNet Lite) on a low-power, constrained microcontroller (ESP32-S3).
Steps Taken:
- Downloaded the Edge Impulse Audio Classification SDK example.
- Trained a custom EfficientNet Lite model offline in TensorFlow using a balanced audio dataset.
- The DSP settings (Audio MFE) used in my TensorFlow pipeline match the configuration in the generated code (e.g., `num_filters = 40`, `frame_length = 0.02f`, etc.).
- Converted the model to TFLite, taking care to ensure the input/output shapes are correct.
- Replaced the original TFLite file with my custom TFLite.
- Compiled and deployed using ESP-IDF for the ESP32-S3.
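For context, the conversion step above follows the standard TFLite full-integer quantization path. A minimal, self-contained sketch of it is below; note the tiny Dense model and the random representative data are stand-ins for my actual EfficientNet Lite and real MFE feature vectors, used here only to keep the sketch runnable:

```python
import numpy as np
import tensorflow as tf

# Stand-in for the real EfficientNet Lite; the input size matches the
# 3960 MFE features produced by the DSP block.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3960,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])

def representative_dataset():
    # In the real pipeline these samples are actual MFE feature vectors;
    # random data keeps the sketch self-contained.
    for _ in range(100):
        yield [np.random.rand(1, 3960).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer quantization with int8 input/output, which is what
# the generated EI SDK wrapper (quantized = 1, process_classification_i8)
# appears to expect.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_quantized.tflite", "wb") as f:
    f.write(tflite_model)
```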
Expected Outcome:
The EfficientNet Lite model should load and perform inference correctly, yielding varying probability scores that accurately classify the input audio, reflecting its strong performance in the validation set.
Actual Outcome:
The model produces a near-constant output for class 1 and exhibits an extremely long inference time (~9.5 seconds).
Reproducibility:
- [x] Always
Environment:
- Platform: ESP32-S3
- Build Environment Details: ESP-IDF
- OS Version: [e.g., Ubuntu 22.04, Windows 10 - Please fill this in]
- Edge Impulse Version (Firmware): [Please find this version]
- Edge Impulse CLI Version: [Please find this version]
- Project Version: 2 (extracted from `deploy_version`)
- Custom Blocks / Impulse Configuration:
  - DSP Block: Audio MFE (input size: 3960 features).
- Model Configuration Details (from generated code):
  - `quantized = 1` (the model is expected to be Int8 quantized)
  - `postprocess_fn = &process_classification_i8`
  - `zero_point = -128`, `scale = 0.00390625` (crucial post-processing parameters)
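To make the role of those two parameters concrete, here is the standard TFLite affine dequantization with exactly these values; I am assuming `process_classification_i8` follows this scheme:

```python
# Standard TFLite affine dequantization: real = scale * (q - zero_point).
# Assumption: process_classification_i8 applies this scheme with the
# parameters baked into the generated code.
SCALE = 0.00390625   # 1/256, from the generated code
ZERO_POINT = -128

def dequantize(q: int) -> float:
    return SCALE * (q - ZERO_POINT)

# With these parameters an int8 output can only span [0.0, 0.99609375]:
print(dequantize(-128))  # 0.0
print(dequantize(127))   # 0.99609375
```

If my custom model's own output scale/zero point differ from these hardcoded values, every quantized output would be mapped through the wrong affine transform.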
Logs/Attachments:
[Include the TFLite file if possible, or a screenshot of the Netron viewer showing its input/output layers and data types (Int8 or Float32).]
Additional Information:
Given the constant output and the high inference time, I strongly suspect a quantization mismatch or data handling issue in the Edge Impulse TFLite runtime:
- Quantization Mismatch: Even if I provide an Int8 TFLite model, the zero point and scale values used by the original EI model (`zero_point = -128`, `scale = 0.00390625`) may not match the zero point and scale of my custom Int8 TFLite model, causing incorrect dequantization and a constant output.
- Model Compatibility: The TFLite runtime on the ESP32-S3 might not fully support all EfficientNet Lite operators in the generated SDK wrapper, leading to a slow fallback or broken inference.
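To rule out the model itself before blaming the on-device runtime, I am sanity-checking the int8 model on the desktop TFLite interpreter, quantizing the input and dequantizing the output with the model's *own* parameters rather than the SDK's hardcoded ones. A self-contained sketch (the dummy Dense model again stands in for my EfficientNet Lite):

```python
import numpy as np
import tensorflow as tf

def run_int8_tflite(model_content: bytes, features: np.ndarray) -> np.ndarray:
    """Quantize float features with the model's own parameters, run
    inference, and dequantize the output with the model's own parameters.
    If this gives varying, sensible probabilities while the device gives
    a constant one, a parameter mismatch in the SDK wrapper is likely."""
    interpreter = tf.lite.Interpreter(model_content=model_content)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    in_scale, in_zp = inp["quantization"]
    q = np.clip(np.round(features / in_scale) + in_zp, -128, 127).astype(np.int8)
    interpreter.set_tensor(inp["index"], q.reshape(inp["shape"]))
    interpreter.invoke()

    out_scale, out_zp = out["quantization"]
    raw = interpreter.get_tensor(out["index"]).astype(np.float32)
    return out_scale * (raw - out_zp)

# Dummy int8 model standing in for the real one, to keep this runnable.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3960,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = lambda: (
    [np.random.rand(1, 3960).astype(np.float32)] for _ in range(100)
)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

probs = run_int8_tflite(tflite_model, np.random.rand(1, 3960).astype(np.float32))
print(probs)  # should vary with the input and sum to roughly 1
```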
Could you please confirm the requirements for custom Int8 TFLite models, especially regarding the necessary zero point and scale values, or whether there's a known compatibility issue with EfficientNet Lite on the ESP32-S3 using the EI SDK?