Question/Issue:
Inference result is always fixed to the same class (“2”) with confidence ~0.996, regardless of what the camera sees. Additionally, when attempting to fix this issue by modifying the image capture buffer handling, a CORRUPT HEAP crash occurs after the first inference attempt, causing an infinite reboot loop.
Project ID: 983735
Context/Use case:
Building a real-time hand gesture recognition system using an ESP32-S3 with an OV3660 camera module. The camera captures JPEG frames, converts them to RGB888 via fmt2rgb888, resizes from 320×240 to 96×96 using Edge Impulse’s crop_and_interpolate_rgb888, then feeds grayscale float values to the classifier via getSignalData callback. A web server streams the live camera feed simultaneously.
Steps Taken:
- Captured a JPEG frame from the camera (QVGA 320×240), converted it to RGB888 using fmt2rgb888, resized it to 96×96 using crop_and_interpolate_rgb888, then passed grayscale float values to run_classifier via the getSignalData callback.
- Confirmed camera buffer values change every inference cycle (capture itself is working):
buf[0]=167, buf[1]=168, buf[2]=165
buf[0]=108, buf[1]=114, buf[2]=113
buf[0]=216, buf[1]=222, buf[2]=228
- Fixed index calculation bug in getSignalData (changed offset*3 + i*3 → (offset+i)*3).
- Reduced snapshot_buf allocation from 320×240×3 to 96×96×3 since only resized data needs to be stored.
- Moved fmt2rgb888 target buffer from internal SRAM to PSRAM (MALLOC_CAP_SPIRAM) due to insufficient internal SRAM for a 230,400-byte buffer.
- Applied standard grayscale conversion formula: 0.299*r + 0.587*g + 0.114*b.
- Tried normalizing output to 0.0~1.0 range (gray / 255.0f).
Despite all of the above, inference result remains fixed at class “2” with confidence 0.996. Some modifications additionally triggered a CORRUPT HEAP crash on first inference.
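The buffer sizes mentioned in the steps above can be sanity-checked with a short calculation. The sketch below is plain C++ with hypothetical helper names (rgb888_bytes, center_crop_side). It also computes the size of a square center crop of the 320×240 frame, on the assumption (not verified against the SDK source here) that a "fit shortest" resize may hold such an intermediate image before scaling down to 96×96. If that intermediate were ever written into the 96×96×3 destination, it would not fit.

```cpp
#include <stddef.h>
#include <stdio.h>

// Bytes needed for a tightly packed RGB888 image (3 bytes per pixel).
constexpr size_t rgb888_bytes(size_t width, size_t height) {
    return width * height * 3;
}

// Side length of the largest centered square inside a frame.
// Hypothetical helper: assumes a square target and a "fit shortest" crop.
constexpr size_t center_crop_side(size_t width, size_t height) {
    return width < height ? width : height;
}

// Print the three sizes relevant to this pipeline.
void print_buffer_sizes() {
    const size_t side = center_crop_side(320, 240);
    printf("full frame : %zu bytes\n", rgb888_bytes(320, 240));
    printf("center crop: %zu bytes\n", rgb888_bytes(side, side));
    printf("model input: %zu bytes\n", rgb888_bytes(96, 96));
}
```

Calling print_buffer_sizes() shows the gap between the intermediate crop and the final 96×96×3 buffer, which may be worth comparing against the sizes actually passed to heap_caps_malloc.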
Expected Outcome:
The classifier should return different labels with varying confidence scores depending on the hand gesture shown to the camera, matching the behavior observed when testing the model directly on Edge Impulse’s platform using a smartphone.
Actual Outcome:
Issue 1 — Fixed inference result:
Result: 2 (0.996)
Result: 2 (0.996)
Result: 2 (0.996)
The classifier always outputs class “2” at ~0.996 confidence regardless of input.
Issue 2 — CORRUPT HEAP crash (occurs after buffer handling modifications):
PSRAM remaining: 8312355 bytes
Internal heap: 256584 bytes
CORRUPT HEAP: Bad head at 0x3d813134. Expected 0xabba1234 got 0x68656868
assert failed: multi_heap_free multi_heap_poisoning.c:259 (head != NULL)
The device crashes on first inference and reboots infinitely.
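One way to read the crash log: the heap poisoning check expected the marker 0xabba1234 in front of the block but found 0x68656868. The sketch below, using a hypothetical helper name, splits that word into its four bytes in little-endian memory order (the byte order of the ESP32-S3). The bytes come out as 104, 104, 101, 104, which look like ordinary 8-bit pixel intensities, so it seems plausible that image data was written over the block header. This is an interpretation of the log, not a confirmed diagnosis.

```cpp
#include <array>
#include <stdint.h>

// Split a 32-bit word into the bytes as they sit in little-endian memory:
// lowest address first. Hypothetical helper for reading the crash log.
std::array<uint8_t, 4> bytes_in_memory_order(uint32_t word) {
    return {{
        static_cast<uint8_t>(word & 0xFF),
        static_cast<uint8_t>((word >> 8) & 0xFF),
        static_cast<uint8_t>((word >> 16) & 0xFF),
        static_cast<uint8_t>((word >> 24) & 0xFF),
    }};
}
```

For example, bytes_in_memory_order(0x68656868) yields the byte sequence 0x68, 0x68, 0x65, 0x68.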
Reproducibility:
- [x] Always
Environment:
- Platform: ESP32-S3 DevKitC-1 (8MB Flash, 8MB PSRAM), OV3660 camera module
- Build Environment Details: PlatformIO + Arduino framework, board_build.arduino.memory_type = qio_opi, EI_CLASSIFIER_TFLITE_ARENA_SIZE = 2097152
- OS Version: Windows 10
- Edge Impulse Version (Firmware): EI_STUDIO_VERSION_MAJOR = 1, EI_STUDIO_VERSION_MINOR = 93, EI_STUDIO_VERSION_PATCH = 3
- Edge Impulse CLI Version: N/A (library deployment)
- Project Version: Deploy version 16
- Custom Blocks / Impulse Configuration:
- Input: Image 96×96, Grayscale, Resize mode: Fit shortest
- DSP Block: Image (Grayscale)
- Learning Block: Transfer Learning, MobileNetV2 96×96 0.35
- Output: 6 classes (1, 2, 3, background, good, heart)
- Inferencing engine: EON Compiler (INT8 quantized, EI_CLASSIFIER_COMPILED = 1)
- EI_CLASSIFIER_TFLITE_INPUT_DATATYPE = INT8
- EI_CLASSIFIER_LOAD_IMAGE_SCALING = 0
- EI_CLASSIFIER_TFLITE_LARGEST_ARENA_SIZE = 256326
Logs/Attachments:
Relevant code — capture and inference:
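bool captureImageForInference(uint32_t img_width, uint32_t img_height, uint8_t *out_buf) {
    camera_fb_t *fb = esp_camera_fb_get();
    if (fb == NULL) {
        return false;  // frame capture failed
    }

    // Full-frame RGB888 scratch buffer (320 * 240 * 3 = 230,400 bytes) in PSRAM
    uint8_t *tmp_buf = (uint8_t *)heap_caps_malloc(
        320 * 240 * 3, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
    if (tmp_buf == NULL) {
        esp_camera_fb_return(fb);
        return false;  // PSRAM allocation failed
    }

    // Decode the JPEG frame, then hand the frame buffer back to the driver
    bool converted = fmt2rgb888(fb->buf, fb->len, PIXFORMAT_JPEG, tmp_buf);
    esp_camera_fb_return(fb);
    if (!converted) {
        free(tmp_buf);
        return false;  // JPEG decode failed
    }

    // Crop and resize 320x240 -> img_width x img_height into out_buf
    ei::image::processing::crop_and_interpolate_rgb888(
        tmp_buf, 320, 240, out_buf, img_width, img_height);
    free(tmp_buf);
    return true;
}

// Signal callback: reads RGB888 pixels from snapshot_buf and writes one
// grayscale float (0..255) per pixel
static int getSignalData(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        size_t pixel_ix = (offset + i) * 3;
        uint8_t r = snapshot_buf[pixel_ix];
        uint8_t g = snapshot_buf[pixel_ix + 1];
        uint8_t b = snapshot_buf[pixel_ix + 2];
        out_ptr[i] = 0.299f * r + 0.587f * g + 0.114f * b;
    }
    return 0;
}

signal_t signal;
signal.total_length = 96 * 96;  // EI_CLASSIFIER_RAW_SAMPLE_COUNT = 9216
signal.get_data = &getSignalData;
run_classifier(&signal, &result, false);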
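For comparison with the callback above: the ESP32 camera example shipped with the Edge Impulse Arduino library does not appear to pass grayscale floats. It seems to pack each pixel into one float as (r << 16) + (g << 8) + b and leave the grayscale conversion to the image DSP block, even for grayscale impulses. Under that assumption, a bare value such as 167.0 would be decoded as r=0, g=0, b=167, a near-black pixel, which could explain a constant prediction. The sketch below is portable C++ with hypothetical names (pack_rgb888, fill_packed_pixels) and takes the buffer as a parameter instead of using the global snapshot_buf. The channel order in the buffer is assumed to be R, G, B and should be checked, since fmt2rgb888 has been reported to produce B, G, R.

```cpp
#include <stddef.h>
#include <stdint.h>

// Pack one pixel as 0xRRGGBB and store it in a float.
// Values stay below 2^24, so the float represents them exactly.
inline float pack_rgb888(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<float>((static_cast<uint32_t>(r) << 16) |
                              (static_cast<uint32_t>(g) << 8) |
                              static_cast<uint32_t>(b));
}

// Fill out_ptr[0..length) with packed pixels starting at pixel `offset`.
// rgb_buf holds pixel_count pixels, 3 bytes each (assumed order: R, G, B).
// Returns 0 on success, -1 if the request runs past the end of the buffer.
int fill_packed_pixels(const uint8_t *rgb_buf, size_t pixel_count,
                       size_t offset, size_t length, float *out_ptr) {
    if (offset > pixel_count || length > pixel_count - offset) {
        return -1;
    }
    for (size_t i = 0; i < length; i++) {
        const uint8_t *px = rgb_buf + (offset + i) * 3;
        out_ptr[i] = pack_rgb888(px[0], px[1], px[2]);
    }
    return 0;
}
```

In the firmware, getSignalData could delegate to a helper like this with snapshot_buf and 96 * 96 as the first two arguments. The bounds check also guards against reading past the 96×96×3 buffer if the classifier ever requests an unexpected range.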
Additional Information:
- The model works correctly when tested directly on Edge Impulse’s platform using a smartphone camera — all 6 classes are recognized properly.
- The live camera stream on the web UI updates correctly, confirming the camera hardware and capture pipeline are functional.
- We suspect one of the following may be the root cause: (1) fmt2rgb888 writing beyond the allocated PSRAM buffer boundary, (2) incorrect data format or value range expected by the EON-compiled INT8 model in getSignalData, or (3) PSRAM cache coherency issues when reading snapshot_buf inside the callback.
- We are considering switching from PIXFORMAT_JPEG to PIXFORMAT_RGB888 to eliminate fmt2rgb888 entirely. Would this approach be recommended?