I’ve trained a small classifier model (project ID: 563549). For deployment, I want to run it on an ESP32-S3, so I exported the Arduino library and tried running the static_buffer.ino example through both the Arduino IDE and PlatformIO. I’ve been getting this error:
ERR: Failed to allocate persistent buffer of size 512, does not fit in tensor arena and reached EI_MAX_OVERFLOW_BUFFER_COUNT
Guru Meditation Error: Core 1 panic'ed (StoreProhibited). Exception was unhandled.
My ESP32 board has 8 MB of PSRAM, and I was wondering whether the internal SDK uses the PSRAM or not. If it doesn’t by default, how can I make use of it so that the model runs on my edge device?
Do you have it enabled for your project?
Also, what did you mean by “trying further and moving past that error, tried running the same code”? If you increased the arena size and are running into ERR: failed to allocate tensor arena, it may be that you don’t have PSRAM enabled for the project…
I think yes, the code is using PSRAM automatically, as I tried printing the heap and PSRAM consumption (a minimal check sketch is included after the config below). I made some changes here and there in the configuration and reduced the model input size from a 96x96 image to a 48x48 image. Still the same ‘Failed to run classifier’ error. Please find my project configuration (platformio.ini) below:
; PlatformIO Project Configuration File
;
; Build options: build flags, source filter
; Upload options: custom upload port, speed and extra flags
; Library options: dependencies, extra library storages
; Advanced options: extra scripting
;
; Please visit documentation for the other options and examples
; https://docs.platformio.org/page/projectconf.html
[env:esp32-s3-devkitc-1]
platform = espressif32
board = esp32-s3-devkitc-1
framework = arduino
board_build.mcu = esp32s3
; board_build.flash_mode = dio ; tried these as well
; board_build.memory_type = dio_opi ; tried these as well
board_build.flash_mode = qio
board_build.memory_type = qio_opi
board_build.partitions = min_spiffs.csv
board_build.f_flash = 80000000L
board_build.f_cpu = 240000000L
board_upload.maximum_ram_size = 8388608
board_upload.flash_size = 16MB
board_upload.maximum_size = 16777216
build_unflags = -std=gnu++11
build_flags =
-std=gnu++17
-DBOARD_HAS_PSRAM
-DARDUINO_ESP32S3_DEV
-DARDUINO_USB_CDC_ON_BOOT=1
lib_deps =
.\ei-bay-occupancy-arduino-1.0.6.zip
upload_speed = 115200
monitor_speed = 115200
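For reference, here is a minimal sketch along the lines of what I used to print the heap and PSRAM consumption (standard ESP32 Arduino core calls, nothing Edge Impulse-specific). If psramFound() returns false here, no build flag will let the tensor arena land in PSRAM:

#include <Arduino.h>

void setup() {
    Serial.begin(115200);
    delay(2000); // give the USB CDC port time to come up

    // All of these are stock ESP32 Arduino core APIs.
    Serial.printf("PSRAM found: %s\n", psramFound() ? "yes" : "no");
    Serial.printf("PSRAM total: %u bytes\n", ESP.getPsramSize());
    Serial.printf("PSRAM free:  %u bytes\n", ESP.getFreePsram());
    Serial.printf("Heap free:   %u bytes\n", ESP.getFreeHeap());
}

void loop() {}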
Should I reduce the input size even more and try again? Or are there any other changes I can make? Also, I have not increased the arena size yet, and I would really appreciate your help with increasing the default allocation for the tensor arena; that might solve the problem.
@AIWintermuteAI Here’s a weird thing I found out while experimenting:
The project, even with a 96x96 input size, worked on an ESP32-CAM board (AI Thinker) but is not working on other boards with the ESP32-S3 chip. Is it something related to the ESP32-S3?
No, 96x96 should run with no issues on S3.
Can you try manually increasing the model arena size? It can be found in model_variables.h. Try adding a few hundred KB; if you have PSRAM, this much should not be an issue.
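For illustration, the change is just adding headroom to the arena-size constant. The exact symbol names depend on the SDK version of the exported library, so treat the names below as placeholders rather than the real API:

// Hypothetical before/after in model-parameters/model_variables.h;
// look for whatever arena-size constant your export actually defines.

// before (value as generated by the export):
//   .arena_size = tflite_learn_XX_arena_size,

// after (a few hundred KB of headroom; fine when PSRAM backs the arena):
//   .arena_size = tflite_learn_XX_arena_size + (300 * 1024),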
You are correct; increasing it by a few KB made it run on the S3 too, though I wonder why this is happening. Now that it’s working, I don’t mind setting it a bit higher manually for the S3. Thanks for the help.
Now, one blunder that I’m noticing, which has been boggling my mind since yesterday: each time I run the example, even with different inputs, I get exactly the same result. I have a binary classifier, and almost every time I get the exact same impulse output, predicting one class with 98% confidence. In Studio, it looks like it is working just fine. For reference, I’m using a binary classifier (greyscale input), a quantized model built with the EON Compiler, and the static_buffer example.
Somewhere on the forum, I was reading about using the function run_classifier_image_quantized instead of run_classifier with a quantized model. Could this be the case here? I tried the former function but couldn’t make it work because of its first argument (the impulse), which I could not access from my main file.
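For reference, my sketch is essentially the stock static_buffer pattern, roughly like this trimmed version (the include name follows my exported library and may differ for other projects):

#include <bay-occupancy_inferencing.h> // header from the exported library; yours may differ

// The values copied from Studio go here.
static const float features[] = {
    0 // placeholder
};

// Callback that run_classifier() uses to pull slices of the input signal.
static int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, features + offset, length * sizeof(float));
    return 0;
}

void setup() {
    Serial.begin(115200);
}

void loop() {
    ei_impulse_result_t result = { 0 };

    signal_t features_signal;
    features_signal.total_length = sizeof(features) / sizeof(features[0]);
    features_signal.get_data = &raw_feature_get_data;

    // Run the full impulse (DSP + NN); the last argument toggles debug output.
    EI_IMPULSE_ERROR res = run_classifier(&features_signal, &result, false);
    if (res != EI_IMPULSE_OK) {
        ei_printf("ERR: Failed to run classifier (%d)\n", res);
        return;
    }

    // Per-label scores, to compare against the Studio results.
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        ei_printf("%s: %.5f\n", result.classification[ix].label,
                  result.classification[ix].value);
    }

    delay(1000);
}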
Project ID: 563549.
Any help/suggestions are much appreciated. Many thanks!
Static_buffer runs inference on static data, so you SHOULD be getting the same output. You need to run the camera example to get the image from the camera.
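For completeness, the idea in the camera examples is just to refill the feature buffer from a fresh frame on every loop before calling run_classifier. A rough sketch of that pattern; capture_frame_as_features is a placeholder for whatever capture/resize helper your example provides, not a real SDK function:

// Buffer refilled from the camera on every iteration.
static float frame_features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE];

static int frame_get_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, frame_features + offset, length * sizeof(float));
    return 0;
}

void loop() {
    // Placeholder: grab a frame, resize/convert it, and write the packed
    // pixel values into frame_features.
    if (!capture_frame_as_features(frame_features)) {
        return;
    }

    signal_t signal;
    signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
    signal.get_data = &frame_get_data;

    ei_impulse_result_t result = { 0 };
    run_classifier(&signal, &result, false);
}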
Re: increasing the arena size made it work.
So it is related to the fact that the ESP32-S3 uses optimized kernels, which have different arena size requirements due to the scratch buffers used in those kernels. We benchmark with regular kernels or CMSIS-NN kernels, so we don’t know the exact arena size needed. We do leave some wiggle room, but apparently in this case it was not enough.
Thanks for responding. Sorry for the confusion, but I meant that even when changing the static input inside the static_buffer.ino example, it gives the same result. It gets the input from the ‘features’ variable that holds the processed-feature flattened array, right? So even if I paste another array there and flash the firmware again, the response is always identical.
It gets the input from the ‘features’ variable that holds the processed-feature flattened array, right?
Yes, this is correct.
So even if I paste another array there and flash the firmware again, the response is always identical.
The important thing here is whether it matches the classification results from Studio. If it matches, the device inference works correctly and the issue is with the model. Do the results match?
Did you try the solution detailed above? Also, please use our docs for reference or for highlighting bug fixes if you can; we don’t have control over the SEEED ones and they may be out of sync →
Can you try manually increasing the model arena size? It can be found in model_variables.h. Try adding a few hundred KB; if you have PSRAM, this much should not be an issue.
@garvit185 ,
I tested your project on a T-Camera S3 and could not reproduce the results mismatch. After tweaking the arena size, the results from Studio match the results on the device (see the attached screenshot).
I found what I was doing wrong. In the input features, instead of copying the raw features, I was pasting in the processed-feature array. It happened because I got confused by the nomenclature: the input type was float and the raw features were hexadecimal, so I thought the input should be the processed features, since they were of type float.
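In case it helps anyone else, the difference looks roughly like this (pixel values made up for illustration):

// Raw features (what static_buffer expects): for images these are
// hex-packed pixel values from the "Raw features" box in Studio,
// pasted straight into the float array:
static const float features[] = {
    0x2d2d2d, 0x2e2e2e, 0x303030 // ... one packed value per pixel
};

// Processed features (what I had pasted by mistake): the DSP output
// floats, e.g. 0.1765, 0.1804, 0.1882, ...; same type (float), wrong data.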
Once I tried to do it correctly, the IDE showed me errors when I added the raw features.
But thanks to you, I could figure out from your screenshot what I was doing wrong. The other issue is also solved after increasing the tensor arena size on the ESP32-S3 chip.
Thanks guys so much for your support and your patience.
If it is of any help, I would suggest that pasting the processed features instead of the raw features should throw an error.