Image classification results vary drastically in deployment

Question/Issue:

Whenever an image is captured and passed on for inference, the results on the edge device differ drastically from the results of running the same image through live classification on the web.

My deployment target is the ESP32-CAMERA (Ai-Thinker), which is running FreeRTOS (ESP-IDF).

I’ve followed this example from EI: GitHub - edgeimpulse/firmware-espressif-esp32: Edge Impulse firmware for the Espressif ESP-EYE(ESP32) Development board
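For reference, that firmware feeds the captured frame to the classifier through a callback that packs each RGB888 pixel into a single feature value, roughly like this (a minimal sketch following the EI example; `snapshot_buf` is assumed to hold the frame already resized to the model's input dimensions):

```cpp
#include <stddef.h>
#include <stdint.h>

// RGB888 frame, resized/cropped to the model's input width x height.
static uint8_t *snapshot_buf;

// Signal callback used by run_classifier(): converts interleaved RGB888
// bytes into the packed 0x00RRGGBB float features that Edge Impulse
// image models expect.
static int ei_camera_get_data(size_t offset, size_t length, float *out_ptr) {
    size_t pixel_ix = offset * 3; // 3 bytes per RGB888 pixel
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (snapshot_buf[pixel_ix + 0] << 16) +
                     (snapshot_buf[pixel_ix + 1] << 8) +
                      snapshot_buf[pixel_ix + 2];
        pixel_ix += 3;
    }
    return 0;
}
```

My understanding is that if the bytes fed in here (after any RGB565-to-RGB888 conversion and resizing) don't match the saved image pixel for pixel, the device and web results would naturally diverge.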

I made an additional modification to store the captured image to the SD card so that I can upload it to the web for inference.
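The modification is essentially this (a minimal sketch using the esp32-camera driver; it assumes the SD card is already mounted at /sdcard and the camera is configured for JPEG output):

```cpp
#include <stdio.h>
#include "esp_camera.h"

// Capture one frame and write the raw JPEG bytes to the SD card.
static bool save_frame_to_sdcard(const char *path) {
    camera_fb_t *fb = esp_camera_fb_get();   // grab a frame buffer
    if (!fb) {
        return false;
    }
    FILE *f = fopen(path, "wb");             // e.g. "/sdcard/capture.jpg"
    if (!f) {
        esp_camera_fb_return(fb);
        return false;
    }
    fwrite(fb->buf, 1, fb->len, f);          // fb->buf holds the JPEG data
    fclose(f);
    esp_camera_fb_return(fb);                // return the buffer to the driver
    return true;
}
```

Note that the frame saved here is a separate capture, so it may not be byte-identical to the frame the classifier actually saw.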

The results vary drastically for every single image.

What am I missing?

Image example: captured image 2502_35

Predicted Result on the device:
Predictions (DSP: 10 ms., Classification: 152 ms., Anomaly: 0 ms.):
AsianElephant: 0.02734
Human: 0.67188
Random: 0.29688

The live classification result for this image is in a follow-up reply below.

Note: if more images are required, I'm happy to supply them.

Project ID: 134604

Really looking forward to getting this clarified.

Thanks in advance

-Abu

Here is the live classification result from the web:

ESP32-CAMERA classified image
pd_ZA_R01_1_27-Feb-2023_09;59;21_0.00,1.00,0.00

Result
AsianElephant: 0.00
Human: 1.00
Random: 0.00


Above is the result for the same image on web live classification.

Some more images from the classification runs:

Device predicted result
AsianElephant: 0.00
Human: 0.55
Random: 0.45

Hello @absoluteabutaj_proto,

Could you try running the standalone example: On your Espressif ESP-EYE (ESP32) development board - Edge Impulse Documentation (with both the float32 and the quantized model)?
The results of the float32 model should match the live classification.
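For reference, the standalone test looks roughly like this: you copy the raw features of one image from the live classification page in Studio into a buffer and run the classifier on that buffer directly, which takes the camera out of the loop entirely (a minimal sketch based on the standalone inferencing example):

```cpp
#include <string.h>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Paste the raw features of one image, copied from Studio, into this array.
static const float features[] = {
    /* raw features from the live classification page go here */
};

// Signal callback: copy a slice of the static feature buffer.
static int get_signal_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, features + offset, length * sizeof(float));
    return 0;
}

int run_standalone_test(void) {
    signal_t signal;
    signal.total_length = sizeof(features) / sizeof(features[0]);
    signal.get_data = &get_signal_data;

    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
    if (res != EI_IMPULSE_OK) {
        return (int)res;
    }
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        ei_printf("%s: %.5f\n", result.classification[i].label,
                  result.classification[i].value);
    }
    return 0;
}
```

Build it once with the float32 model and once with the quantized (int8) model, and compare both outputs against the live classification result for the same features.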

If the results are the same for the float32 model but not for the quantized model, you can read this section: https://docs.edgeimpulse.com/docs/tips-and-tricks/increasing-model-performance#large-difference-between-quantized-int8-and-unoptimized-float32-model-performances

If they are the same, the difference probably comes from your camera. Was the data used for training collected with the camera on your device?

Best,

Louis

Thanks for your response @louis

The images used to train the model are not only from the ESP32-CAMERA but also from other camera sources, like mobile phones and open image datasets.

Does it really matter to train the model with the target device's camera?

Also, a lot of the images are from trail cameras that are deployed with the devices.

"Does it really matter to train the model with the target device's camera?"

I haven't seen a response to this yet. It wouldn't make sense for the images to have to come from the device's camera only, as that would imply that the extensive collections of images available from other sources are worthless.

Hello @delfin4, @absoluteabutaj_proto

It should not matter much if you have a good dataset.
If you take a close look at the captured image above, you can notice a small line on the upper part of it. If the human samples in your dataset contain that small line, your NN will likely learn that feature.

The same applies to an image that is too bright or too dark, etc. These are all features a NN can learn.

Best,

Louis