Resolution of training data vs runtime data

Question/Issue:
I have trained a model using our 1200x1200 images, and it has just about reasonable precision and recall. It seems to work when tested on the website. However, when deployed to an ESP32S3 taking pictures at 240x240, it fails to detect anything. What should I change? Is this because the training data is a different resolution from the runtime data?

Project ID:
buzzcopper1

Context/Use case:

Steps Taken:
Image size was set to 240 x 240 in the Image data block of the Impulse.
All training images (1200x1200) were loaded in from a Google Cloud bucket.

Expected Outcome:
When presented with an image containing an Asian Hornet, I would expect a detection.

Actual Outcome:
All prediction percentages are 0.

Reproducibility:

  • [ ] Always

Environment:
ESP32S3

Hello @buzzcopper

Do you think you could get an example of the images taken by the ESP32, so you can compare them with the ones you used to train the model?

Sometimes it’s not about the resizing but about the lighting or other conditions.

Please share some examples so we can help you further.

All the images used for training were taken by ESP32 cameras (OV3660) at 1200x1200 in exactly the situation where the inferencing will need to run, so I don’t think that is likely to be the problem. It’s not that inferencing fails occasionally; it never works on my device. The model works well enough in the browser, so I think the model itself is OK.

Initially I tried to run inference on images taken at 240x240, but that didn’t work. Then I changed the parameters on the Image processing block to receive 96x96 images, built and downloaded a new model, and supplied camera images after resizing them with ei::image::processing::crop_and_interpolate_rgb888(), roughly as shown below. That didn’t work either.
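For reference, the resize step in my code looks roughly like this. It is a simplified sketch based on the esp32_camera example: the 240x240 capture size and the snapshot_buf name are from my sketch, and I’m assuming the SDK function takes source buffer/size followed by destination buffer/size.

```cpp
// Simplified sketch of the resize step (based on the esp32_camera example).
// snapshot_buf holds the 240x240 RGB888 frame from the camera; the model
// input size comes from the generated model_metadata.h.
if ((EI_CLASSIFIER_INPUT_WIDTH != 240) || (EI_CLASSIFIER_INPUT_HEIGHT != 240)) {
    ei::image::processing::crop_and_interpolate_rgb888(
        snapshot_buf,                  // source: 240x240 RGB888 capture
        240, 240,                      // source width and height
        snapshot_buf,                  // destination (resized in place)
        EI_CLASSIFIER_INPUT_WIDTH,     // e.g. 96
        EI_CLASSIFIER_INPUT_HEIGHT);   // e.g. 96
}
```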

I have taken one of the training data images, resized it to 96x96, and stored it in snapshot_buf. That didn’t work.
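In case it matters, snapshot_buf feeds the classifier the same way as in the example sketch: a get_data callback packs the RGB888 bytes into the floats the model expects. A sketch of that pattern, with names following the example:

```cpp
// How snapshot_buf reaches run_classifier (following the example sketch).
// snapshot_buf is a global uint8_t* RGB888 buffer of the model's input size.
static int ei_camera_get_data(size_t offset, size_t length, float *out_ptr) {
    size_t pixel_ix = offset * 3;        // offset is in pixels, 3 bytes per pixel
    for (size_t i = 0; i < length; i++) {
        // pack R, G, B into a single float as 0x00RRGGBB
        out_ptr[i] = (snapshot_buf[pixel_ix + 0] << 16) +
                     (snapshot_buf[pixel_ix + 1] << 8) +
                      snapshot_buf[pixel_ix + 2];
        pixel_ix += 3;
    }
    return 0;
}

// ... then in the main loop:
ei::signal_t signal;
signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
signal.get_data = &ei_camera_get_data;

ei_impulse_result_t result = { 0 };
EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false /* debug */);
```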

I have also tried simply passing in the hexadecimal Raw Features list copied from one of the training data images. I assumed each hexadecimal Feature value is an RGB triplet, which I then used to fill snapshot_buf[] instead of the camera data. However, that doesn’t work either.
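Concretely, I unpacked each hex Feature value into its R, G and B bytes roughly like this (a sketch; features[] is just the list pasted from Studio, and each value is assumed to be one pixel packed as 0x00RRGGBB):

```cpp
// Sketch of filling snapshot_buf from the Raw Features list.
static const uint32_t features[] = { /* pasted from Studio */ };

for (size_t px = 0; px < EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT; px++) {
    snapshot_buf[px * 3 + 0] = (features[px] >> 16) & 0xFF;  // R
    snapshot_buf[px * 3 + 1] = (features[px] >> 8)  & 0xFF;  // G
    snapshot_buf[px * 3 + 2] =  features[px]        & 0xFF;  // B
}
```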

I feel sure there is a simple thing I’ve got wrong.

As I’m using an ESP32S3, I think 240x240 images should work, and the difference between my training data resolution and the resolution of my inference targets is probably not the problem now.

I have found the problem.

The Arduino library example sketch esp32/esp32_camera uses EI_CLASSIFIER_OBJECT_DETECTION to control whether bounding boxes are displayed. I had set it to 0 at some point to suppress the bounding-box printf chatter, unaware that this effectively turns off object detection entirely.
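For anyone who trips over the same thing, the relevant part of the example sketch is roughly the pattern below. EI_CLASSIFIER_OBJECT_DETECTION comes from the generated model_metadata.h, so it shouldn’t be redefined in the sketch just to silence the printfs.

```cpp
// Roughly how the example sketch prints results. With the macro forced to 0,
// execution falls through to the classification branch, so the bounding boxes
// from the object detection model are never read and every value shows as 0.
#if EI_CLASSIFIER_OBJECT_DETECTION == 1
    for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
        auto bb = result.bounding_boxes[ix];
        if (bb.value == 0) continue;   // skip empty slots
        ei_printf("%s (%f) [x: %u, y: %u, w: %u, h: %u]\n",
                  bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
    }
#else
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        ei_printf("%s: %.5f\n", result.classification[ix].label,
                  result.classification[ix].value);
    }
#endif
```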


Thanks for sharing the issue @buzzcopper

Glad that the project is working properly now!

Would you mind sharing more about the project? I really love this type of project! 🙂

Sure. There is an outline of the project at https://buzzcopper.org

The problem we are addressing is the attempted eradication of the Asian Hornet (Vespa velutina) from the UK, and potentially Europe too. Maybe the US as well, if all goes well.

We are trying to produce large numbers of really cheap devices based around the ESP32S3, so we can supply individuals who want to help out by putting one in their garden, and also hang them up in remote woodland and similar places where Asian Hornets may already have established a foothold.

I started off two years ago using Edge Impulse to build a sound-recognition AI model. This worked reasonably well on the bench, but there were too many false positives when testing outside. I dropped the sound recognition, as only images will trigger the UK authorities to act.

For a while I tried capturing images, uploading them to Google Cloud Storage and getting them classified by Google Vision, but that was neither cost-effective nor accurate. So now I am building a new image model, using Edge Impulse once again, to allow object detection on the device.

This is a not-for-profit project.


This is a fantastic project @buzzcopper

Looking forward to reading the instructions so I can build one myself and test it at my place in Barcelona!

I need to get the AI model working properly before I can open it up to wider usage, as it currently costs me £0.01 per image screened through Google Vision.
If you can help with that, we can send you a kit of parts and instructions to make one.

I understand there are Asian Hornets in Barcelona, so you’re in a good place to help with field testing.

Regards
Jon

Thanks @buzzcopper, I have a question! Why do you use Google Vision?

And sadly there are many Asian Hornets in Barcelona 🙁

I used Google Vision initially because it seemed to work quite well at identifying Asian Hornets in the pictures I could find on the Web. But those pictures were all part of its training data, so of course it would. I couldn’t train my own model without training data, and I had none.

So the firmware we have been field testing has used simple machine-vision processing on the device to spot new insect-shaped blobs, and then Google Vision to try to identify them (usually wrongly), while saving all images to a Google Cloud Storage bucket. We now have hundreds of training data images, so we can build a model and cut out Google Vision.