Object Detection Impulse Image Size Consideration

Hello everyone,

I am fairly new to ML and Edge Impulse, and I am thankful for the foundation they provide here for integrating ML with the world of microcontrollers.

I am struggling with the image size settings under “Create Impulse”. I am training an object detection model on various pictures, to be deployed on a Nicla Vision. When I use the default 96x96, I get an F1 score in the high 90% range with the images and labels I provide, but that score drops significantly if I increase the size to, say, 160x160 or 320x320. I was expecting the score to increase with a larger image size.

I guess I am trying to understand the concept behind these sizes a bit better as well. I am using my phone as the camera for gathering the pictures. From what I see those pictures come in at 512x512 size. I then go ahead and perform the labeling.

Is it the labeling boxes that the impulse resizes to 96x96 or the whole image (512x512)?

If it is the latter case and, say for instance, I had initially provided under data acquisition a rectangular image with a labeling box in the upper right corner; will the impulse then crop that image to a square (depending on the resize mode selected), while also losing the labeled item in the upper corner of that image?

Maybe there is a detailed explanation of this somewhere in the documentation that I have missed. I would appreciate any information or direction.

Thank you very much.

yes, this is the size that the whole image, and the corresponding bounding box labels, are rescaled to; so if you capture at 512x512, that’s the resolution you do the labelling at, and then everything is scaled down to 96x96 ( or 160x160 etc )
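To make the rescaling concrete, here is a minimal sketch of how a bounding box follows its image when the image is resized, and how a centered square crop can clip or drop a box near the edge of a rectangular image. The function names and the (x, y, width, height) box layout are assumptions for illustration only; this is not Edge Impulse's actual code, and its resize modes may behave differently.

```python
def scale_box(box, src_size, dst_size):
    """Scale an (x, y, w, h) box from a src (width, height) image to a dst one."""
    x, y, w, h = box
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return (x * sx, y * sy, w * sx, h * sy)


def center_crop_box(box, src_size):
    """Clip an (x, y, w, h) box to the centered square crop of the image.

    Returns the box in crop coordinates, or None if nothing of it remains.
    Assumes a centered crop; a real tool may crop differently.
    """
    x, y, w, h = box
    src_w, src_h = src_size
    side = min(src_w, src_h)
    off_x = (src_w - side) // 2
    off_y = (src_h - side) // 2
    x0 = max(x - off_x, 0)
    y0 = max(y - off_y, 0)
    x1 = min(x + w - off_x, side)
    y1 = min(y + h - off_y, side)
    if x1 <= x0 or y1 <= y0:
        return None  # the label falls entirely outside the crop
    return (x0, y0, x1 - x0, y1 - y0)


# A 128x128 label in the upper right corner of a 512x512 capture.
box_96 = scale_box((384, 0, 128, 128), (512, 512), (96, 96))

# The same idea on a hypothetical 640x480 image, cropped to a centered 480x480.
lost = center_crop_box((560, 0, 80, 80), (640, 480))
clipped = center_crop_box((500, 0, 100, 100), (640, 480))
```

With a square 512x512 capture the box simply shrinks along with the image. With a rectangular capture and a crop-style resize, a label in the far corner can be cut down or lost, which is worth checking against the resize mode you select.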

there’s a couple of reasons why upping the resolution results in a drop in performance. the main thing is that a higher resolution => effectively a more complex question. this can result in 1) requiring more training data and/or a bigger model as well as 2) the optimisation not running as well (e.g. not converging)
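As a rough illustration of how quickly the input grows with resolution, the sketch below counts the values the model sees per image at each size. It assumes 3 colour channels, and the input value count is only a crude proxy for how hard the problem is, not a measure of model capacity or of the training data required.

```python
def input_values(side, channels=3):
    """Number of input values for a square image (channels=3 is an assumption)."""
    return side * side * channels


baseline = input_values(96)
for side in (96, 160, 320):
    n = input_values(side)
    print(f"{side}x{side}: {n} values, {n / baseline:.2f}x the 96x96 input")
```

Going from 96x96 to 320x320 gives the model roughly eleven times as many input values per image, while the number of training images stays the same.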

it’s hard to know which though; we’re constantly adding things to try to handle ( and identify ) both of these scenarios.

for most people it’s simpler to increase the model size ( than to get more data ) so i’d recommend that as an experiment; if you increase to 160x160, use a larger model, and train for longer, does that help?

mat

Thank you, Mat.

Makes sense. I started varying the number of training images, as well as parameters such as the learning rate, and began achieving better results. It is still hit and miss, but I believe that is mostly because I am still learning. As I go further, I hope to get a better feel for how these various knobs affect the overall model behavior, and I will share my findings.

Thanks very much again,
Koray
