FOMO input image resolution

Hi everyone,

I am currently developing a person detection model using FOMO. I read that the default input size for images is 96x96, but can it be any size as long as it is square? How can I change the input size, and how does it affect the predictions?

Thank you all in advance.

Hi,
I'm playing with a "bird detection" model to find out which picture size, color model, and so on to use.
I've only run a few tests so far, but here is what I can say:
The pictures you upload to create the model do not have to be square; they get cropped during the training process.
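To illustrate what such a crop does, here is a minimal sketch in Python. It assumes a centered crop to the largest square that fits; the actual crop or resize mode depends on your project settings, and `center_crop_box` is just a name I made up for this example.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: the crop is centered. A real pipeline may instead squash
    the image or fit the longest axis, depending on its settings.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# A 640x480 photo loses 80 pixels on the left and 80 on the right.
print(center_crop_box(640, 480))  # (80, 0, 560, 480)
```

With a centered crop, anything close to the left or right edge of a wide picture is lost, so it is worth checking that the objects you labeled are still inside the square.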
For better accuracy you can select a different square size. For my "select a bird" model I tried 96x96 and 128x128, both in grayscale and RGB.
Not surprisingly, 128x128 RGB gives the best result, with about 93% accuracy, but, also not surprisingly, the processing time on the target device (ESP32 eye) is reported as 2.6 seconds.
The birds in the pictures are usually small. My next try will be to train with pictures where the birds are bigger, while keeping the test pictures as they are now.
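To put the sizes tested above in perspective, here is a rough Python sketch that compares how much data the model sees per frame. It assumes FOMO's output grid is 8 times smaller than the input in each dimension (as far as I know that is the default, but please treat it as an assumption), and `input_stats` is a hypothetical helper, not part of any library.

```python
def input_stats(side, channels, reduction=8):
    """Return (number of input values, output grid side) for a square input.

    Assumption: the output grid is `reduction` times smaller than the
    input along each axis; 8 is assumed here to be the FOMO default.
    """
    values = side * side * channels
    grid = side // reduction
    return values, grid


for side in (96, 128):
    for channels, name in ((1, "grayscale"), (3, "RGB")):
        values, grid = input_stats(side, channels)
        print(f"{side}x{side} {name}: {values} input values, "
              f"{grid}x{grid} output grid")
```

Under that assumption each output cell covers about 8x8 input pixels, so a larger input gives the model more cells to place a small bird in, while the input grows from 9216 values (96x96 grayscale) to 49152 values (128x128 RGB). Processing time will not scale exactly with these numbers, but they give a feeling for why the bigger RGB input is slower on a small device.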