FOMO alpha 0.05

peterpis · September 3, 2023, 4:09pm

Hello,

I noticed that Edge Impulse has two MobileNetV2 architecture settings: one with weights for alpha 0.1 and one with weights for alpha 0.35. However, the documentation states that it is possible to get models with alpha 0.05 as well. The FOMO documentation says that the smallest version of FOMO (96x96 grayscale input, MobileNetV2 0.05 alpha) runs in <100KB RAM, but when I set the alpha to 0.05 in expert mode and train the model, I get exactly the same inference time, peak RAM usage and flash usage as with alpha 0.1. Did I set the alpha incorrectly? The weights I used are from the MobileNetV2 architecture:

WEIGHTS_PATH = './transfer-learning-weights/edgeimpulse/MobileNetV2.0_05.96x96.grayscale.bsize_64.lr_0_05.epoch_334.val_loss_4.53.hdf5'

Thank you in advance!

matkelcey · September 5, 2023, 11:01pm

Hi @peterpis

Great point! TL;DR: for FOMO, alpha=0.1 and alpha=0.05 give the same model. I’ll explain why…

The alpha parameter controls the channel depth across the MobileNet: the higher the alpha, the more channels each block has.

Additionally, we don’t use the entire MobileNet for FOMO; we cut it near the top so we’re only using part of it (see the cut point section of the docs). By default we cut at block_6_expand_relu.

If we compare the shapes of the feature maps at the start of the MobileNet, at this cut point, and at the end of the MobileNet (block_16_project_BN) for different alphas, we see the following…

alpha | input         | at block_6_expand_relu | at block_16_project_BN
------+---------------+------------------------+-----------------------
0.35  | (320, 320, 3) | (40, 40, 96)           | (10, 10, 112)
0.1   | (320, 320, 3) | (40, 40, 48)           | (10, 10, 32)
0.05  | (320, 320, 3) | (40, 40, 48)           | (10, 10, 16)

So even though the full MobileNet with alpha=0.05 is smaller, the network truncated at our cut point turns out to be the same for alpha=0.1 and alpha=0.05.
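
For anyone wondering where the identical channel counts come from, here is a rough sketch of the channel rounding. It assumes the Keras-style MobileNetV2 rule, where each block's output channels are the base count times alpha, rounded to a multiple of 8 with a minimum of 8, and the expand layer uses 6x its input channels. The helper names and the base counts (32 going into block 6, 320 at block 16) are my illustration of that rule, not the actual FOMO training code.

```python
def make_divisible(v, divisor=8, min_value=None):
    """Round v to a multiple of `divisor`, never going below `min_value`."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Avoid rounding down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


def block6_expand_channels(alpha, expansion=6):
    # Assumption: block 5 has a base width of 32 channels, and the
    # block 6 expand layer multiplies its input channels by 6.
    block5_out = make_divisible(int(32 * alpha))
    return expansion * block5_out


def block16_project_channels(alpha):
    # Assumption: block 16 has a base width of 320 channels.
    return make_divisible(int(320 * alpha))


for alpha in (0.35, 0.1, 0.05):
    print(alpha, block6_expand_channels(alpha), block16_project_channels(alpha))
```

With a base width of 32, both alpha=0.1 and alpha=0.05 hit the minimum of 8 channels, so everything up to the cut point ends up the same width. The difference only appears in the wider, later blocks that FOMO discards.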

Sorry for the confusion, we’ll update the docs to describe this and just reference 0.1
Mat

peterpis · September 13, 2023, 9:13pm

Hello,

Thanks for clarifying the architecture. However, I still don’t understand why I get a model that uses around 235 kB of RAM with MobileNetV2 and multiplier 0.1 on a Cortex-M4, while the documentation states less than 100 kB of RAM with 10 fps inference. Was some other technique used to reduce the model size?

pet