Latency/Inference speeds

Hello,
I was creating and running some models and then comparing inference speeds on different boards to make a selection.
I was wondering how accurate the provided latency estimates are.
How accurate are the preprocessing speed estimates?
Have they been compared with actual board results as well?

Thank you for your great platform.

Hi @Eldorado774,

We use simulation tools to calculate latency performance and memory usage while creating a model. In addition, we regularly run benchmark models on all of the devices to verify the estimates.

Note that these calculations only apply to the model and don’t take into account the latency caused by capturing the data or the memory needed to store and run the rest of the application. So they are not 100% accurate, but they are a good estimate and a good basis for comparing devices.
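
As a rough illustration of what the estimate does and does not cover, here is a minimal Python sketch. The class name `LatencyBudget`, its fields, and every number in the usage note are hypothetical placeholders, not platform values. It also assumes that preprocessing is reported as its own figure next to the model estimate and that the stages run one after another, which may not match your firmware.

```python
from dataclasses import dataclass


@dataclass
class LatencyBudget:
    """Per-inference timing in milliseconds. All fields are hypothetical."""

    capture_ms: float             # acquiring the sample window; not part of the estimate
    dsp_ms: float                 # preprocessing figure (assumed to be reported separately)
    inference_ms: float           # model-only latency estimate
    app_overhead_ms: float = 0.0  # rest of the application; not part of the estimate

    def estimated_ms(self) -> float:
        """What the estimate covers: preprocessing plus the model."""
        return self.dsp_ms + self.inference_ms

    def end_to_end_ms(self) -> float:
        """Worst case, assuming capture, processing and app code run sequentially."""
        return self.capture_ms + self.estimated_ms() + self.app_overhead_ms
```

For example, `LatencyBudget(capture_ms=1000.0, dsp_ms=20.0, inference_ms=5.0, app_overhead_ms=2.0)` has an estimated latency of 25 ms but a sequential end-to-end time of 1027 ms. The gap has to be measured on the actual board.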

Hello @Arjan ,
Thank you, and yes, I do understand that it's not an end-to-end inference speed, but it does provide a good benchmark between different devices as a reference.
Just one more question: for models like FOMO (from the tutorial example in the Edge Impulse docs), I find that the latency performance is shown for Cortex-A devices such as the RPi and Nvidia Jetson, but not for Cortex-M devices such as the Arduino Nicla.
Is there a plan to support latency predictions for bigger models such as FOMO on Cortex-M devices?

Most models from our FOMO benchmark model set don’t fit on Cortex-M devices, so we can’t get accurate results. We could, however, consider creating a smaller FOMO subset for Cortex-M devices.
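
To sketch what "fit" means here, the following Python snippet checks a model's memory needs against a device's RAM and flash while reserving room for the rest of the application. The function name `fits_on_device`, its parameters, and the numbers in the usage note are all hypothetical; this is an illustration, not how the platform decides which devices to benchmark.

```python
def fits_on_device(
    model_ram_kb: float,
    model_flash_kb: float,
    device_ram_kb: float,
    device_flash_kb: float,
    app_ram_kb: float = 0.0,
    app_flash_kb: float = 0.0,
) -> bool:
    """Return True if the model plus the application reserve fits in RAM and flash.

    All arguments are hypothetical figures in kilobytes supplied by the caller.
    """
    ram_ok = model_ram_kb + app_ram_kb <= device_ram_kb
    flash_ok = model_flash_kb + app_flash_kb <= device_flash_kb
    return ram_ok and flash_ok
```

With made-up numbers, `fits_on_device(300, 800, 256, 1024)` returns `False` because the model needs more RAM than the device has, while `fits_on_device(200, 800, 256, 1024, app_ram_kb=40)` returns `True`.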

I understand. FOMO does fit on the higher-end Cortex-M devices, such as those with M33, M4, and M7 cores, which is what I am interested in.
Thank you for your answers :slight_smile: