Mismatch between estimated runtime and actual runtime

Hi Edge Impulse Team,

I have built a small example (project ID 74857) and the estimated runtime is less than 100 ms. However, when I build it with the C++ library and the SDK, the runtime is 5.3 seconds. I am not sure what I am doing wrong or where the bottleneck is.

Regards

Lukas

Hi Lukas,

Which device are you deploying to?
The estimated latency depends on the selected target; you can change it in the Dashboard of your project:

(screenshot of the target selection in the project Dashboard)

Aurelien

As a first step, I am using my desktop CPU (an Intel i7).

I get 1.8 s on my MacBook, but the C++ library is only optimized for MCU targets.

If you want full acceleration on your CPU, the best option is to use our Linux SDK. With it, the same model runs in less than 100 ms on my machine.
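
For reference, here is a minimal sketch of a custom app against that repo, assuming the model-parameters and edge-impulse-sdk folders from your exported C++ library are in place; the feature buffer is a placeholder you would fill with the raw features of one window (for example copied from Live classification):

#include <cstdio>
#include <cstring>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Placeholder: paste the raw features of one window here
static float features[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE] = { 0 };

// Callback that hands slices of the feature buffer to the classifier
static int get_feature_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, features + offset, length * sizeof(float));
    return 0;
}

int main() {
    signal_t signal;
    signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
    signal.get_data = &get_feature_data;

    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false /* debug */);
    if (err != EI_IMPULSE_OK) {
        printf("run_classifier failed (%d)\n", err);
        return 1;
    }
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        printf("%s: %.5f\n", result.classification[i].label, result.classification[i].value);
    }
    return 0;
}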

Aurelien

I do use the Linux SDK (https://github.com/edgeimpulse/example-standalone-inferencing-linux) combined with the autogenerated C++ library, and I still get 5 seconds. Did you have to adapt anything in the project?

@Lukas are you compiling with the right flags? Your project had the EON Compiler enabled on the Deployment screen, which does not work with the Linux SDK, so I suspect you might not be building the right model.

When compiling with these flags:

APP_CUSTOM=1 TARGET_MAC_X86_64=1 USE_FULL_TFLITE=1 make -j

This gives me 147 ms on my MacBook Pro - still a bit off from the estimate, but much faster.
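
If you want to see where those milliseconds go, the result struct from run_classifier also carries a per-stage timing breakdown. Assuming a result variable like in the sketch above, something along these lines prints it:

// Milliseconds spent in the DSP block, the neural network, and anomaly detection
printf("Timing: DSP %d ms, classification %d ms, anomaly %d ms\n",
       result.timing.dsp, result.timing.classification, result.timing.anomaly);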

Note: use the float32 model! int8 performance is horrendous on x86.


@janjongboom yes, that resolved the issue. May I ask a follow-up question: why is the Linux SDK not compatible with the EON Compiler? Is support planned for the future?

@Lukas We go with the inferencing engine that makes the most sense on the hardware. For embedded systems that's the EON Compiler with TensorFlow Lite Micro kernels, on Linux (without an accelerator) it's TensorFlow Lite with XNNPACK, on a Jetson Nano it's TensorRT, and so on. So the EON Compiler is simply not needed on Linux systems: TFLite with XNNPACK already gives very good performance, and the interpreter overhead is not as significant as it is on embedded systems.
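
If you want to double-check which engine a given build ended up with, the generated metadata exposes it as a compile-time define. The macro names below are my recollection of what model_metadata.h ships, so treat them as assumptions and verify them against your exported library:

#include "model-parameters/model_metadata.h"

// Assumed macro names - confirm them in model_metadata.h of your exported library
#if EI_CLASSIFIER_INFERENCING_ENGINE == EI_CLASSIFIER_TFLITE_FULL
#pragma message("Building against full TensorFlow Lite (XNNPACK on x86)")
#elif EI_CLASSIFIER_INFERENCING_ENGINE == EI_CLASSIFIER_TFLITE
#pragma message("Building against TensorFlow Lite Micro kernels")
#endif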