By default, edge-impulse-linux-runner uses the unoptimized (float32) version of your model.
If you want to use the quantized version instead, run edge-impulse-linux-runner --quantized.
Feel free to check the help:
$> edge-impulse-linux-runner --help
Usage: edge-impulse-linux-runner [options]
Edge Impulse Linux runner 1.2.5

  -V, --version        output the version number
  --model-file <file>  Specify model file, if not provided the model will be fetched from Edge Impulse
  --api-key <key>      API key to authenticate with Edge Impulse (overrides current credentials)
  --download <file>    Just download the model and store it on the file system
  --clean              Clear credentials
  --silent             Run in silent mode, don't prompt for credentials
  --quantized          Download int8 quantized neural networks, rather than the float32 neural networks. These might run faster on some architectures, but have reduced accuracy.
  --enable-camera      Always enable the camera. This flag needs to be used to get data from the microphone on some USB
  --dev                List development servers, alternatively you can use the EI_HOST environmental variable to specify the Edge Impulse instance.
  --verbose            Enable debug logs
  -h, --help           output usage information
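To compare the two model versions side by side, you can combine --quantized with --download. This is a minimal sketch based only on the flags in the help output above; the .eim filenames are placeholders you can change:

```shell
# Download the default float32 model to a file (does not run inference)
edge-impulse-linux-runner --download model-float32.eim

# Download the int8 quantized model instead
edge-impulse-linux-runner --quantized --download model-int8.eim

# Or run the quantized model directly; it is fetched from Edge Impulse
# if a model file is not provided
edge-impulse-linux-runner --quantized
```

You can then point --model-file at either downloaded .eim file to benchmark them on your board.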
I hope this helps you get better performance. (And I can see in your project that the quantized version also has better accuracy.)
Hi, and thanks for these tips! I think I managed to upload both models, but I did it by changing the options in Edge Impulse Studio – Deployment – Linux boards – Options. The default selection was 'Quantized'. Is this not the correct way to select which model to upload to the board?