Hi
I’d like to know the differences between the float32 and int8 versions of a model that can be deployed. Besides the four metrics (Latency / RAM / Flash / Accuracy) displayed on the deployment interface, are there any other differences that matter for edge devices?
For example, when deploying with float32, will a Cortex-M4F core be more efficient than a Cortex-M4 core?
Also, does running the float32 version of a model on an edge device involve the FPU?
Will the float32 version be less efficient on devices without an FPU?
I only found the following description in the documentation:
Float32 vs int8 models
You can choose to test your model using either the float32 or int8 quantized version. The float32 version offers higher precision but may use more resources, while the int8 quantized version is optimized for memory and computational efficiency, making it suitable for edge devices with limited resources.
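For context, here is roughly how I understand the two versions are typically produced (a minimal sketch using the standard TensorFlow Lite post-training quantization flow; the model and data file names below are just placeholders, and the actual tooling may differ):

```python
import numpy as np
import tensorflow as tf

# Hypothetical model and calibration data, just for illustration.
model = tf.keras.models.load_model("my_model.h5")
calibration_data = np.load("calibration_samples.npy").astype(np.float32)

def representative_dataset():
    # Yield a handful of real input samples so the converter can
    # estimate activation ranges for int8 quantization.
    for sample in calibration_data[:100]:
        yield [np.expand_dims(sample, axis=0)]

# Plain float32 conversion: weights and activations stay floating point,
# so on-device inference runs float math (hardware FPU or soft-float).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
float32_tflite = converter.convert()

# Full int8 quantization: weights and activations become 8-bit integers,
# so inference runs integer math and does not rely on an FPU.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
int8_tflite = converter.convert()

print(f"float32 model: {len(float32_tflite)} bytes")
print(f"int8 model:    {len(int8_tflite)} bytes")
```

If that’s what happens under the hood, it would explain the memory/compute difference, but it still doesn’t answer my questions about the FPU and Cortex-M4 vs M4F behavior.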
If you have any ideas, please let me know. Thanks!