Question/Issue:
Is there something I’m missing about the quantization option for BYOM?
Project ID:
721531
Context/Use case:
Applying quantization to BYOM
Summary:
I’ve successfully loaded my model (BYOM) in .tflite format and generated the C++ library for use on my MCU. I’ve also confirmed that inference works by evaluating the results.
Now I’ve tried to enable quantization. I loaded the ONNX model and provided a representative dataset, but I’m getting these warnings/errors:
Is it possible that quantization produces such a memory-heavy model, or am I doing something wrong?
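For reference, outside of Studio the conversion I expect is the standard TensorFlow Lite full-integer flow. This is a minimal sketch, not my actual pipeline; the SavedModel path and the (1, 96, 96, 1) input shape are placeholders for my real model and dataset:

```python
import numpy as np
import tensorflow as tf

# Full-integer (int8) post-training quantization of a SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset():
    # Placeholder: yield ~100 real samples matching the model's input shape/dtype.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Quantized model: {len(tflite_model) / 1024:.1f} KiB")
```

With this flow, an int8 model is normally about 4x smaller than the float32 one, which is why the size warning surprised me.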
Steps to Reproduce:
- Load the BYOM in SavedModel format
- Check the quantization optimization option (see the profiling sketch below)
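Since I’m not using the CLI, I also tried estimating the on-device RAM/ROM before deploying. This is a sketch using the Edge Impulse Python SDK’s ei.model.profile; the device identifier string is an assumption on my part, so I list the valid targets first:

```python
import edgeimpulse as ei

ei.API_KEY = "ei_..."  # placeholder: project API key from the dashboard

# Assumption: the exact ESP32-S3 target string; list_profile_devices() shows valid names.
print(ei.model.list_profile_devices())

result = ei.model.profile(
    model="model.onnx",          # the BYOM file uploaded to the project
    device="espressif-esp-eye",  # assumed ESP32-class target; replace with the listed name
)
print(result.summary())
```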
Expected Results:
I expected to get a quantized model that fits on the MCU (ESP32-S3-N8R8).
Actual Results:
Warnings/errors saying that the quantized model won’t fit on the MCU.
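To check whether quantization actually applied (a float fallback would explain an unexpectedly large model), I inspect the tensor dtypes of the converted file. A minimal sketch, assuming the model_int8.tflite produced by the conversion sketch above:

```python
from collections import Counter
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()

# A fully quantized model should be dominated by int8 tensors (plus int32 biases);
# many float32 tensors would mean some ops fell back to float.
dtypes = Counter(t["dtype"].__name__ for t in interpreter.get_tensor_details())
print(dtypes)
```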
Reproducibility:
- [x] Always
- [ ] Sometimes
- [ ] Rarely
Environment:
- Platform: ESP32-S3-N8R8
- Build Environment Details: ESP-IDF
- OS Version: Windows 11
- Edge Impulse Version (Firmware): 1.72.4
- Edge Impulse CLI Version: Not used
- Project Version: 1.0.0
- Custom Blocks / Impulse Configuration: None