Quantized model is underperforming

I built out my classification model and the float32 version performs very well. However, the quantized model is only at about 93.8% accuracy and is misclassifying some cases that would be very bad for the end project. Ultimately I'm going to need the results to be available to a Python program in order to drive a robot.

I have the following devices that could run this: an OpenMV H7 R2 (not the Plus), a Raspberry Pi, and a Windows laptop. I'm open to deploying to any of those, as long as I can use the results in a Python script. Any change in model architecture is fine with me.

Project ID:

Context/Use case:
I apologize if these are dumb questions, but I will need to get data from the OpenMV, classify it, and then use that classification in a Python script. I'm open to using any of the machines listed, changing model parameters, etc.

Hello @ShasVre,

Quantization works by reducing the precision of the model's weights (and activations), so there will often be a small drop in accuracy. If the quantized model performs too poorly, here are some things that can help:

  • Add more data. Accuracy is especially likely to drop for classes that are a minority in the training set. If you are working with time series (such as accelerometer data), try reducing the window increase; this will generate more samples from the same recordings.

  • Add some regularization (for example, dropout layers). This forces the network to be more resilient to the kind of error introduced by quantization. You may have to train for a few more epochs to compensate for the regularization.

  • Increase the capacity of the network. Quantization error is worse for networks that are at their capacity limit, i.e. where every parameter carries important information. If you add a few more neurons or layers, you may find the issue is less severe.
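
To see where the quantization error actually comes from, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization, similar in spirit to what TensorFlow Lite does to weights (real TFLite conversion also quantizes activations and may use per-channel scales; the tensor size here is illustrative):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w is approximated as scale * q."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to float32 for comparison against the originals."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 32)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
err = np.max(np.abs(w - w_hat))
assert err <= scale / 2 + 1e-7
print(f"max abs error: {err:.6f}, step size: {scale:.6f}")
```

This is why a network where every parameter is "super important" suffers more: each weight is perturbed by up to half a step, and a model with no slack has no redundancy to absorb those perturbations.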