Custom block failing on int8 quantization

Question/Issue:
I've built a custom PyTorch impulse block starting from the repo here. I've gotten to the point where I can run it on Edge Impulse: training completes, the ONNX conversion completes, and the float32 TFLite conversion completes, all of which are confirmed to produce the same scores as the original PyTorch model. However, the int8 TFLite step crashes. The same pipeline works properly on my local machine, so I'm struggling to debug what happens differently in the Edge Impulse environment. It also seems to work properly with dense layers and only fails with conv2d layers.
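
If it helps narrow this down, the XNNPACK delegate can be taken out of the picture when loading the int8 model. This is a minimal sketch using the standard tf.lite Python API (the model path is the one from the logs below; everything else is my assumption). If allocate_tensors() succeeds with the default delegates disabled, that would point at XNNPACK's prepare step rather than the model structure itself:

```python
import tensorflow as tf

# Load the int8 model with TFLite's default delegates (including XNNPACK)
# disabled, to check whether the crash is specific to the XNNPACK delegate.
interpreter = tf.lite.Interpreter(
    model_path="model_quantized_int8_io.tflite",
    experimental_op_resolver_type=(
        tf.lite.experimental.OpResolverType.BUILTIN_WITHOUT_DEFAULT_DELEGATES
    ),
)
# This is the call that fails when the XNNPACK delegate is enabled.
interpreter.allocate_tensors()
```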

Project ID:
823287

Context/Use case:
Described above with details below. Running training on the Dendritic NN Block with default settings.

Steps Taken:

  1. Significant local debugging, but I've been unable to find the cause of the errors below (a minimal sketch of the conversion path I'm exercising follows this list).
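
For reference, a minimal sketch of the int8 conversion path being exercised. This is an assumed setup following the standard TFLiteConverter API; the real train.py uses its own saved-model path, input shape, and calibration loader:

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Calibration samples for quantization. The (1, 96, 96, 1) shape is a
    # placeholder; substitute the block's real input shape and real data.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_quantized_int8_io.tflite", "wb") as f:
    f.write(converter.convert())
```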

Expected Outcome:
int8 TFLite conversion/quantization executing properly.

Actual Outcome:
int8 TFLite conversion crashes.

Reproducibility:

  • [ ] Always
  • [x] Sometimes: it happens most of the time. In the rare cases where 0 dendrites get added, the crash does not occur.
  • [ ] Rarely

Environment:

Logs/Attachments:
Testing int8 TFLite model (model_quantized_int8_io.tflite)…

✗ TENSOR ALLOCATION FAILED for /home/model_quantized_int8_io.tflite
Error: failed to create XNNPACK runtimeNode number 14 (TfLiteXNNPackDelegate) failed to prepare.

✗ INT8 QUANTIZATION OR TESTING FAILED
Error: failed to create XNNPACK runtimeNode number 14 (TfLiteXNNPackDelegate) failed to prepare.

Traceback (most recent call last):
  File "/app/train.py", line 1274, in main
    full_test_acc_int8 = test_tflite(tflite_int8_path, full_test_loader, "INT8 TFLite")
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/train.py", line 1138, in test_tflite
    interpreter.allocate_tensors()
  File "/opt/conda/lib/python3.11/site-packages/tensorflow/lite/python/interpreter.py", line 554, in allocate_tensors
    return self._interpreter.AllocateTensors()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: failed to create XNNPACK runtimeNode number 14 (TfLiteXNNPackDelegate) failed to prepare.
Tensor 22 (model/tf.math.multiply_2/Mul): scale=[7.84313680668447e-09], zero_point=[0]
Tensor 23 (model/tf.linalg.matmul_1/MatMul;model/tf.operators.add_1/AddV2): scale=[0.08592402189970016], zero_point=[-3]
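
Note the scale on Tensor 22 (the Mul output) is ~7.8e-09, i.e. the calibration apparently saw an essentially constant-zero tensor at that point. A degenerate scale like that may be what XNNPACK's prepare step is rejecting, and it would be consistent with the crash disappearing when 0 dendrites get added. For reference, the per-tensor quantization parameters can be dumped with something like the following (standard tf.lite API; the tensor names above are from the converted model):

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model_quantized_int8_io.tflite")
# Quantization parameters are static model metadata, so no allocation
# (and therefore no XNNPACK delegate) is needed to read them.
for det in interpreter.get_tensor_details():
    q = det["quantization_parameters"]
    print(f"Tensor {det['index']} ({det['name']}): "
          f"scale={q['scales']}, zero_point={q['zero_points']}")
```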

Additional Information:
Thank you!