Hi! I’m trying to run a training job for an Arduino Nano 33 BLE, but the job fails during the Akida profiling step with "RuntimeError: The maximum input event value is 15, got 255" (full log below). Any ideas what’s going on here?
Creating job... OK (ID: 6666537)
Scheduling job in cluster...
Container image pulled!
Job started
Splitting data into training and validation sets...
Splitting data into training and validation sets OK
Training model...
Training on 6464 inputs, validating on 1616 inputs
Epoch 1/30
202/202 - 3s - loss: 280.2765 - accuracy: 0.4245 - val_loss: 3.4252 - val_accuracy: 0.8583 - 3s/epoch - 13ms/step
Epoch 2/30
202/202 - 2s - loss: 2.0927 - accuracy: 0.8987 - val_loss: 1.1061 - val_accuracy: 0.9332 - 2s/epoch - 9ms/step
Epoch 3/30
202/202 - 2s - loss: 0.7579 - accuracy: 0.9435 - val_loss: 0.6163 - val_accuracy: 0.9505 - 2s/epoch - 10ms/step
Epoch 4/30
202/202 - 2s - loss: 0.5093 - accuracy: 0.9706 - val_loss: 0.4737 - val_accuracy: 0.9913 - 2s/epoch - 9ms/step
Epoch 5/30
202/202 - 2s - loss: 0.4165 - accuracy: 0.9955 - val_loss: 0.4006 - val_accuracy: 0.9944 - 2s/epoch - 8ms/step
Epoch 6/30
202/202 - 2s - loss: 0.3717 - accuracy: 0.9964 - val_loss: 0.3629 - val_accuracy: 0.9957 - 2s/epoch - 9ms/step
Epoch 7/30
202/202 - 2s - loss: 0.3449 - accuracy: 0.9972 - val_loss: 0.3304 - val_accuracy: 0.9963 - 2s/epoch - 9ms/step
Epoch 8/30
202/202 - 2s - loss: 0.3116 - accuracy: 0.9978 - val_loss: 0.3046 - val_accuracy: 0.9957 - 2s/epoch - 9ms/step
Epoch 9/30
202/202 - 2s - loss: 0.2834 - accuracy: 0.9980 - val_loss: 0.2694 - val_accuracy: 0.9963 - 2s/epoch - 9ms/step
Epoch 10/30
202/202 - 2s - loss: 0.2427 - accuracy: 0.9980 - val_loss: 0.2126 - val_accuracy: 0.9957 - 2s/epoch - 8ms/step
Epoch 11/30
202/202 - 2s - loss: 0.1857 - accuracy: 0.9980 - val_loss: 0.1690 - val_accuracy: 0.9950 - 2s/epoch - 8ms/step
Epoch 12/30
202/202 - 2s - loss: 0.1617 - accuracy: 0.9981 - val_loss: 0.1534 - val_accuracy: 0.9957 - 2s/epoch - 8ms/step
Epoch 13/30
202/202 - 2s - loss: 0.1508 - accuracy: 0.9983 - val_loss: 0.1473 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 14/30
202/202 - 2s - loss: 0.1446 - accuracy: 0.9985 - val_loss: 0.1442 - val_accuracy: 0.9981 - 2s/epoch - 8ms/step
Epoch 15/30
202/202 - 2s - loss: 0.1412 - accuracy: 0.9986 - val_loss: 0.1414 - val_accuracy: 0.9988 - 2s/epoch - 10ms/step
Epoch 16/30
202/202 - 2s - loss: 0.1384 - accuracy: 0.9991 - val_loss: 0.1404 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 17/30
202/202 - 2s - loss: 0.1361 - accuracy: 0.9992 - val_loss: 0.1371 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 18/30
202/202 - 2s - loss: 0.1336 - accuracy: 0.9994 - val_loss: 0.1362 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 19/30
202/202 - 2s - loss: 0.1324 - accuracy: 0.9991 - val_loss: 0.1339 - val_accuracy: 0.9975 - 2s/epoch - 9ms/step
Epoch 20/30
202/202 - 2s - loss: 0.1303 - accuracy: 0.9995 - val_loss: 0.1315 - val_accuracy: 0.9975 - 2s/epoch - 10ms/step
Epoch 21/30
202/202 - 2s - loss: 0.1280 - accuracy: 0.9994 - val_loss: 0.1304 - val_accuracy: 0.9981 - 2s/epoch - 8ms/step
Epoch 22/30
202/202 - 2s - loss: 0.1263 - accuracy: 0.9995 - val_loss: 0.1264 - val_accuracy: 0.9981 - 2s/epoch - 8ms/step
Epoch 23/30
202/202 - 2s - loss: 0.1238 - accuracy: 0.9995 - val_loss: 0.1240 - val_accuracy: 0.9981 - 2s/epoch - 8ms/step
Epoch 24/30
202/202 - 2s - loss: 0.1214 - accuracy: 0.9995 - val_loss: 0.1221 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 25/30
202/202 - 2s - loss: 0.1191 - accuracy: 0.9995 - val_loss: 0.1205 - val_accuracy: 0.9981 - 2s/epoch - 8ms/step
Epoch 26/30
202/202 - 1s - loss: 0.1209 - accuracy: 0.9994 - val_loss: 0.1213 - val_accuracy: 0.9975 - 1s/epoch - 5ms/step
Epoch 27/30
202/202 - 2s - loss: 0.1168 - accuracy: 0.9995 - val_loss: 0.1203 - val_accuracy: 0.9975 - 2s/epoch - 10ms/step
Epoch 28/30
202/202 - 2s - loss: 0.1153 - accuracy: 0.9995 - val_loss: 0.1189 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 29/30
202/202 - 2s - loss: 0.1137 - accuracy: 0.9995 - val_loss: 0.1177 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Epoch 30/30
202/202 - 2s - loss: 0.1122 - accuracy: 0.9995 - val_loss: 0.1165 - val_accuracy: 0.9981 - 2s/epoch - 9ms/step
Performing post-training quantization...
Performing post-training quantization OK
Running quantization-aware training...
Epoch 1/30
202/202 - 4s - loss: 0.1465 - accuracy: 0.9827 - val_loss: 0.0762 - val_accuracy: 0.9938 - 4s/epoch - 20ms/step
Epoch 2/30
202/202 - 3s - loss: 0.0710 - accuracy: 0.9980 - val_loss: 0.0619 - val_accuracy: 0.9988 - 3s/epoch - 13ms/step
Epoch 3/30
202/202 - 2s - loss: 0.0642 - accuracy: 0.9986 - val_loss: 0.0610 - val_accuracy: 0.9988 - 2s/epoch - 12ms/step
Epoch 4/30
202/202 - 3s - loss: 0.0615 - accuracy: 0.9983 - val_loss: 0.0580 - val_accuracy: 0.9994 - 3s/epoch - 14ms/step
Epoch 5/30
202/202 - 3s - loss: 0.0570 - accuracy: 0.9988 - val_loss: 0.0531 - val_accuracy: 0.9994 - 3s/epoch - 13ms/step
Epoch 6/30
202/202 - 3s - loss: 0.0531 - accuracy: 0.9988 - val_loss: 0.0498 - val_accuracy: 0.9994 - 3s/epoch - 13ms/step
Epoch 7/30
202/202 - 3s - loss: 0.0506 - accuracy: 0.9992 - val_loss: 0.0481 - val_accuracy: 0.9994 - 3s/epoch - 14ms/step
Epoch 8/30
202/202 - 2s - loss: 0.0480 - accuracy: 0.9994 - val_loss: 0.0467 - val_accuracy: 0.9994 - 2s/epoch - 12ms/step
Epoch 9/30
202/202 - 2s - loss: 0.0458 - accuracy: 0.9994 - val_loss: 0.0453 - val_accuracy: 0.9988 - 2s/epoch - 12ms/step
Epoch 10/30
202/202 - 3s - loss: 0.0456 - accuracy: 0.9988 - val_loss: 0.0451 - val_accuracy: 0.9994 - 3s/epoch - 13ms/step
Epoch 11/30
202/202 - 2s - loss: 0.0447 - accuracy: 0.9988 - val_loss: 0.0424 - val_accuracy: 0.9988 - 2s/epoch - 12ms/step
Epoch 12/30
202/202 - 2s - loss: 0.0409 - accuracy: 0.9995 - val_loss: 0.0406 - val_accuracy: 0.9988 - 2s/epoch - 12ms/step
Epoch 13/30
202/202 - 1s - loss: 0.0400 - accuracy: 0.9992 - val_loss: 0.0420 - val_accuracy: 0.9988 - 1s/epoch - 6ms/step
Epoch 14/30
Restoring model weights from the end of the best epoch: 4.
202/202 - 3s - loss: 0.0395 - accuracy: 0.9997 - val_loss: 0.0396 - val_accuracy: 0.9994 - 3s/epoch - 12ms/step
Epoch 00014: early stopping
Running quantization-aware training OK
Finished training
Saving best performing model...
Saving best performing model OK
Converting TensorFlow Lite float32 model...
Converting TensorFlow Lite int8 quantized model...
Converting to Akida model...
Model Summary
______________________________________________
Input shape Output shape Sequences Layers
==============================================
[1, 1, 21] [1, 1, 4] 2 4
______________________________________________
SW/dense (Software)
______________________________________________
Layer (type) Output shape Kernel shape
==============================================
dense (Fully.) [1, 1, 21] (1, 1, 21, 21)
______________________________________________
HW/dense_1-y_pred (Hardware) - size: 1432 bytes
_____________________________________________________
Layer (type) Output shape Kernel shape NPs
=====================================================
dense_1 (Fully.) [1, 1, 10] (1, 1, 21, 10) 1
_____________________________________________________
y_pred (Fully.) [1, 1, 4] (1, 1, 10, 4) 1
_____________________________________________________
Converting to Akida model OK
Saving Akida model...
Saving Akida model OK
Loading data for profiling...
Loading data for profiling OK
Creating embeddings...
[ 0/8080] Creating embeddings...
[7200/8080] Creating embeddings...
[8080/8080] Creating embeddings...
WARN: More than 5000 samples, using PCA to create embeddings.
Creating embeddings OK (took 4 seconds)
Calculating performance metrics...
Calculating inferencing time...
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: failed to create XNNPACK runtime
ERROR: Node number 22 (TfLiteXNNPackDelegate) failed to prepare.
ERROR: Restored original execution plan after delegate application failure.
ERROR: Error in applying the default TensorFlow Lite delegate indexed at 0, and all previously applied delegates are reverted.
Calculating inferencing time OK
Profiling float32 model...
Profiling float32 model (tflite)...
Profiling float32 model (EON)...
Profiling akida model...
WARNING: Requested model can't be fully mapped to hardware. Reason:
WARNING: Error when mapping layer 'dense': Weight bits should be between 1 and 4 inclusive.
WARNING: Reported program size, number of NPs and nodes may not be accurate!
Traceback (most recent call last):
File "/home/profile.py", line 330, in <module>
main_function()
File "/home/profile.py", line 308, in main_function
metadata = ei_tensorflow.profiling.get_model_metadata(model, validation_dataset, Y_test, samples_dataset, Y_samples, has_samples,
File "/app/./resources/libraries/ei_tensorflow/profiling.py", line 758, in get_model_metadata
akida_perf = profile_model(model_type, None, None, validation_dataset, Y_test, X_samples,
File "/app/./resources/libraries/ei_tensorflow/profiling.py", line 290, in profile_model
prediction, prediction_train, prediction_test = make_predictions(mode, model, validation_dataset, Y_test,
File "/app/./resources/libraries/ei_tensorflow/profiling.py", line 249, in make_predictions
return ei_tensorflow.brainchip.model.make_predictions(mode, akida_model_path, validation_dataset,
File "/app/./resources/libraries/ei_tensorflow/brainchip/model.py", line 53, in make_predictions
prediction = predict(model_path, validation_dataset, len(Y_test))
File "/app/./resources/libraries/ei_tensorflow/brainchip/model.py", line 95, in predict
output = model.predict(item)
RuntimeError: The maximum input event value is 15, got 255
Application exited with code 1
Job failed (see above)
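For what it’s worth, the final RuntimeError looks like an input bit-width mismatch: the Akida runtime appears to expect 4-bit input events (maximum value 15), while the samples being fed to predict() are 8-bit (maximum value 255). Here is a minimal NumPy sketch of that mismatch and the kind of rescaling that would bring such data into range — the array and values are made up for illustration, and I assume the real fix belongs in the quantization/conversion settings rather than in manual preprocessing like this:

```python
import numpy as np

# Hypothetical 8-bit input batch (values 0-255), like the samples
# the profiling job passes to model.predict()
x_uint8 = np.array([[0, 64, 128, 255]], dtype=np.uint8)

# An Akida model complaining "maximum input event value is 15" implies
# a 4-bit input range [0, 15]; rescale the 8-bit data into that range
x_4bit = (x_uint8.astype(np.float32) * 15 / 255).round().astype(np.uint8)

assert x_4bit.max() <= 15  # now within the 4-bit event range
```

The related warning earlier in the log ("Weight bits should be between 1 and 4 inclusive") points the same way: the converted model seems to carry 8-bit quantization where the target hardware wants at most 4 bits, so the quantization configuration for the Akida conversion is probably what needs adjusting.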