Post training compilation error

I am seeing the error below after the training finishes.

Creating job… OK (ID: 1085089)

Copying features from processing blocks…
Copying features from DSP block…
Copying features from DSP block OK
Copying features from processing blocks OK

Job started
Splitting data into training and validation sets…
Splitting data into training and validation sets OK

Training model…
Training on 9108 inputs, validating on 2278 inputs
99 65 1 6435
Epoch 1/5
Epoch 7% done
Epoch 16% done
Epoch 26% done
Epoch 35% done
Epoch 44% done
Epoch 53% done
Epoch 62% done
Epoch 71% done
Epoch 80% done
Epoch 90% done
Epoch 99% done
143/143 - 121s - loss: 0.6768 - accuracy: 0.6263 - val_loss: 0.5724 - val_accuracy: 0.7265
Epoch 2/5
Epoch 8% done
Epoch 17% done
Epoch 26% done
Epoch 35% done
Epoch 45% done
Epoch 54% done
Epoch 63% done
Epoch 72% done
Epoch 81% done
Epoch 90% done
143/143 - 122s - loss: 0.5886 - accuracy: 0.6890 - val_loss: 0.5604 - val_accuracy: 0.6168
Epoch 3/5
Epoch 8% done
Epoch 17% done
Epoch 26% done
Epoch 35% done
Epoch 45% done
Epoch 54% done
Epoch 62% done
Epoch 71% done
Epoch 80% done
Epoch 89% done
Epoch 98% done
143/143 - 124s - loss: 0.5555 - accuracy: 0.7121 - val_loss: 0.8795 - val_accuracy: 0.5821
Epoch 4/5
Epoch 7% done
Epoch 16% done
Epoch 26% done
Epoch 35% done
Epoch 44% done
Epoch 53% done
Epoch 62% done
Epoch 71% done
Epoch 80% done
Epoch 90% done
Epoch 99% done
143/143 - 124s - loss: 0.5485 - accuracy: 0.7262 - val_loss: 0.4723 - val_accuracy: 0.7726
Epoch 5/5
Epoch 8% done
Epoch 16% done
Epoch 26% done
Epoch 35% done
Epoch 44% done
Epoch 52% done
Epoch 61% done
Epoch 70% done
Epoch 79% done
Epoch 88% done
Epoch 97% done
143/143 - 124s - loss: 0.5065 - accuracy: 0.7608 - val_loss: 0.4662 - val_accuracy: 0.7941
Finished training

Saving best performing model…
Converting TensorFlow Lite float32 model…
Converting TensorFlow Lite int8 quantized model with float32 input and output…
Converting TensorFlow Lite int8 quantized model with int8 input and output…
Calculating performance metrics…
Profiling float32 model…
Profiling 73% done
Profiling 81% done
Profiling float32 model (tflite)…
Traceback (most recent call last):
File "/home/tflite_find_operators.py", line 90, in
raise Exception('TensorFlow Lite op "{}" has no known MicroMutableOpResolver method.' % (key))
TypeError: not all arguments converted during string formatting
Failed to compile application (255)
Building project benchmark (DISCO_F746NG, GCC_ARM)
Scan: benchmark
Compile [ 98.9%]: arm_abs_f16.c
Compile [ 99.0%]: arm_add_f16.c
Compile [ 99.2%]: arm_abs_q31.c
Compile [ 99.3%]: arm_abs_f32.c
Compile [ 99.5%]: arm_add_f32.c
Compile [ 99.6%]: arm_abs_q7.c
Compile [ 99.7%]: tflite-trained.cpp
Compile [ 99.9%]: arm_add_q15.c
Compile [100.0%]: main.cpp
[Fatal Error] ei_run_classifier.h@52,10: tflite-model/tflite-resolver.h: No such file or directory
[ERROR] In file included from :
././BUILD/DISCO_F746NG/GCC_ARM-RELEASE/mbed_config.h:79: warning: "MBED_ROM_SIZE" redefined
79 | #define MBED_ROM_SIZE 0x4000000 // defined by application
|
: note: this is the location of the previous definition
./source/main.cpp:17: warning: "EI_CLASSIFIER_TFLITE_INPUT_DATATYPE" redefined
17 | #define EI_CLASSIFIER_TFLITE_INPUT_DATATYPE EI_CLASSIFIER_DATATYPE_FLOAT32
|
In file included from ./source/benchmark_types.h:1,
from ./source/benchmark.h:1,
from ./source/main.cpp:3:
./model-parameters/model_metadata.h:54: note: this is the location of the previous definition
54 | #define EI_CLASSIFIER_TFLITE_INPUT_DATATYPE EI_CLASSIFIER_DATATYPE_INT8
|
./source/main.cpp:18: warning: "EI_CLASSIFIER_TFLITE_INPUT_QUANTIZED" redefined
18 | #define EI_CLASSIFIER_TFLITE_INPUT_QUANTIZED 0
|
In file included from ./source/benchmark_types.h:1,
from ./source/benchmark.h:1,
from ./source/main.cpp:3:
./model-parameters/model_metadata.h:55: note: this is the location of the previous definition
55 | #define EI_CLASSIFIER_TFLITE_INPUT_QUANTIZED 1
|
In file included from ./source/main.cpp:34:
./edge-impulse-sdk/classifier/ei_run_classifier.h:52:10: fatal error: tflite-model/tflite-resolver.h: No such file or directory
52 | #include "tflite-model/tflite-resolver.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

[mbed] Working path "/app/benchmark" (program)
[mbed] ERROR: "/usr/bin/python3" returned error.
Code: 1
Path: "/app/benchmark"
Command: "/usr/bin/python3 -u /app/benchmark/mbed-os/tools/make.py -t GCC_ARM -m DISCO_F746NG --profile release --source . --build ./BUILD/DISCO_F746NG/GCC_ARM-RELEASE"
Tip: You could retry the last command with the "-v" flag for verbose output

Error while finding memory:
Command '['node', '/app/benchmark/benchmark.js', '--tflite-type', 'float32', '--tflite-file', '/home/model.tflite']' returned non-zero exit status 1.
Traceback (most recent call last):
File "./resources/libraries/ei_tensorflow/profiling.py", line 212, in profile_tflite_model
'--tflite-file', model_file
File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['node', '/app/benchmark/benchmark.js', '--tflite-type', 'float32', '--tflite-file', '/home/model.tflite']' returned non-zero exit status 1.
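Side note: the TypeError in the traceback above is itself a secondary bug that hides the real diagnostic. The script builds the message with a `{}` placeholder but then applies `%`-formatting to it. A minimal reproduction of that mix-up (the op name here is just an example):

```python
key = "SOME_OP"  # hypothetical op name, for illustration only
template = 'TensorFlow Lite op "{}" has no known MicroMutableOpResolver method.'

try:
    # Bug: the template uses a {} placeholder but is combined with the
    # % operator, which expects %-style specifiers such as %s.
    msg = template % (key)
except TypeError as exc:
    print(exc)  # not all arguments converted during string formatting

# Fix: use str.format (or an f-string) to match the {} placeholder.
print(template.format(key))
```

With the fix applied, the intended error message (naming the unsupported op) would actually reach the log.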

Hey Naveen,

Interesting set of errors here. Would you share the project ID or URL so I can try to replicate the conditions that caused the issue?

Best,
David

Hi David,

The Project ID is 39350.

Thanks
Naveen

Hi @naveen, this looks like you used an op that is not supported in TensorFlow Lite Micro, which we use to profile the application afterwards. Training should still go through and finish the model, just without memory information.

I’ll take a look at the error messages later today and see how we can improve there.

Thanks @janjongboom for the pointers! I have fixed the offending op. It seems to be working OK now.


@janjongboom

This works:
model.add(preprocessing.Resizing(32, 32, interpolation='nearest'))

Compilation fails during profiling and deployment if bilinear, which is the default method, is used as the interpolation for Resizing:
model.add(preprocessing.Resizing(32, 32, interpolation='bilinear'))

I have checked the TensorFlow Lite Micro GitHub repo and found that both bilinear and nearest-neighbor resize are supported ops. Is this related to the TFLite Micro version? I wanted to use bilinear since it increases accuracy a bit.
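For anyone wondering why the interpolation mode can affect accuracy: nearest-neighbor simply repeats the closest source sample, while bilinear blends the two nearest samples, preserving more of the signal when resizing. A minimal 1-D sketch of the difference (these helper names are made up for illustration, not Keras internals):

```python
def resize_nearest(xs, n):
    # Pick the closest source sample for each output position.
    m = len(xs)
    return [xs[min(i * m // n, m - 1)] for i in range(n)]

def resize_bilinear(xs, n):
    # Linearly interpolate between the two nearest source samples.
    m = len(xs)
    out = []
    for i in range(n):
        pos = i * (m - 1) / (n - 1) if n > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        out.append(xs[lo] * (1 - frac) + xs[hi] * frac)
    return out

print(resize_nearest([0, 10], 3))   # [0, 0, 10]
print(resize_bilinear([0, 10], 3))  # [0.0, 5.0, 10.0]
```

Nearest duplicates edge values, while bilinear produces the intermediate 5.0, which is why bilinear tends to give slightly smoother inputs to the network.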

Yeah, we're trailing behind TFLite quite a bit (by almost a year; some things are backported to our SDK), but we are updating the TFLite kernels in the coming week.
