Failing to train dataset - errors

lazybilly · March 30, 2021, 1:47pm

Hi,

I am trying to train an object detection model that can localise three different classes of objects within an image. I have successfully uploaded my image dataset, created an impulse and generated features, but when I start training MobileNetV2 SSD FPN-Lite 320x320 I get errors that I do not know how to fix.

I have attached a copy of my training output below. I am sorry for including what is probably redundant information by attaching the whole output, but I don't want to risk leaving out the relevant part.
Could someone please help me with this?

Kind regards

Training output:

Creating job… OK (ID: 677579)

Copying features from processing blocks…
Copying features from DSP block…
Still copying 5%…
Still copying 9%…
Still copying 14%…
Still copying 19%…
Still copying 23%…
Still copying 28%…
Still copying 33%…
Still copying 37%…
Still copying 42%…
Still copying 46%…
Still copying 51%…
Still copying 55%…
Still copying 60%…
Still copying 64%…
Still copying 69%…
Still copying 73%…
Still copying 78%…
Still copying 82%…
Still copying 87%…
Still copying 91%…
Still copying 96%…
Copying features from DSP block OK
Copying features from processing blocks OK

Scheduling job in cluster…
Job started
Splitting data into training and validation sets…
Splitting data into training and validation sets OK

Training model…
Training on 180 inputs, validating on 46 inputs
Building model and restoring weights for fine-tuning…
Finished restoring weights
Fine tuning…
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/context.py", line 2113, in execution_mode
    yield
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 733, in _next_internal
    output_shapes=self._flat_output_shapes)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2579, in iterator_get_next
    _ops.raise_from_not_ok_status(e, name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 6862, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: IndexError: invalid index to scalar variable.
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/script_ops.py", line 247, in __call__
    return func(device, token, args)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/script_ops.py", line 135, in __call__
    ret = self._func(*args)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 620, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 960, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id.numpy()))

  File "./resources/libraries/ei_tensorflow/training.py", line 91, in gen
    raw_boxes = Y_values[ix]['boundingBoxes']

IndexError: invalid index to scalar variable.

 [[{{node EagerPyFunc}}]] [Op:IteratorGetNext]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/train.py", line 240, in <module>
    main_function()
  File "/home/train.py", line 111, in main_function
    X_train, X_test, Y_train, Y_test, len(X_train), classes, classes_values)
  File "/home/train.py", line 49, in train_model
    train_dataset, validation_dataset)
  File "./resources/libraries/ei_tensorflow/object_detection.py", line 101, in train
    for batch in train_dataset:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 747, in __next__
    return self._next_internal()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 739, in _next_internal
    return structure.from_compatible_tensor_list(self._element_spec, ret)
  File "/usr/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/context.py", line 2116, in execution_mode
    executor_new.wait()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/executor.py", line 69, in wait
    pywrap_tfe.TFE_ExecutorWaitForAllPendingNodes(self._handle)
tensorflow.python.framework.errors_impl.UnknownError: IndexError: invalid index to scalar variable.
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/script_ops.py", line 247, in __call__
    return func(device, token, args)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/script_ops.py", line 135, in __call__
    ret = self._func(*args)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py", line 620, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 960, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id.numpy()))

  File "./resources/libraries/ei_tensorflow/training.py", line 91, in gen
    raw_boxes = Y_values[ix]['boundingBoxes']

IndexError: invalid index to scalar variable.

 [[{{node EagerPyFunc}}]]

Application exited with code 1 (Error)

Job failed (see above)
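
For anyone reading this log later: "invalid index to scalar variable" is the message NumPy gives when a scalar is indexed, so the failing line `Y_values[ix]['boundingBoxes']` suggests that `Y_values[ix]` was a plain value rather than a per-image record with a `boundingBoxes` entry. That is only a reading of the traceback, since `training.py` itself is not shown. A minimal sketch of a check for that situation, where the function name `find_malformed_labels` and the label layout (`label`, `x`, `y`, `width`, `height`) are assumptions made for illustration:

```python
def find_malformed_labels(labels):
    """Return the indices of entries that are not dicts holding a 'boundingBoxes' list."""
    bad = []
    for ix, entry in enumerate(labels):
        # Anything that is not a dict (e.g. a bare class index) has no boxes to read.
        boxes = entry.get("boundingBoxes") if isinstance(entry, dict) else None
        if not isinstance(boxes, list):
            bad.append(ix)
    return bad


# Hypothetical labels: two detection-style entries and one bare class index.
labels = [
    {"boundingBoxes": [{"label": "cat", "x": 10, "y": 20, "width": 50, "height": 40}]},
    2,  # classification-style label with no bounding boxes
    {"boundingBoxes": []},  # an image with no objects is still well-formed
]

print(find_malformed_labels(labels))
```

An empty result would mean every entry has the shape the failing line appears to expect; any index returned points at a label that line could not read.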

janjongboom · March 30, 2021, 2:21pm

Hi @lazybilly, the object detection training block was published by accident and should not have been available to the public. You can use the image classification block for now.

janjongboom · March 30, 2021, 2:27pm

This is now hidden again, sorry for the inconvenience! We hope to have better news about availability soon.

lazybilly · March 30, 2021, 11:38pm

Thank you very much for your reply mate!

I am not sure what you mean when you say that the object detection block should not have been made public. Are regular users not supposed to have access to this block?

My goal is to localise three different classes of objects within an image. Will I still be able to achieve this with image classification? From what I have read, classification only determines the single most probable class for the entire image; it does not localise (with bounding boxes) different classes of objects within the same image.
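
The difference between the two kinds of output can be sketched as follows. This is an illustration only, not the output format of any particular training block: the `Detection` type, the `classify` and `detect` helpers, the class names and all scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float
    x: int
    y: int
    width: int
    height: int


def classify(scores):
    """Image classification: one label for the whole image (the highest score)."""
    return max(scores, key=scores.get)


def detect(candidates, threshold=0.5):
    """Object detection: every box whose score passes the threshold."""
    return [d for d in candidates if d.score >= threshold]


# Hypothetical model outputs for a single image containing two kinds of fruit.
scores = {"apple": 0.7, "banana": 0.2, "orange": 0.1}
candidates = [
    Detection("apple", 0.9, 12, 30, 80, 80),
    Detection("banana", 0.8, 150, 40, 120, 60),
    Detection("orange", 0.1, 200, 200, 50, 50),
]

print(classify(scores))                        # a single class name
print([d.label for d in detect(candidates)])   # one entry per located object
```

Classification collapses the image to one answer, while detection keeps a label and a position for each object, which is what localising several classes in one image requires.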

Thank you very much for your time

dhruvsheth · March 31, 2021, 2:45am

Even with the transfer learning block, image classification applies only to MCU boards such as the Himax and Eta Compute boards. However, if you upload cropped images of a particular class with a label, they are treated as objects to be identified on the OpenMV board. So if you generate a .tflite file, you will get object detection with bounding boxes from that .tflite file.

I commented here because I saw your comment on GitHub earlier.

Thanks.

P.S. The object detection block is in beta (still in development).

janjongboom · April 1, 2021, 9:26am

Hi @lazybilly, yes, for three different objects in one image you’ll need object detection. As @dhruvsheth said, this is still in development and will be available to everyone later this month.

Note that object detection requires significantly more compute power than image classification.