Help with this failed job?

haolinhe · April 29, 2021, 9:52am

It seems the error is this:

ValueError: Shape (320, 320, 3) must have rank 4

I have no idea how to correct this. Please help.

The whole processing job is here:

Creating job… OK (ID: 796483)

Retraining Image…
Scheduling job in cluster…
Job started
Creating windows from 322 files…
[ 1/322] Creating windows from files…
[ 14/322] Creating windows from files…
[ 48/322] Creating windows from files…
[ 97/322] Creating windows from files…
[145/322] Creating windows from files…
[194/322] Creating windows from files…
[240/322] Creating windows from files…
[292/322] Creating windows from files…
[322/322] Creating windows from files…
[322/322] Creating windows from files…

Created 322 windows: rock: 157, tree trunk: 165

Creating features
[ 1/322] Creating features…
[ 30/322] Creating features…
[ 60/322] Creating features…
[ 91/322] Creating features…
[114/322] Creating features…
[145/322] Creating features…
[173/322] Creating features…
[204/322] Creating features…
[235/322] Creating features…
[266/322] Creating features…
[297/322] Creating features…
[322/322] Creating features…
Created features

Scheduling job in cluster…
Job started
Reducing dimensions for visualizations…
UMAP(a=None, angular_rp_forest=False, b=None,
     force_approximation_algorithm=False, init='spectral', learning_rate=1.0,
     local_connectivity=1.0, low_memory=False, metric='euclidean',
     metric_kwds=None, min_dist=0.1, n_components=3, n_epochs=None,
     n_neighbors=15, negative_sample_rate=5, output_metric='euclidean',
     output_metric_kwds=None, random_state=None, repulsion_strength=1.0,
     set_op_mix_ratio=1.0, spread=1.0, target_metric='categorical',
     target_metric_kwds=None, target_n_neighbors=-1, target_weight=0.5,
     transform_queue_size=4.0, transform_seed=42, unique=False, verbose=True)
Construct fuzzy simplicial set
Thu Apr 29 09:23:44 2021 Finding Nearest Neighbors
Thu Apr 29 09:23:46 2021 Finished Nearest Neighbor Search
Thu Apr 29 09:23:48 2021 Construct embedding
Still running…
completed 0 / 500 epochs
completed 50 / 500 epochs
completed 100 / 500 epochs
completed 150 / 500 epochs
completed 200 / 500 epochs
completed 250 / 500 epochs
completed 300 / 500 epochs
completed 350 / 500 epochs
completed 400 / 500 epochs
completed 450 / 500 epochs
Thu Apr 29 09:23:51 2021 Finished embedding
Reducing dimensions for visualizations OK
Retraining Image OK

Retraining Object detection…
Copying features from processing blocks…
Copying features from DSP block…
Still copying 4%…
Still copying 9%…
Still copying 14%…
Still copying 18%…
Still copying 22%…
Still copying 27%…
Still copying 32%…
Still copying 36%…
Still copying 41%…
Still copying 46%…
Still copying 50%…
Still copying 55%…
Still copying 60%…
Still copying 65%…
Still copying 70%…
Still copying 75%…
Still copying 79%…
Still copying 84%…
Still copying 89%…
Still copying 93%…
Still copying 98%…
Copying features from DSP block OK
Copying features from processing blocks OK

Scheduling job in cluster…
Job started
Splitting data into training and validation sets…
Splitting data into training and validation sets OK

Training model…
Training on 257 inputs, validating on 65 inputs
Building model and restoring weights for fine-tuning…
Finished restoring weights
Fine tuning…
Attached to job 796483…
Attached to job 796483…
Attached to job 796483…
Traceback (most recent call last):
  File "/home/train.py", line 240, in <module>
    main_function()
  File "/home/train.py", line 111, in main_function
    X_train, X_test, Y_train, Y_test, len(X_train), classes, classes_values)
  File "/home/train.py", line 49, in train_model
    train_dataset, validation_dataset)
  File "./resources/libraries/ei_tensorflow/object_detection.py", line 116, in train
    val_loss += validation_fn(image_tensors, gt_boxes_list, gt_classes_list)
  File "./resources/libraries/ei_tensorflow/object_detection.py", line 185, in validation_function
    prediction_dict = model.predict(concatted, shapes)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 570, in predict
    feature_maps = self._feature_extractor(preprocessed_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 251, in call
    return self._extract_features(inputs)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/models/ssd_mobilenet_v2_fpn_keras_feature_extractor.py", line 217, in _extract_features
    33, preprocessed_inputs)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/utils/shape_utils.py", line 280, in check_min_image_dim
    image_height = static_shape.get_height(image_shape)
  File "/usr/local/lib/python3.6/dist-packages/object_detection/utils/static_shape.py", line 63, in get_height
    tensor_shape.assert_has_rank(rank=4)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_shape.py", line 1014, in assert_has_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (320, 320, 3) must have rank 4

Application exited with code 1 (Error)

Job failed (see above)

janjongboom · April 29, 2021, 11:10am

@haolinhe it’s a bug in the way we do batching for object detection. You can work around it by removing one image from (or adding one extra image to) your dataset, and we’ll release a fix later today.
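For anyone hitting the same message: the detector expects a rank-4 input of shape (batch, height, width, channels), so a lone (320, 320, 3) image trips the rank check. The sketch below only illustrates that general failure mode with NumPy; it is not the actual Edge Impulse training code. The batch size of 32, the `make_batches` and `ensure_rank4` helpers, and the idea that a one-image final batch loses its batch axis are all assumptions made for illustration.

```python
import numpy as np


def make_batches(images, batch_size):
    """Stack a list of (H, W, C) images into a list of (N, H, W, C) batches."""
    return [np.stack(images[i:i + batch_size])
            for i in range(0, len(images), batch_size)]


def ensure_rank4(batch):
    """Return a (N, H, W, C) array, adding a batch axis to a lone (H, W, C) image."""
    batch = np.asarray(batch)
    if batch.ndim == 3:
        # A single image: restore the missing leading batch dimension.
        batch = np.expand_dims(batch, axis=0)
    if batch.ndim != 4:
        raise ValueError("Shape %s must have rank 4" % (batch.shape,))
    return batch


# 65 validation images as in the log above; batch_size=32 is a hypothetical value.
images = [np.zeros((320, 320, 3), dtype=np.uint8) for _ in range(65)]
batches = make_batches(images, batch_size=32)
batch_sizes = [b.shape[0] for b in batches]  # the final batch holds one image

# Simulate a pipeline step that drops the batch axis of a one-image batch.
squeezed = np.squeeze(batches[-1], axis=0)   # shape (320, 320, 3), rank 3
restored = ensure_rank4(squeezed)            # shape (1, 320, 320, 3), rank 4
```

Under these assumptions, changing the dataset size by one image only helps if it changes the number of images left over in the final batch, which depends on how the training/validation split rounds.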

haolinhe · April 30, 2021, 3:03am

I removed an image from my training and testing set but I still run into the same error… ValueError: Shape (320, 320, 3) must have rank 4.

It would be great if you could find a fix to this! Thanks.

janjongboom · April 30, 2021, 9:24am

@haolinhe This is now properly fixed, and your model is training.

haolinhe · April 30, 2021, 5:34pm

Perfect! Thank you so much!