Job failed when using NN classifier with both MFE and Spectrogram

This is the whole training output. Note that I didn’t change any of the layers, and that I’m feeding both the MFE and Spectrogram processing blocks into the Classification (Keras) learning block. I thought it might be because I’m using 256 frequency bands for the spectrogram, but I re-ran it with 128 and got the same error:

Creating job… OK (ID: 2445616)

Copying features from processing blocks…
Scheduling job in cluster…
Job started
[173/173] Merging DSP blocks…
[172/173] Merging DSP blocks…
Copying features from processing blocks OK

Scheduling job in cluster…
Job started
Splitting data into training and validation sets…
Splitting data into training and validation sets OK

Training model…
Training on 129 inputs, validating on 44 inputs
Traceback (most recent call last):
File “/home/”, line 362, in
File “/home/”, line 183, in main_function
model, override_mode, disable_per_channel_quantization = train_model(train_dataset, validation_dataset,
File “/home/”, line 78, in train_model
model.add(Reshape((rows, columns, channels), input_shape=(input_length, )))
File “/app/keras/.venv/lib/python3.8/site-packages/tensorflow/python/training/tracking/”, line 530, in _method_wrapper
result = method(self, *args, **kwargs)
File “/app/keras/.venv/lib/python3.8/site-packages/keras/utils/”, line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File “/app/keras/.venv/lib/python3.8/site-packages/keras/layers/core/”, line 110, in _fix_unknown_dimension
raise ValueError(msg)
ValueError: Exception encountered when calling layer “reshape” (type Reshape).

total size of new array must be unchanged, input_shape = [14642], output_shape = [366, 40, 1]

Call arguments received:
• inputs=tf.Tensor(shape=(None, 14642), dtype=float32)
Application exited with code 1

Job failed (see above)

This is the Keras code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, InputLayer, Dropout, Conv1D, Conv2D, Flatten, Reshape, MaxPooling1D, MaxPooling2D, BatchNormalization, TimeDistributed
from tensorflow.keras.optimizers import Adam

# model architecture
model = Sequential()
channels = 1
columns = 40
rows = int(input_length / (columns * channels))
model.add(Reshape((rows, columns, channels), input_shape=(input_length, )))
model.add(Conv2D(8, kernel_size=3, activation='relu', kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Conv2D(16, kernel_size=3, activation='relu', kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Dense(classes, activation='softmax', name='y_pred'))

# this controls the learning rate
opt = Adam(learning_rate=0.002, beta_1=0.9, beta_2=0.999)
# this controls the batch size, or you can manipulate the objects yourself
train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=False)
validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=False)
callbacks.append(BatchLoggerCallback(BATCH_SIZE, train_sample_count))

# train the neural network
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(train_dataset, epochs=400, validation_data=validation_dataset, verbose=2, callbacks=callbacks)

# Use this flag to disable per-channel quantization for a model.
# This can reduce RAM usage for convolutional models, but may have
# an impact on accuracy.
disable_per_channel_quantization = False

Hello @daniellm19

So there is an issue with your reshape layer.

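The code computes rows as int(14642 / 40) = 366, so the reshape expects 366 x 40 x 1 = 14,640 values, but your input has 14,642. The two leftover values are why the Reshape layer fails.
Also, since your two DSP blocks have different “shapes”, the convolutional network won’t work unless you make further adjustments.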
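
To make the arithmetic concrete, here is a minimal sketch in plain Python that repeats the calculation from the training script using the numbers in the traceback. The `leftover` variable and the explicit check are my own additions for illustration and are not part of the generated code.

```python
# Values taken from the error message and the Keras code in the post.
input_length = 14642
columns = 40
channels = 1

# Same expression as in the training script: int() silently drops the remainder.
rows = int(input_length / (columns * channels))

# Number of values the Reshape layer can actually place into (rows, columns, channels).
covered = rows * columns * channels

# Illustrative check (not in the original script): anything left over means
# the flat input cannot be reshaped into that grid.
leftover = input_length - covered

if leftover != 0:
    print(f"Cannot reshape {input_length} values into "
          f"({rows}, {columns}, {channels}): {leftover} values left over")
```

A check like this before the Reshape layer would surface the mismatch as a readable message instead of a traceback.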

Just out of curiosity, why are you interested in using both MFE and Spectrogram? The two share a lot of common features.



Thanks for the response,

Well, I’m mostly just tinkering and trying out different things blindly, without really understanding them. Here are my current misunderstandings/problems:

  1. I’m trying to identify and classify audio with a rather low SNR (the non-voice sound I’m trying to identify is very quiet), and the model really struggles to classify it correctly with just MFE. Is it better to use MFE or a Spectrogram, or possibly both? Or is this unfixable because of the SNR, or should I collect more data in this case?
  2. What are the common features that MFE and a Spectrogram share? What are the pros and cons of each?
  3. How would you go about fixing the DSP blocks so that the shapes match?
  4. Would it not be wise to show some sort of warning message when the shapes aren’t the same (so beginners like me don’t go posting on the forum endlessly, hehe)?

Regards, Daníel.
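
On question 3, one possible direction is to slice the merged feature vector back into one grid per DSP block before any convolution, rather than forcing a single reshape. The sketch below is plain Python and only illustrates the bookkeeping. It assumes the merged features are laid out one block after the other, and the helper name `split_blocks` and the toy sizes (3 x 4 and 3 x 5) are hypothetical, since the real per-block sizes are not given in this thread.

```python
def split_blocks(flat, block_shapes):
    """Slice a flat feature vector into one (rows x columns) grid per block.

    Assumes the blocks are stored back to back in `flat` (an assumption,
    not something confirmed in this thread).
    """
    expected = sum(rows * cols for rows, cols in block_shapes)
    if len(flat) != expected:
        # Fail early with a readable message instead of a reshape traceback.
        raise ValueError(f"expected {expected} features, got {len(flat)}")

    grids = []
    offset = 0
    for rows, cols in block_shapes:
        size = rows * cols
        chunk = flat[offset:offset + size]
        # Rebuild the 2D grid row by row.
        grids.append([chunk[r * cols:(r + 1) * cols] for r in range(rows)])
        offset += size
    return grids


# Hypothetical toy sizes: a 3 x 4 block followed by a 3 x 5 block.
flat = list(range(27))
mfe_grid, spec_grid = split_blocks(flat, [(3, 4), (3, 5)])
```

In a Keras model the same idea would mean slicing the input into two tensors and giving each its own reshape and convolutional branch before combining them, which is a bigger change than editing the single Reshape layer.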