Spectral features - 'scale axes' impact

I’m working on a gesture recognition problem with a 3-axis accelerometer and just noticed that the ‘scale axes’ parameter in the spectral feature extractor makes a huge difference in performance (a value of ‘0.001’ gave much better performance than the default of ‘1’). My accelerometer data is in the typical range of [-32768, 32767]. Why does this parameter make such a big difference in the performance of the model?

It is common practice to normalize the inputs to a neural network to [0, 1] or [-1, 1]. Most activation functions don’t cope well with big numbers: sigmoid saturates, so its gradient goes to nearly zero, and with ReLU very large inputs produce very large activations and gradients. Either way, learning becomes very difficult or stops altogether.
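To make the saturation point concrete, here is a rough sketch in plain Python. It assumes that ‘scale axes’ simply multiplies every sample by the given factor, and it feeds the values straight into a sigmoid, which is a simplification (in the real pipeline the network sees spectral features, not raw samples). The sample values are made up for illustration.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s).
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical raw int16 accelerometer readings (not real data).
raw_samples = [-32768, -12000, 150, 9000, 32767]

# The 'scale axes' value from the question, assumed to be a plain multiplier.
scale = 0.001
scaled_samples = [v * scale for v in raw_samples]

raw_grads = [sigmoid_grad(v) for v in raw_samples]
scaled_grads = [sigmoid_grad(v) for v in scaled_samples]

for raw, g_raw, g_scaled in zip(raw_samples, raw_grads, scaled_grads):
    print(f"raw={raw:>7}  grad(raw)={g_raw:.3e}  grad(scaled)={g_scaled:.3e}")
```

With the raw values the gradient underflows to zero for every sample, so nothing can be learned from them. After scaling, the gradients are non-zero, although the largest readings still sit far out on the sigmoid, so an even smaller scale or proper normalization may help further.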

I leave it to the reader to check some articles on this topic since they are pretty well written.

Finally, I have personally found that for accelerometer data, subtracting the mean over the time axis worked better than normalization or standardization, but this may differ depending on the application.
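Mean subtraction over the time axis can be sketched like this. This is just a minimal illustration, not what any particular library does: `subtract_mean` is a hypothetical helper, and the window below is invented data in which the z axis carries a constant offset (as gravity would produce).

```python
def subtract_mean(window):
    """Remove the per-axis mean over the time axis.

    window: list of (x, y, z) samples, one tuple per time step.
    """
    n = len(window)
    n_axes = len(window[0])
    # Mean of each axis, computed across all time steps in the window.
    means = [sum(sample[a] for sample in window) / n for a in range(n_axes)]
    return [
        tuple(sample[a] - means[a] for a in range(n_axes))
        for sample in window
    ]

# Hypothetical window of raw readings; z has a large constant offset.
window = [
    (100, -200, 16384),
    (120, -180, 16400),
    (80, -220, 16368),
    (100, -200, 16384),
]

centered = subtract_mean(window)
for sample in centered:
    print(sample)
```

After centering, each axis averages to zero within the window, so the constant offset on z is gone and only the motion around it remains.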

Good luck!


@PHAN good addition. We actually do mean subtraction in the pipeline already (after applying a filter). See https://docs.edgeimpulse.com/docs/spectral-features for some background on what we do.