Recommended Approach for XIAO 6-Axis IMU Gesture Recognition

Question/Issue: I want to use a 6-axis (accel, gyro) IMU (XIAO BLE Sense with LSM6DS3 IMU) to classify gesture data (punch type). I’m trying to figure out the best way to set up the impulse given the different ranges/units of measurement. Should I have two spectral analysis blocks, one for accel and one for gyro? Do I need to adjust the scale for these, or does it not really matter for feeding it to the NN?

Hi @ontheedge,

I’m actually working on this exact project as well (although mine is more of a wand) :slight_smile: From what I have found, the spectral blocks work well for repetitive, periodic data, not necessarily a single one-shot gesture (e.g. a punch). The best approach I have found is this:

  1. Scale your data so that the accelerometer and gyroscope data are on the same scale (e.g. [0, 1] or [-1, 1]). This can be done via “normalization” or “standardization” (see the sketch after this list).
  2. Feed the raw (normalized or standardized) data directly into a neural network. You can try other blocks (e.g. spectral) to see if they make a difference, but in my experience they have not helped.
  3. A DNN with 80 nodes in the first layer and 40 nodes in the second layer was a good starting point for me.
  4. When you deploy, make sure to scale your data using the same parameters (e.g. mean and standard deviation) you calculated in step 1 before feeding it to the NN for inference.
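
Here’s a minimal sketch of steps 1–4 in Python/Keras (my own illustration, not the exact Edge Impulse pipeline). The random `X_train`/`y_train` arrays are stand-ins for your real windowed IMU data, and the shapes, epochs, and class count are placeholder values:

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 200 windows of 100 samples x 6 axes (flattened), 3 punch classes.
rng = np.random.default_rng(0)
n_samples, window_len, num_classes = 200, 100, 3
X_train = rng.normal(size=(n_samples, window_len * 6))
y_train = rng.integers(0, num_classes, size=n_samples)

# Step 1: standardize, fitting the parameters on the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-8  # epsilon avoids divide-by-zero
X_train_s = (X_train - mean) / std

# Steps 2-3: feed the scaled raw data into a small dense network (80 -> 40).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(80, activation="relu", input_shape=(window_len * 6,)),
    tf.keras.layers.Dense(40, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_train_s, y_train, epochs=30, batch_size=32, verbose=0)

# Step 4: at inference time, reuse the SAME mean/std computed in step 1.
def classify(window):
    x = (window - mean) / std
    return model.predict(x[np.newaxis, :], verbose=0)
```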

Hope that helps!

That is super useful, I’ll give it a shot and report back. I think I have some normalisation code for the XIAO BLE Sense from a pure TFLite experiment I did.

Any idea why the spectral analysis isn’t too helpful? I believe I read somewhere that a low-pass filter for the accelerometer and a high-pass filter for the gyro could help, though I was unsure what parameters to use.

What sort of accuracy are you achieving, btw? And do you attempt to classify based on which hand is being used, e.g. left hook vs. right hook? I think a jab/straight would be very difficult to differentiate, but I may be wrong.

Hi @ontheedge,

The spectral analysis block first performs an FFT of the data under the window in order to generate features. A Fourier transform makes some assumptions about the raw data, in particular that it is periodic. So, if you take the FFT of your whole window, you end up with the overall frequency components but lose any information about how the signal changes over time.
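
To make that concrete, here’s a small numpy illustration of my own (not something the spectral block does): a signal and its time-reversed copy are two clearly different “gestures,” yet they produce identical FFT magnitudes over the whole window:

```python
import numpy as np

fs = 100  # Hz, arbitrary sample rate
t = np.arange(0, 1, 1 / fs)

# A "gesture" with low frequency first, then high frequency
sig = np.concatenate([np.sin(2 * np.pi * 3 * t[:50]),
                      np.sin(2 * np.pi * 15 * t[50:])])
# The same gesture played backwards: high frequency first, then low
sig_reversed = sig[::-1]

# Whole-window FFT magnitudes are identical, even though the two
# signals are obviously different in time.
mag_a = np.abs(np.fft.rfft(sig))
mag_b = np.abs(np.fft.rfft(sig_reversed))
print(np.allclose(mag_a, mag_b))  # True
```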

The way around this (assuming you want to use the frequency components) is to create a spectrogram of the signal, much like how the MFE or MFCC blocks work (although those are designed for audio data). The spectrogram lets you see how the frequency components change over time.
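
A spectrogram is just a series of FFTs over short, overlapping frames. A rough sketch with scipy (assuming a 100 Hz sample rate; the signal and frame sizes are illustrative stand-ins):

```python
import numpy as np
from scipy import signal

fs = 100  # Hz, assumed IMU sample rate
t = np.arange(0, 2, 1 / fs)
# Stand-in for one accelerometer axis during a gesture
accel_x = np.sin(2 * np.pi * 2 * t) + 0.5 * np.sin(2 * np.pi * 10 * t**2)

# Short-time FFT: split the window into overlapping frames and FFT each one,
# keeping BOTH frequency content and how it changes over time.
freqs, times, Sxx = signal.spectrogram(accel_x, fs=fs, nperseg=32, noverlap=16)
print(Sxx.shape)  # (n_freq_bins, n_time_frames) -- a 2D "image" for the NN
```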

Filters might still help, but you’d have to implement them yourself: the spectral analysis block does filter the data, but it always takes the FFT afterward. At least, that’s how it’s currently implemented.
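
If you do experiment with your own filters (the low-pass accel / high-pass gyro idea you mentioned), a scipy sketch might look like the following. The cutoff frequencies and sample rate here are illustrative guesses, not recommended values, and the random arrays are stand-ins for real IMU data:

```python
import numpy as np
from scipy import signal

fs = 100.0  # Hz, assumed IMU sample rate

# Illustrative cutoffs only -- tune for your own data.
# Low-pass the accelerometer to suppress high-frequency noise/vibration:
b_lp, a_lp = signal.butter(2, 5.0, btype="lowpass", fs=fs)
# High-pass the gyro to remove slow drift/bias:
b_hp, a_hp = signal.butter(2, 0.5, btype="highpass", fs=fs)

accel = np.random.randn(200, 3)  # stand-in for (samples, xyz) accel data
gyro = np.random.randn(200, 3)   # stand-in for (samples, xyz) gyro data

accel_f = signal.filtfilt(b_lp, a_lp, accel, axis=0)  # zero-phase, offline
gyro_f = signal.filtfilt(b_hp, a_hp, gyro, axis=0)
# (On-device you'd use a causal filter, e.g. signal.lfilter, instead.)
```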

That makes sense - thank you for the explanation. I’ll go with no processing block first, then once I’ve got something decent I’ll see if I can achieve any gains with a custom processing block to apply filters (if necessary).
