Differing Window Size in Gesture Recognition

Question/Issue: I’m developing a gesture recognition demo based on the continuous motion guide. The idea is that when an accelerometer reading goes above a certain threshold, the device records values for X ms and then classifies that window. I’m also using an anomaly detection block to discard irrelevant samples. Although it’s working, I’m seeing poor anomaly detection performance, which I believe is at least partly due to low data volume (I’m training on an embarrassing 30 samples per class).
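
For context, here is a minimal sketch of that trigger-then-capture idea. Everything in it is an assumption on my part: the `read_accelerometer()` helper, the 100 Hz sample rate, and the 2.5 g trigger threshold are placeholders, not values from the guide.

```python
import math
import time

SAMPLE_RATE_HZ = 100      # assumed accelerometer sample rate
CAPTURE_MS = 1000         # the "X ms" capture window after the trigger
TRIGGER_THRESHOLD = 2.5   # assumed acceleration magnitude that starts a capture

def magnitude(x, y, z):
    """Overall acceleration magnitude used as the trigger signal."""
    return math.sqrt(x * x + y * y + z * z)

def capture_gesture(read_accelerometer):
    """Wait for a spike above the threshold, then record CAPTURE_MS of samples.

    `read_accelerometer` is a hypothetical callable returning an (x, y, z) tuple.
    """
    # Idle until a reading crosses the trigger threshold
    while magnitude(*read_accelerometer()) < TRIGGER_THRESHOLD:
        time.sleep(1.0 / SAMPLE_RATE_HZ)

    # Record a fixed-length window, then hand it to the classifier
    n_samples = int(SAMPLE_RATE_HZ * CAPTURE_MS / 1000)
    window = []
    for _ in range(n_samples):
        window.append(read_accelerometer())
        time.sleep(1.0 / SAMPLE_RATE_HZ)
    return window
```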

I have a couple of questions:

  • What is the best way to deal with differing gesture speeds? For example, one user may take 800 ms to perform a gesture, whereas another may take 1200 ms. Is this related to the Window increase setting?
  • What cluster distance does live classification use to decide that something is an anomaly?

Unfortunately, the best way I have found to deal with different speeds is to make the window as large as the longest expected gesture (e.g. if you expect the longest gesture to take 1.2 s, make the window 1200 ms). Then collect lots of data for that class at different speeds, so the model is trained to look for all of those possibilities.
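
If you go that route, faster gestures end up shorter than the full window, so you need a consistent input size. Here is a rough sketch of one way to pad a shorter capture out to the full window length; the rest value and sample counts are assumptions, not something from the docs.

```python
def pad_to_window(samples, window_len, pad_value=(0.0, 0.0, 0.0)):
    """Pad or truncate a captured gesture to a fixed window length.

    `samples` is a list of (x, y, z) tuples; `window_len` is the number of
    samples in the full window (e.g. 120 samples for 1200 ms at 100 Hz).
    """
    if len(samples) >= window_len:
        return samples[:window_len]
    # Pad the tail with a rest value so slow and fast gestures share one input size
    return samples + [pad_value] * (window_len - len(samples))

# Example: an 800 ms capture at 100 Hz padded out to a 1200 ms window
fast_gesture = [(0.1, 0.2, 9.8)] * 80
fixed_input = pad_to_window(fast_gesture, window_len=120)
assert len(fixed_input) == 120
```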

Alternatively, you could train on a smaller window that looks for subsets of the gesture. Just know that this increases the chance of both false positives and false negatives.
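
The subset approach usually means sliding a shorter window across the capture and classifying each slice. A quick illustration, with window and stride sizes chosen purely for the example:

```python
def sliding_windows(samples, window_len, stride):
    """Yield overlapping sub-windows so a smaller model sees parts of the gesture.

    More windows mean more chances to fire, which is why false positives
    and false negatives both go up with this approach.
    """
    for start in range(0, len(samples) - window_len + 1, stride):
        yield samples[start:start + window_len]

# A 1200 ms capture (120 samples at 100 Hz) scanned with 400 ms windows, 100 ms stride
capture = [(0.0, 0.0, 9.8)] * 120
for window in sliding_windows(capture, window_len=40, stride=10):
    pass  # run each sub-window through the classifier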

For anomaly detection, the lower the value, the closer the sample is to the center of a cluster. For example, -0.5 is close to the center, and 0.0 is near the edge. Generally, I set my threshold to about 0.3; anything over that is considered an “anomaly.”
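
In code, that thresholding step is just a comparison on the anomaly score. The result shape below is an assumption (a dict with `classification` and `anomaly` keys, roughly what an impulse returns); adapt it to whatever your runner gives you.

```python
ANOMALY_THRESHOLD = 0.3  # anything above this is treated as "not a known gesture"

def is_anomaly(anomaly_score, threshold=ANOMALY_THRESHOLD):
    """Return True when the cluster-distance score marks the sample as an outlier.

    Lower scores mean closer to a cluster center (e.g. -0.5), scores near 0.0
    sit at the edge, and larger positive scores fall outside the clusters.
    """
    return anomaly_score > threshold

def handle_result(result):
    """Discard classification output for samples flagged as anomalous.

    `result` is assumed to look like {"classification": {...}, "anomaly": 0.12}.
    """
    if is_anomaly(result["anomaly"]):
        return None  # irrelevant sample, skip it
    return max(result["classification"], key=result["classification"].get)
```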


Thanks. I think I’ll try the larger window first and see how it performs. FYI, I’m also considering multiple models based on skill level. For example, someone with boxing training is likely to throw a far quicker punch, across all punch types, than an absolute beginner.
