How to properly retrain wake word model after initial walkthrough?

Question/Issue:

Hi all,

I’m currently working on improving the accuracy of a wake word model (“Wil”) that I created using the initial walkthrough when setting up my Edge Impulse account. During that process, I followed the prompt to record around 38 seconds of audio samples containing the wake word.

After completing the walkthrough, I received a trained model, but the recognition accuracy is currently around 6 out of 10. So I understand that to improve it, I need to retrain it with more samples. Here’s what I did:

What I did so far:

  1. I went to “Data Acquisition” → “Upload data” and uploaded a folder named wake_words that contains ~600 .wav samples (both positive and negative). (A scripted version of this upload step is sketched just after this list, in case it matters.)
  2. Then, I went to “Transfer Learning” and clicked “Save & Train” to start training on the new data.
  3. I visited the “Versioning” tab and created a new Project Version. The original (from onboarding) had ~7 minutes of training data; this new version has ~13 minutes.
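For reference, here is roughly how I could script that upload instead of using the web form. This is just a sketch based on my reading of the ingestion API docs (the /api/training/files endpoint); the wake_words folder, the label-from-filename convention, and the EI_API_KEY environment variable are my own assumptions:

```python
# Sketch only: bulk-upload .wav samples through the Edge Impulse ingestion API.
# Assumptions (not from the project itself): files live in ./wake_words,
# each filename starts with its label (e.g. "wil.01.wav", "noise.01.wav"),
# and the project API key is exported as EI_API_KEY.
import glob
import os

import requests

API_KEY = os.environ["EI_API_KEY"]
URL = "https://ingestion.edgeimpulse.com/api/training/files"

for path in sorted(glob.glob("wake_words/*.wav")):
    label = os.path.basename(path).split(".")[0]  # assumed naming convention
    with open(path, "rb") as f:
        res = requests.post(
            URL,
            headers={"x-api-key": API_KEY, "x-label": label},
            files={"data": (os.path.basename(path), f, "audio/wav")},
        )
    res.raise_for_status()
    print(f"Uploaded {path} with label '{label}'")
```

(As far as I understand, the edge-impulse-uploader CLI does the same job.)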

My questions:

a) Is this the correct procedure?
Am I missing any important step to ensure that the new data is being used effectively in training?

b) My current model size is around 500KB, and I know I can go up to 1MB.
Does increasing the model size help improve quality? If so, how can I control or adjust the model size?

c) I noticed that even after uploading more samples and training, the performance didn’t improve much.
So I tried clicking “Retrain model” on the right side panel. But the retraining failed (no clear error message was shown).
Is clicking “Retrain model” actually required after uploading new data? Or is “Save & Train” inside the “Transfer Learning” tab sufficient?

Ultimately, my goal is to improve the detection accuracy of the wake word “Wil”.
Right now, it’s missing too many activations. Any advice or guidance to improve the model performance would be greatly appreciated.

Thanks in advance!
Francisco


Project ID:
FranGalan-project-1

Hi @FranGalan - Here’s some info that may help!

a) If you add more data to your project, you also need to regenerate the MFE features before you retrain your model. Otherwise you are simply retraining the model with the existing features. A quick way to do this is to use the “Retrain model” menu item in the left side bar. You’ll notice that when you make changes to your dataset, a red dot will appear on this icon.
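If you ever want to drive that same flow from a script rather than the UI, the Studio API exposes it as a job. The snippet below is only an outline of the idea: I'm assuming the retrain job endpoint path from the API reference, and EI_API_KEY / EI_PROJECT_ID are placeholders, so please double-check the API docs before relying on it.

```python
# Outline only: start a retrain job (regenerate features + retrain) via the
# Edge Impulse Studio API. The endpoint path and placeholders are assumptions;
# verify them against the current API reference.
import os

import requests

API_KEY = os.environ["EI_API_KEY"]        # project API key
PROJECT_ID = os.environ["EI_PROJECT_ID"]  # numeric project ID from the dashboard

res = requests.post(
    f"https://studio.edgeimpulse.com/v1/api/{PROJECT_ID}/jobs/retrain",
    headers={"x-api-key": API_KEY},
)
res.raise_for_status()
print(res.json())  # should include the job ID you can poll for progress
```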

There is no need to create a new version of the project unless you have a state that you would like to restore at some point. Versioning creates a snapshot of the project exactly as it is, so you can restore that later if you want.

b) Increasing the model size doesn’t necessarily improve quality. There is a balance between model capacity (~size) and dataset complexity. If your model capacity is much greater than your dataset complexity, you can get into situations where the model overfits (essentially memorizes the training data, so it performs well on that, but does not perform well on new unseen data).
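If you want to see that effect in isolation, here is a tiny, self-contained illustration (completely unrelated to your dataset; the features and labels are random, so there is genuinely nothing to learn):

```python
# Toy demonstration of overfitting: a model with far more capacity than the
# data needs, trained on random (unlearnable) labels. Training accuracy climbs
# while validation accuracy stays near chance. All numbers are made up.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64)).astype("float32")    # 200 fake samples, 64 features
y = rng.integers(0, 2, size=200).astype("float32")  # random binary labels

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(64,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

hist = model.fit(X, y, validation_split=0.3, epochs=100, verbose=0)
print("train acc:", round(hist.history["accuracy"][-1], 2),
      "val acc:", round(hist.history["val_accuracy"][-1], 2))
# Expect training accuracy to end up far above validation accuracy, which
# hovers around 0.5: the model memorized rather than generalized.
```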

There are several ways to increase the model size. Since you are using the Transfer learning (Keyword Spotting) block, you could select a different model for that block. See the doc I linked. You could also try a Classification block and modify the model by adding layers graphically. See the model architecture used in our Keyword spotting tutorial. You can also modify any block using Expert mode if you are comfortable with Python and TensorFlow.
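To give a flavour of the Expert-mode route: the learn block there is ordinary Keras code, so adding capacity just means adding or widening layers. The sketch below is illustrative only, not the code Studio generates for you; FRAMES, BANDS and NUM_CLASSES are placeholders, and in the real Expert-mode script the input shape and datasets are already defined for you.

```python
# Illustrative sketch of a small keyword-spotting conv net in Keras, in the
# spirit of Expert mode. FRAMES, BANDS and NUM_CLASSES are placeholders;
# use whatever your MFE block and label set actually produce.
import tensorflow as tf
from tensorflow.keras import layers, models

FRAMES, BANDS, NUM_CLASSES = 99, 40, 3   # assumed values, not from the project

model = models.Sequential([
    layers.Reshape((FRAMES, BANDS, 1), input_shape=(FRAMES * BANDS,)),
    layers.Conv2D(16, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),  # wider layer = more capacity
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),   # some regularization to balance the extra capacity
    layers.Flatten(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Keep an eye on the resulting model size and the validation metrics as you add layers; bigger is only better if validation accuracy actually improves.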

c) See the answer to point a). Without more info, I’m not really sure why it would have failed. Also, in general, keyword spotting models perform better on words that have more than one syllable; something like “Wil” is more easily confused with noise.