Question/Issue:
Running my command classification model continuously with a selected config from the performance calibration page results in every classification having a confidence score of 0 at runtime (on the dev kit). Running the same model continuously without a config gives the expected results (non-zero confidence scores).
Currently running this on a SiLabs xG24 Dev Kit with code compiled in Simplicity Studio. Project ID #139644
Thank you for posting, and sorry for the delay in responding! I designed the performance calibration feature and our support team asked me to take a look at your project.
Firstly, an all-zeroes confidence score means the post-processing filter has decided to filter out that particular inference—so yes, it’s a feature, not a bug. When a performance calibration configuration is selected, the output will include a non-zero value only when an event has been positively detected by the filter.
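In case it's useful, this is roughly how that looks in the firmware's inference loop. It's a minimal sketch assuming the standard Edge Impulse C++ SDK; the audio buffering and `signal` setup come from the generated example code, and `process_result` is just an illustrative name:

```cpp
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Minimal sketch: interpret one continuous-classification result when a
// performance calibration config is deployed. `signal` is assumed to be
// filled from the microphone buffer, as in the generated example firmware.
static void process_result(signal_t *signal) {
    ei_impulse_result_t result = { 0 };

    EI_IMPULSE_ERROR err = run_classifier_continuous(signal, &result, false);
    if (err != EI_IMPULSE_OK) {
        ei_printf("ERR: run_classifier_continuous returned %d\n", err);
        return;
    }

    // With a calibration config, every score is 0 unless the post-processing
    // filter has positively detected an event, so a non-zero value *is* the
    // detection signal.
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        if (result.classification[ix].value > 0.0f) {
            ei_printf("Detected '%s' (score: %d%%)\n",
                      result.classification[ix].label,
                      (int)(result.classification[ix].value * 100.0f));
            // ... trigger your application logic here ...
        }
    }
}
```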
On to your actual project: I created a copy in our system and ran performance calibration. It looks like your model currently has a very high false rejection rate (~70%), which you'll notice on-device as a lack of responsiveness to any of your keywords.
The system tries to find a post-processing configuration that maximizes performance across all of your keywords. In your project, I found that disabling some of the keywords (by selecting them in the options under “Are there any other labels that should be ignored by your application?”) gives different results: for example, with just “okay_leviton” enabled it performs quite well, whereas “brighten_lights” has a high false negative rate. With a mix of “good” and “bad” keywords enabled, the system will do its best to find a compromise across them all, but it will struggle.
Your model is getting great performance in model testing. One big difference in performance calibration is that the audio is mixed with background noise. Listening to the audio, your keyword recordings sound very quiet; it would be worth boosting the volume of the samples you have uploaded so that they are audible above the noise floor. You should also try enabling data augmentation when training your model: adding noise there may help the model cope with background noise in real-world usage. Hopefully these tips help!
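For reference, boosting the volume just means scaling the raw PCM values and saturating them to the 16-bit range before re-uploading (an audio editor or a command-line tool will do the same thing). Here's a minimal sketch, assuming 16-bit mono PCM that has already been decoded into memory; the 4x gain is only an example value:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Minimal sketch: apply a fixed gain to 16-bit PCM samples, clamping to
// avoid wrap-around distortion. Assumes the WAV payload has already been
// decoded into `samples`; the 4x default gain is illustrative only.
void apply_gain(int16_t *samples, size_t count, float gain = 4.0f) {
    for (size_t i = 0; i < count; i++) {
        float boosted = samples[i] * gain;
        boosted = std::max(-32768.0f, std::min(32767.0f, boosted));
        samples[i] = static_cast<int16_t>(boosted);
    }
}
```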
I’ve noted a few things we can improve here so that your experience with performance calibration is better next time. They are:
- Making it easier to understand the output of the continuous classification function
- Communicating when the results of performance calibration are not good, so it’s not a surprise after deployment
- Providing some actionable advice based on the results of performance calibration
- Making it possible to configure the noise floor added by performance calibration
Thanks so much for reaching out—I’d love to hear how this project goes, and any feedback you might have while using this feature.