Biometric sound recognition

josemimo2 · March 29, 2023, 3:16pm

Hi guys, I am a computer science student who’s currently using Edge Impulse for my final degree project. I would like to know whether it is possible to make a keyword spotting model like the one in the tutorial section, but one that only responds to a specific person’s voice. If anyone has an idea, I would be very grateful.

Eoin · March 30, 2023, 11:37am

Welcome @josemimo2

Awesome you are using Edge Impulse for your project!

Not sure how much advice we can give on this. I’d look for some review papers on the subject and discuss it with your supervisor, though. You may find that pitch is used for identification.

To build a small test on pitch, you could build a model that outputs “approve” or “deny” for a given keyword. You can do this with a classmate, or shift the pitch of a recording of your own voice.
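As a rough illustration of the pitch-shifting idea, here is a minimal sketch in plain Python. It assumes the recording is already loaded as a list of float samples; `sine_wave`, `shift_pitch_by_resampling` and `zero_crossing_freq` are hypothetical helper names, and the 200 Hz tone only stands in for a real voice recording. Note that this naive resampling approach also changes the clip duration, so a dedicated audio tool is probably a better choice for real data.

```python
import math

def sine_wave(freq_hz, duration_s, sample_rate=16000):
    """Generate a test tone (stand-in for a real voice recording)."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def shift_pitch_by_resampling(samples, semitones):
    """Naive pitch shift: read the samples back at a different rate.

    Positive `semitones` raises the pitch and shortens the clip,
    negative `semitones` lowers the pitch and lengthens it.
    """
    factor = 2 ** (semitones / 12)           # frequency ratio per semitone step
    out_len = int(len(samples) / factor)
    out = []
    for i in range(out_len):
        pos = i * factor                     # fractional read position
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring samples
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

def zero_crossing_freq(samples, sample_rate=16000):
    """Crude frequency estimate from the number of sign changes."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return crossings * sample_rate / (2 * len(samples))

original = sine_wave(200.0, 1.0)
shifted = shift_pitch_by_resampling(original, 12)   # one octave up
print(zero_crossing_freq(original), zero_crossing_freq(shifted))
```

The shifted clips could then be labelled as the “deny” class and the unmodified ones as “approve”, as suggested above.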

Please do share your project with us once it’s published, and you can reference us through the publication we made with Harvard: [2212.03332] Edge Impulse: An MLOps Platform for Tiny Machine Learning

Best,

Eoin

louis · March 31, 2023, 7:50am

Hello @josemimo2,

Not entirely sure that it will work, but I’d try a spectrogram pre-processing block plus an anomaly detection learning block for this kind of project. As @Eoin mentioned, I’d also start with a given keyword.

Let us know of your results, I’m curious.

Best,

Louis

Joeri · March 31, 2023, 9:09am

Can you elaborate more on this?

MMarcial · April 1, 2023, 10:46pm

@Joeri I assume an example scenario @josemimo2 is talking about is: you walk up to the front door of your house and say “unlock door”. Of course, the door should not open for anyone else, even if they know the magical open sesame phrase.

@josemimo2 if you get this working, please make the Edge Impulse Studio project public. Since the trained voice samples will be in the project, I would like to see if I, or maybe a voice actor, could replicate the trained voice. Given that AI/ML can now replicate anyone’s voice, this may not be a fail-safe way to control something. You’ll need a defense-in-depth approach. Maybe add a geofence, fingerprint scanner, forehead temperature checks (remember those?), etc.

Joeri · April 3, 2023, 7:35am

You can have different scenarios for this type of use case. And indeed, in the case of security, you should have a combination of varying biometrics.

I think you will end up with a very unbalanced dataset.
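To illustrate the imbalance concern: with one target speaker and many “everyone else” recordings, one common mitigation is inverse-frequency class weighting. The sketch below is plain Python under assumed numbers; the label names and the 50/950 split are hypothetical and do not come from any real dataset in this thread.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, common classes below 1,
    so each class contributes equally to the total weighted loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# Hypothetical dataset: few clips of the target speaker, many of other people
labels = ["target_speaker"] * 50 + ["other"] * 950
weights = balanced_class_weights(labels)
print(weights)
```

Whether weighting, resampling, or simply collecting more target-speaker clips works best would need to be checked on the actual data.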

josemimo2 · April 12, 2023, 11:12am

Basically, I want to test whether it is possible to identify the voice of a specific person. For example, in a security environment, if I say “Open Door” the device should open the door; if the speaker is not me, the door shouldn’t open.

josemimo2 · April 12, 2023, 11:14am

Anomaly detection doesn’t work well with spectrograms because of the number of features.
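A quick back-of-the-envelope sketch of how large that feature count can get. The window, frame, stride and FFT values below are assumptions chosen for illustration, not settings taken from this project, and the frame-count formula assumes no padding, which may differ from what the Edge Impulse spectrogram block actually does.

```python
def spectrogram_feature_count(window_ms, frame_length_ms, frame_stride_ms, fft_length):
    """Estimate the flattened size of a spectrogram for one audio window.

    Assumes frames are taken without padding and that only the
    non-redundant half of the FFT (fft_length // 2 + 1 bins) is kept.
    """
    n_frames = (window_ms - frame_length_ms) // frame_stride_ms + 1
    n_bins = fft_length // 2 + 1
    return n_frames, n_bins, n_frames * n_bins

# Assumed example: 1 s window, 20 ms frames, 10 ms stride, 256-point FFT
n_frames, n_bins, n_features = spectrogram_feature_count(1000, 20, 10, 256)
print(n_frames, n_bins, n_features)
```

With thousands of input dimensions per window, distance-based anomaly scoring tends to become less discriminative, which is one plausible reading of the problem described above.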