Need help with an unusual keyword spotting project

I want to make a device that recognizes sounds that are not human voices, such as claps or farts. I gathered 122 sample fart noises for the project (gathering them was a hard process :joy:). But when I train the model together with “noise” and “unknown” samples, the accuracy is very low; sometimes it classifies 100% of the fart noises as “noise”. I need help with the training process. Which training program should I use?