Standalone Audio Inference Locally on PC

Hi all,

I am trying local audio inferencing for the first time, so I have probably missed something.
Here are the steps I followed:

  1. I trained the model in EI studio with my audio samples and downloaded the C++ library
  2. Cloned the Linux standalone repo and merged the downloaded C++ library contents as advised here
  3. Built the project and ran the binary file

However, the binary fails to set the audio parameters, as the output below shows.

cris@T580:~/Downloads/example-standalone-inferencing-linux/build$ ./audio plughw:0,0
cannot set parameters (Cannot allocate memory)

Here is the output for the sound card query.

cris@T580:~/Downloads/example-standalone-inferencing-linux/build$ cat /proc/asound/cards
 0 [PCH            ]: HDA-Intel - HDA Intel PCH
                  HDA Intel PCH at 0xdc248000 irq 167

Please let me know if I missed something.

@crisdeodates Hmm… so this comes from snd_pcm_hw_params - do you have any way of seeing if you can record / play anything using the sound card? I see this error sometimes with misconfigured asound, e.g.:

https://bbs.archlinux.org/viewtopic.php?id=256191
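
One quick way to check whether the card can record at all, outside of the inferencing binary, is a short test recording with arecord. Below is a minimal Python sketch that only builds and prints such a command. The helper name build_arecord_cmd, the 3-second duration and the output file name are made up for illustration, and the 16 kHz / mono / 16-bit defaults are an assumption about what the model expects, not something taken from the standalone example.

```python
import shlex


def build_arecord_cmd(device, rate=16000, channels=1, seconds=3, out="test.wav"):
    """Build an arecord command line for a short test recording.

    All defaults are illustrative assumptions, not values read from the
    standalone example.
    """
    return [
        "arecord",
        "-D", device,          # ALSA device name, e.g. plughw:0,0
        "-f", "S16_LE",        # signed 16-bit little-endian samples
        "-r", str(rate),       # sample rate in Hz
        "-c", str(channels),   # number of channels
        "-d", str(seconds),    # stop after this many seconds
        out,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be pasted into a shell.
    print(shlex.join(build_arecord_cmd("plughw:0,0")))
```

If the printed command records fine from a shell, the sound card itself is probably usable with that device name and format.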

@janjongboom thanks for the response.
I am using the built-in microphone of my laptop, which works fine for recording (tried Audacity).

I tried this to find the correct ID of the sound card:

$ arecord -L
default
    Playback/recording through the PulseAudio sound server
surround21
    2.1 Surround output to Front and Subwoofer speakers
surround40
    4.0 Surround output to Front and Rear speakers
surround41
    4.1 Surround output to Front, Rear and Subwoofer speakers
surround50
    5.0 Surround output to Front, Center and Rear speakers
surround51
    5.1 Surround output to Front, Center, Rear and Subwoofer speakers
surround71
    7.1 Surround output to Front, Center, Side, Rear and Woofer speakers
null
    Discard all samples (playback) or generate zero samples (capture)
samplerate
    Rate Converter Plugin Using Samplerate Library
speexrate
    Rate Converter Plugin Using Speex Resampler
jack
    JACK Audio Connection Kit
oss
    Open Sound System
pulse
    PulseAudio Sound Server
upmix
    Plugin for channel upmix (4,6,8)
vdownmix
    Plugin for channel downmix (stereo) with a simple spacialization
sysdefault:CARD=PCH
    HDA Intel PCH, ALC257 Analog
    Default Audio Device
front:CARD=PCH,DEV=0
    HDA Intel PCH, ALC257 Analog
    Front speakers
dmix:CARD=PCH,DEV=0
    HDA Intel PCH, ALC257 Analog
    Direct sample mixing device
dsnoop:CARD=PCH,DEV=0
    HDA Intel PCH, ALC257 Analog
    Direct sample snooping device
hw:CARD=PCH,DEV=0
    HDA Intel PCH, ALC257 Analog
    Direct hardware device without any conversions
plughw:CARD=PCH,DEV=0
    HDA Intel PCH, ALC257 Analog
    Hardware device with all software conversions
usbstream:CARD=PCH
    HDA Intel PCH
    USB Stream Output

And then tried the binary again.

$ ./audio plughw:PCH,0
cannot set parameters (Cannot allocate memory)

Still the same issue!
I am not sure whether any sound settings are wrong, but it doesn’t seem so, since recording works with other applications.

@crisdeodates If you collect data with edge-impulse-linux do you actually get recorded audio? If so, can you post the output of edge-impulse-linux --verbose when you start recording some sounds? Really interested to see which device ID is selected there.

Interestingly, the audio is not getting recorded.

I ran edge-impulse-linux --verbose, tried to sample, and got the verbose output below:

$ edge-impulse-linux --verbose
Edge Impulse Linux client v1.2.6

[SER] Using microphone hw:0,0
[GST] Found devices: [
  {
    "id": "/dev/video1",
    "name": "Integrated Camera",
    "caps": [
      {
        "type": "video/x-raw",
        "width": 1280,
        "height": 720,
        "framerate": 10
      },
      {
        "type": "video/x-raw",
        "width": 960,
        "height": 540,
        "framerate": 15
      },
      {
        "type": "video/x-raw",
        "width": 848,
        "height": 480,
        "framerate": 20
      },
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 480,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 360,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 424,
        "height": 240,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 352,
        "height": 288,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 320,
        "height": 240,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 320,
        "height": 180,
        "framerate": 30
      }
    ]
  },
  {
    "id": "/dev/video0",
    "name": "Droidcam",
    "caps": [
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 480,
        "framerate": 30
      }
    ]
  }
]
[SER] Using camera Integrated Camera starting...
[GST] Starting gst-launch-1.0 with [
  'v4l2src',
  'device=/dev/video1',
  '!',
  'video/x-raw,width=640,height=480',
  '!',
  'videoconvert',
  '!',
  'jpegenc',
  '!',
  'multifilesink',
  'location=test%05d.jpg'
]
[GST] Setting pipeline to PAUSED ...

[GST] Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...

[GST] New clock: GstSystemClock

[SER] Connected to camera
[WS ] Connecting to wss://remote-mgmt.edgeimpulse.com
[WS ] Connected to wss://remote-mgmt.edgeimpulse.com
[GST] Got snapshot test00003.jpg time since last: 296ms. size
[GST] Got snapshot test00005.jpg time since last: 203ms. size
[GST] Got snapshot test00008.jpg time since last: 294ms. size
[WS ] Device "cris-laptop" is now connected to project "Test_project"
[WS ] Go to https://studio.edgeimpulse.com/studio/34910/acquisition/training to build your machine learning model!
[GST] Got snapshot test00010.jpg time since last: 199ms. size
[GST] Got snapshot test00012.jpg time since last: 199ms. size
[GST] Got snapshot test00014.jpg time since last: 200ms. size
[GST] Got snapshot test00016.jpg time since last: 204ms. size
[GST] Got snapshot test00018.jpg time since last: 198ms. size
[GST] Got snapshot test00021.jpg time since last: 294ms. size
[GST] Got snapshot test00023.jpg time since last: 200ms. size
[GST] Got snapshot test00026.jpg time since last: 304ms. size
[GST] Got snapshot test00028.jpg time since last: 199ms. size
[GST] Got snapshot test00031.jpg time since last: 295ms. size
[WS ] Incoming sampling request {
  path: '/api/training/data',
  label: 'hello',
  length: 10000,
  interval: 0.0625,
  hmacKey: 'a77727e728c204c0e688cefb06de9cf7',
  sensor: 'Microphone'
}
[SER] Waiting 2 seconds
[WS ] Failed to sample data Error: Missing "sox" in PATH.
    at AudioRecorder.start (/home/cris/.npm-global/lib/node_modules/edge-impulse-linux/build/library/sensors/recorder.js:64:19)
    at processTicksAndRejections (node:internal/process/task_queues:94:5)
    at async LinuxDevice.sampleRequest (/home/cris/.npm-global/lib/node_modules/edge-impulse-linux/build/cli/linux/linux.js:191:27)
    at async WebSocket.<anonymous> (/home/cris/.npm-global/lib/node_modules/edge-impulse-linux/build/shared/daemon/remote-mgmt-service.js:164:21)
[GST] Got snapshot test00033.jpg time since last: 201ms. size
[GST] Got snapshot test00035.jpg time since last: 201ms. size
[GST] Got snapshot test00037.jpg time since last: 198ms. size
[GST] Got snapshot test00039.jpg time since last: 199ms. size
[GST] Got snapshot test00042.jpg time since last: 294ms. size
[GST] Got snapshot test00044.jpg time since last: 204ms. size
[GST] Got snapshot test00047.jpg time since last: 294ms. size
[GST] Got snapshot test00049.jpg time since last: 199ms. size
[GST] Got snapshot test00051.jpg time since last: 200ms. size
[GST] Got snapshot test00053.jpg time since last: 200ms. size
[GST] Got snapshot test00055.jpg time since last: 200ms. size
[GST] Got snapshot test00057.jpg time since last: 199ms. size
[GST] Got snapshot test00059.jpg time since last: 203ms. size
[GST] Got snapshot test00062.jpg time since last: 294ms. size
[GST] Got snapshot test00064.jpg time since last: 200ms. size
[GST] Got snapshot test00066.jpg time since last: 204ms. size
[GST] Got snapshot test00069.jpg time since last: 294ms. size
[GST] Got snapshot test00071.jpg time since last: 200ms. size
[GST] Got snapshot test00073.jpg time since last: 200ms. size
[GST] Got snapshot test00075.jpg time since last: 199ms. size
[GST] Got snapshot test00077.jpg time since last: 204ms. size
[GST] Got snapshot test00079.jpg time since last: 198ms. size
^C[GST] handling interrupt.

[SER] Received stop signal, stopping application... Press CTRL+C again to force quit.
[GST] Interrupt: Stopping pipeline ...
Execution ended after 0:00:08.347400068
Setting pipeline to PAUSED ...
Setting pipeline to READY ...

Also, Edge Impulse Studio shows the same error warning.

@crisdeodates Sorry, I missed this thread. You need to have sox installed and in your PATH. You can get it through apt install -y sox / brew install sox.
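
To confirm up front that the tool is visible on PATH, something like the following minimal Python sketch could be used. The function name missing_tools is made up for illustration, and checking arecord alongside sox is just a convenience assumption; only sox is named in the error above.

```python
import shutil


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "sox" is the tool named in the error message; "arecord" is an
    # extra check added here for convenience only.
    missing = missing_tools(["sox", "arecord"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH")
```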

Thanks, @janjongboom. This solved the error while recording audio.
The verbose output is below:

$ edge-impulse-linux --verbose
Edge Impulse Linux client v1.2.6

[SER] Using microphone hw:0,0
[GST] Found devices: [
  {
    "id": "/dev/video1",
    "name": "Integrated Camera",
    "caps": [
      {
        "type": "video/x-raw",
        "width": 1280,
        "height": 720,
        "framerate": 10
      },
      {
        "type": "video/x-raw",
        "width": 960,
        "height": 540,
        "framerate": 15
      },
      {
        "type": "video/x-raw",
        "width": 848,
        "height": 480,
        "framerate": 20
      },
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 480,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 360,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 424,
        "height": 240,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 352,
        "height": 288,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 320,
        "height": 240,
        "framerate": 30
      },
      {
        "type": "video/x-raw",
        "width": 320,
        "height": 180,
        "framerate": 30
      }
    ]
  },
  {
    "id": "/dev/video0",
    "name": "Droidcam",
    "caps": [
      {
        "type": "video/x-raw",
        "width": 640,
        "height": 480,
        "framerate": 30
      }
    ]
  }
]
[SER] Using camera Integrated Camera starting...
[GST] Starting gst-launch-1.0 with [
  'v4l2src',
  'device=/dev/video1',
  '!',
  'video/x-raw,width=640,height=480',
  '!',
  'videoconvert',
  '!',
  'jpegenc',
  '!',
  'multifilesink',
  'location=test%05d.jpg'
]
[GST] Setting pipeline to PAUSED ...

[GST] Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...

[GST] New clock: GstSystemClock

[SER] Connected to camera
[WS ] Connecting to wss://remote-mgmt.edgeimpulse.com
[WS ] Connected to wss://remote-mgmt.edgeimpulse.com
[GST] Got snapshot test00003.jpg time since last: 295ms. size
[GST] Got snapshot test00005.jpg time since last: 199ms. size
[GST] Got snapshot test00007.jpg time since last: 199ms. size
[WS ] Device "cris-laptop" is now connected to project "Test_project"
[WS ] Go to https://studio.edgeimpulse.com/studio/30100/acquisition/training to build your machine learning model!
[GST] Got snapshot test00009.jpg time since last: 200ms. size
[GST] Got snapshot test00011.jpg time since last: 200ms. size
[GST] Got snapshot test00013.jpg time since last: 199ms. size
[GST] Got snapshot test00015.jpg time since last: 200ms. size
[GST] Got snapshot test00017.jpg time since last: 200ms. size
[GST] Got snapshot test00019.jpg time since last: 200ms. size
[GST] Got snapshot test00021.jpg time since last: 199ms. size
[GST] Got snapshot test00023.jpg time since last: 199ms. size
[WS ] Incoming sampling request {
  path: '/api/training/data',
  label: 'test',
  length: 2000,
  interval: 0.0625,
  hmacKey: 'ed7f4620a5915677614fa16fe6b9b7ad',
  sensor: 'Microphone'
}
[SER] Waiting 2 seconds
Recording via:  sox [
  '-t',     'alsa',
  'hw:0,0', '-q',
  '-r',     '16000',
  '-c',     '1',
  '-e',     'signed-integer',
  '-b',     '16',
  '-t',     'raw',
  '-'
] {}
Recording 1 channels with sample rate 16000...
[GST] Got snapshot test00025.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 ff ff 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ff ff 00 00 ... 4046 more bytes>
[GST] Got snapshot test00027.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 ff ff 00 00 01 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 ff ff 00 00 00 00 ff ff 00 00 00 00 01 00 00 00 00 00 ff ff ... 4046 more bytes>
Recording 4096 bytes <Buffer 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 00 00 00 00 ff ff 00 00 01 00 00 00 01 00 ff ff 00 00 ... 4046 more bytes>
[GST] Got snapshot test00029.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer 00 00 ff ff 00 00 00 00 00 00 00 00 01 00 01 00 ff ff ff ff 01 00 01 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... 8142 more bytes>
[GST] Got snapshot test00031.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 01 00 00 00 00 00 01 00 ff ff ff ff 00 00 00 00 00 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00033.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff 00 00 ff ff 00 00 00 00 ff ff ... 4046 more bytes>
[GST] Got snapshot test00035.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer 00 00 01 00 00 00 00 00 ff ff 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 ... 8142 more bytes>
[GST] Got snapshot test00037.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer ff ff 00 00 00 00 ff ff 00 00 ff ff 00 00 01 00 00 00 00 00 01 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ... 4046 more bytes>
[GST] Got snapshot test00039.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 ff ff 00 00 00 00 01 00 00 00 ... 4046 more bytes>
Recording 8192 bytes <Buffer 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 01 00 00 00 00 00 ff ff 00 00 01 00 00 00 01 00 00 00 00 00 01 00 00 00 00 00 00 00 ff ff 01 00 ... 8142 more bytes>
[GST] Got snapshot test00041.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00043.jpg time since last: 199ms. size
[SER] Recording audio...
Recording 4096 bytes <Buffer 01 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ff ff 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00045.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 01 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ff ff 00 00 00 00 00 00 ... 8142 more bytes>
[GST] Got snapshot test00047.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 01 00 01 00 00 00 01 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00049.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 ff ff 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00051.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer ff ff 01 00 00 00 00 00 01 00 00 00 00 00 01 00 00 00 01 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 ff ff 00 00 00 00 01 00 00 00 ... 8142 more bytes>
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ff ff 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00053.jpg time since last: 200ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 ff ff 00 00 00 00 01 00 00 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 00 00 00 00 ... 4046 more bytes>
[GST] Got snapshot test00055.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 01 00 ... 8142 more bytes>
[GST] Got snapshot test00057.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer ff ff 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 ... 4046 more bytes>
[GST] Got snapshot test00059.jpg time since last: 199ms. size
Recording 4096 bytes <Buffer 00 00 00 00 ff ff 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ff ff 00 00 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ... 4046 more bytes>
[GST] Got snapshot test00061.jpg time since last: 199ms. size
Recording 8192 bytes <Buffer 00 00 00 00 00 00 01 00 00 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 ff ff 00 00 01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 01 00 ff ff ... 8142 more bytes>
[GST] Got snapshot test00063.jpg time since last: 200ms. size
Recording 4096 bytes <Buffer 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 01 00 01 00 00 00 ... 4046 more bytes>
Recording 2390 bytes <Buffer ff ff 00 00 ff ff 00 00 00 00 ff ff 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 01 00 ff ff ff ff 00 00 00 00 00 00 00 00 ff ff 00 00 ... 2340 more bytes>
End Recording: 4.018s
[SER] Uploading sample to https://ingestion.edgeimpulse.com/api/training/data...
[GST] Got snapshot test00065.jpg time since last: 200ms. size
[GST] Got snapshot test00067.jpg time since last: 199ms. size
[SER] Sampling finished
[GST] Got snapshot test00069.jpg time since last: 201ms. size
[GST] Got snapshot test00072.jpg time since last: 298ms. size
[GST] Got snapshot test00074.jpg time since last: 200ms. size
[GST] Got snapshot test00077.jpg time since last: 298ms. size
[GST] Got snapshot test00079.jpg time since last: 199ms. size
[GST] Got snapshot test00081.jpg time since last: 199ms. size
[GST] Got snapshot test00083.jpg time since last: 199ms. size
[GST] Got snapshot test00085.jpg time since last: 199ms. size
[GST] Got snapshot test00087.jpg time since last: 199ms. size
[GST] Got snapshot test00089.jpg time since last: 199ms. size
[GST] Got snapshot test00091.jpg time since last: 199ms. size
[GST] Got snapshot test00093.jpg time since last: 199ms. size
[GST] Got snapshot test00095.jpg time since last: 199ms. size
^C[SER] Received stop signal, stopping application... Press CTRL+C again to force quit.
[GST] handling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:09.803339636
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
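
For reference, the sampling request and the sox invocation in the log above appear to describe the same audio format. The small Python sketch below works through the arithmetic, assuming that interval is the time between samples in milliseconds and that samples are 16-bit mono as in the sox arguments. The function names are made up for illustration.

```python
def sample_rate_hz(interval_ms):
    """Sample rate implied by a per-sample interval given in milliseconds."""
    return round(1000.0 / interval_ms)


def expected_bytes(length_ms, interval_ms, channels=1, bits=16):
    """Raw PCM size for a recording of `length_ms` milliseconds."""
    samples = round(length_ms / interval_ms)
    return samples * channels * (bits // 8)


if __name__ == "__main__":
    # Values from the sampling request: length 2000, interval 0.0625
    print(sample_rate_hz(0.0625))        # 16000, matching sox's -r 16000
    print(expected_bytes(2000, 0.0625))  # 64000 bytes of raw 16-bit mono audio
```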

However, the issue with standalone local audio inferencing still persists.
Below is the output for various trials using “hw:0,0” and “plughw:PCH,0” arguments.

$ ./build/audio hw:0,0 --debug
Enabling debug mode
audio interface opened
hw_params allocated
hw_params initialized
hw_params access set
hw_params format set
cannot set sample rate (Invalid argument)

$ ./build/audio plughw:PCH,0 --debug
Enabling debug mode
audio interface opened
hw_params allocated
hw_params initialized
hw_params access set
hw_params format set
hw_params rate set: 16000
hw_params channels set:1
cannot set parameters (Cannot allocate memory)

Are there any audio settings that need to be configured beforehand?

@crisdeodates I’m not sure unfortunately. I’d only guess to try plughw:0,0 based on the /proc/asound/cards
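
To list the device names worth trying from /proc/asound/cards, a minimal Python sketch like the one below could help. It only builds candidate strings and does not check that ALSA accepts them. The function name candidate_devices and the default device number 0 are assumptions for illustration.

```python
import re

# Matches the first line of each card entry, e.g.
# " 0 [PCH            ]: HDA-Intel - HDA Intel PCH"
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[(\S+)\s*\]:")


def candidate_devices(cards_text, device=0):
    """Return hw:/plughw: names for every card listed in `cards_text`."""
    names = []
    for line in cards_text.splitlines():
        match = CARD_LINE.match(line)
        if not match:
            continue  # continuation line of a card entry
        index, card_id = match.group(1), match.group(2)
        for prefix in ("plughw", "hw"):
            names.append(f"{prefix}:{index},{device}")
            names.append(f"{prefix}:{card_id},{device}")
    return names


if __name__ == "__main__":
    try:
        with open("/proc/asound/cards") as f:
            for name in candidate_devices(f.read()):
                print(name)
    except OSError:
        print("/proc/asound/cards not available on this system")
```

With the card listing from the first post, this yields plughw:0,0, plughw:PCH,0, hw:0,0 and hw:PCH,0.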