JS heap out of memory when using large files

So I have some audio data I’m trying to process, split across multiple files. Some of these files are small and cause no problems, but others are larger, and when I tried to upload files of 128 MB, the uploader crashed with a JS heap out of memory error. So I split the files in two, and now the upload works (slowly, but that is to be expected).
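
For reference, a minimal sketch of the kind of split I did (my own illustration; it assumes a canonical 44-byte PCM WAV header with a single "data" chunk, and the helper name and output file naming are just for this example):

import { readFileSync, writeFileSync } from 'fs';

// Split one PCM WAV file into two halves, each with its own valid header.
function splitWav(path: string): void {
  const buf = readFileSync(path);
  const blockAlign = buf.readUInt16LE(32);   // bytes per sample frame
  const dataSize = buf.readUInt32LE(40);     // size of the "data" payload
  const header = buf.subarray(0, 44);

  // Cut the payload at a whole-frame boundary so samples stay aligned.
  let half = Math.floor(dataSize / 2);
  half -= half % blockAlign;

  const payloads = [buf.subarray(44, 44 + half), buf.subarray(44 + half, 44 + dataSize)];
  payloads.forEach((payload, i) => {
    const out = Buffer.concat([header, payload]);
    out.writeUInt32LE(out.length - 8, 4);    // fix up RIFF chunk size
    out.writeUInt32LE(payload.length, 40);   // fix up data chunk size
    writeFileSync(path.replace(/\.wav$/, `-part${i + 1}.wav`), out);
  });
}

splitWav('59-Noise-3-b.wav');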
Next, I tried to generate the features, but this failed with the following error:

[ 40/240] Creating windows from files...
[ 40/240] Creating windows from files...
...
WARN: failed to process 4515/training/59-Noise-3-b.wav.1e6q6qsb.ingestion-d6bc5ff48-kms5l.json: Worker terminated due to reaching memory limit: JS heap out of memory
[ 40/240] Creating windows from files...
[ 40/240] Creating windows from files...
...
WARN: failed to process 4515/training/59-Noise-b.wav.1e6q6i5u.ingestion-d6bc5ff48-5nn7n.json: Worker terminated due to reaching memory limit: JS heap out of memory

This pattern repeats about eight times (I have eight of these ~64 MB files in the training set, and a similar number in the test set) until it detects that I’m out of compute time.

So my question is: is there a limit on the file size? If so, what is it? It would be nice if the code or user interface enforced it somehow, so no time is wasted on crashing processes. If not, what is going on?

@bmulder I think the number of windows currently being created is the issue. Could you raise the window increase to e.g. 500 ms? (I believe your project has this set to 250 ms.)

I’ll also look at the memory usage during windowing.
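
Roughly, the number of windows scales like this (a sketch, not our actual implementation; the 1000 ms window size is an assumption on my part, your project may differ):

function windowCount(lengthMs: number, windowMs: number, increaseMs: number): number {
  if (lengthMs <= windowMs) return 1;  // short clips still yield one window
  return Math.floor((lengthMs - windowMs) / increaseMs) + 1;
}

// A 10-minute file:
console.log(windowCount(600_000, 1000, 250)); // 2397 windows
console.log(windowCount(600_000, 1000, 500)); // 1199 windows, roughly half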

But others are larger, and when I tried to upload files of 128 MB, the uploader crashed with a JS heap out of memory error. So I split the files in two, and now the upload works (slowly, but that is to be expected).

We’ll be increasing the memory on ingestion sometime today to mitigate this! I saw some errors in the logs yesterday.

@janjongboom A problem with setting the window increase to 500 ms is that some of my files are just over half a second long. Those files contain short sounds, like a single thump, that I also want the classifier to identify, while others are tens of minutes long (mostly containing noise).
Any suggestions on how to process this?

A problem with setting the window increase to 500 ms is that some of my files are just over half a second long.

This is fine: in that case only a single window is emitted anyway (that’s already how it works).

My feeling is that your noise files now generate so many windows that we run out of memory. This is also bad for dataset balance (many more noise windows than windows of the sound you want to classify).
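
A back-of-the-envelope version of that, using the same count formula as above (the 10-minute noise duration and the 200 thump clips are made-up illustrative numbers, not from your project):

const windows = (lenMs: number, winMs: number, incMs: number) =>
  lenMs <= winMs ? 1 : Math.floor((lenMs - winMs) / incMs) + 1;

const noiseWindows = 8 * windows(600_000, 1000, 250); // 8 long noise files ≈ 19,176 windows
const thumpWindows = 200 * windows(600, 1000, 250);   // 200 short thumps = 200 windows
console.log(noiseWindows / thumpWindows);             // ≈ 96x more noise than target sound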

All right, thanks for the suggestion. I’ll try with less noise data.

@bmulder Not necessarily less noise data; just set the window increase higher so there’s less overlap.