Underflow Error

tyeth · June 17, 2023, 11:28pm

Question/Issue:
I uploaded some data and finally got it converted to regular sample times (stupid issue, you should offer interpolation of non-regular data if the user desires).
I saw the data graph (it’s a load of particulate data, with non-regular sampling and off periods, 13 megabytes).
I then went and chose … menu → Split data
I then got to a screen with an error message. Now I can no longer visit that project (any page of it).

Project ID: 253
https://studio.edgeimpulse.com/studio/253

Context/Use case:
An error occurred when serving your request: value out of range: underflow

This is the csv:
https://gundryconsultancy.com/csvs/data_interp%20(2).csv

MMarcial · June 17, 2023, 11:55pm

The CSV data suggests you are taking a sample every 50 million years. What kinda project is this? What is the goal?

MMarcial · June 18, 2023, 7:22pm

It looks like you are collecting time-series data at regular intervals. Then every now and then it looks like data acquisition pauses for a while. This pause creates a non-periodic timestamp in your data. You need to write each time-series to a separate CSV file and, to make things easier, follow the CSV filename naming convention mentioned here so your data gets automagically labeled.

tyeth · June 29, 2023, 3:17pm

I just converted the string timestamp to float, so it’s probably ticks in ms.
I couldn’t work out a way of reliably splitting the data, as it’s one csv with semi-regular but irregular samples, plus large off periods.

Biggest issue is it has blown up my project, which now just reports the error continuously. I can’t get to the project page to remove the bad dataset. Who’s best to ping for system bugs?

MMarcial · June 30, 2023, 5:44am

RE: I just converted the string timestamp to float, so it’s probably ticks in ms.

Line 2 of the CSV file looks like:
- 1683381388000000000,5.0032258064516135
  Are you saying 1683381388000000000 is a float?

RE: I couldn’t work out a way of reliably splitting the data, as it’s one csv with semi-regular but irregular samples, plus large off periods.

I would write a parser that looks for a large jump between consecutive values in the timestamp field.
The parser writes the raw CSV rows to a file until it comes across the large jump.
Then it saves the file with a filename indicating the label for what this time series represents.
The parser then resumes writing to a new file, again looking for the next large jump, and so on until all time series have been written to individual files.
You then upload all the individual CSVs, each of which represents a Sample, aka a time-series of data.
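
The parser steps above could be sketched roughly like this in Python. This is only an illustration, not a tested importer: the function names, the `max_gap` threshold, the `timestamp`/`value` column headers, the `label.NNNN.csv` filename pattern and the assumption that the raw timestamps are in nanoseconds are all placeholders to adjust for the real data.

```python
import csv
from pathlib import Path


def split_on_gaps(rows, max_gap):
    """Group (timestamp, value) rows into segments.

    A new segment starts whenever the jump between two consecutive
    timestamps is larger than max_gap (same units as the timestamps).
    """
    segments = []
    current = []
    prev_ts = None
    for ts, value in rows:
        if prev_ts is not None and ts - prev_ts > max_gap:
            segments.append(current)
            current = []
        current.append((ts, value))
        prev_ts = ts
    if current:
        segments.append(current)
    return segments


def write_segments(segments, out_dir, label, ticks_per_ms=1_000_000):
    """Write each segment to its own CSV file and return the paths.

    Timestamps are rewritten as milliseconds since the start of the
    segment. ticks_per_ms=1_000_000 assumes nanosecond input (assumption).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, segment in enumerate(segments):
        path = out_dir / f"{label}.{i:04d}.csv"  # hypothetical naming pattern
        start = segment[0][0]
        with path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["timestamp", "value"])
            for ts, value in segment:
                writer.writerow([(ts - start) // ticks_per_ms, value])
        paths.append(path)
    return paths
```

With nanosecond timestamps, something like `split_on_gaps(rows, max_gap=60 * 1_000_000_000)` would treat any pause longer than a minute as an off period. The right threshold depends on how irregular the sampling is inside each run.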

RE: Biggest issue is it has blown up my project, just reports error continuously. I can’t get to the project page to remove the bad dataset.

It’s blowing up because the timestamp is huge. Like a sample (each row) is taken every 50 million years.
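
As a rough check of the scale involved, here is the arithmetic on the timestamp from line 2 of the CSV. Reading the value as nanoseconds since the Unix epoch is an assumption; it is not confirmed anywhere in this thread, but it gives a plausible date, whereas reading it as milliseconds gives a span of tens of millions of years.

```python
from datetime import datetime, timezone

raw = 1683381388000000000  # timestamp value from line 2 of the CSV

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Interpretation 1: the value is in milliseconds.
years_if_ms = raw / 1000 / SECONDS_PER_YEAR

# Interpretation 2 (assumption): the value is nanoseconds since the Unix epoch.
as_datetime = datetime.fromtimestamp(raw // 1_000_000_000, tz=timezone.utc)

print(f"as milliseconds: about {years_if_ms / 1e6:.0f} million years")
print(f"as nanoseconds:  {as_datetime.isoformat()}")
```

If the nanosecond reading is right, dividing the column by 1,000,000 before upload would give millisecond values.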

RE: Who’s best to ping for system bugs?

I do not think this is a bug.
I was able to use the Studio CSV Wizard to import the data.
I told the wizard to assume each row represents a 10 ms duration.
I also told the wizard to split the data into 3 second samples.
You now have 1311 samples that you will need to label.
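
For anyone wanting to do the same split outside Studio, those two settings amount to cutting the rows into fixed-length windows of 300 rows each (3000 ms / 10 ms). The sketch below only illustrates that arithmetic. It is not how the CSV Wizard is implemented, and `fixed_windows` is a made-up helper name. Whether a trailing partial window should be kept is also an open choice; here it is kept.

```python
def fixed_windows(values, interval_ms=10, window_ms=3000):
    """Cut a list of rows into consecutive windows of window_ms each,
    assuming every row covers interval_ms. The last window may be shorter."""
    rows_per_window = window_ms // interval_ms
    return [
        values[i:i + rows_per_window]
        for i in range(0, len(values), rows_per_window)
    ]


windows = fixed_windows(list(range(650)))
print(len(windows), [len(w) for w in windows])
```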

janjongboom · June 30, 2023, 4:36pm

@tyeth The core engineering team is looking at it. The underlying issue is that the interval between samples is so large that, after a few calculations, it gets saved as Infinity - that causes issues down the line, leading to the corrupt state in the project. We’ll fix the project and the underlying issue.

Re-importing into a new project (https://studio.edgeimpulse.com/studio/profile/projects) with a lower interval will work as @MMarcial mentioned - if you need a quick fix.