Getting floating-point labels from file names

marvin · September 17, 2021, 7:27am

I uploaded my files as follows. You will notice that each file name is a floating-point number followed by the “.csv” extension. The issue is that when I upload to Edge Impulse, it takes only the part of the name before the first period as the label. For example, for a file named “2.4.csv” the label in Edge Impulse will be 2, but I want it to be 2.4. I have to edit all the labels manually, and with the amount of data I have that is quite a heavy task. Please help fix this.
@janjongboom

marvin · September 17, 2021, 7:36am

Instead of reading up to the first period ‘.’, you could check filename.endswith(".csv") and strip only that extension.

janjongboom · September 17, 2021, 8:05am

@marvin Yeah the hard part here is how you deal with duplicate values (which we do often), e.g. you have 0.3 twice in your regression dataset; but let me file a bug and think about this a little.

marvin · September 17, 2021, 8:18am

Sure, that would be helpful. But you will see that the two separate 0.3 files have different values in their columns.

janjongboom · September 17, 2021, 1:13pm

Actually, there is an easy way to fix this, @marvin:

edge-impulse-uploader --label "0.3" 0.3.csv
edge-impulse-uploader --label "0.5" 0.5.csv
edge-impulse-uploader --label "0.7" 0.7.csv

Throw it in a script, and run the script. Easier than fixing by hand
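
A minimal sketch of such a script in Python, assuming every file is named after its label (for example 0.3.csv) and that the uploader is already configured for the right project. The helper names label_from_filename, build_command and upload_all are hypothetical; only the --label flag is taken from the commands above. os.path.splitext removes just the final extension, so "2.4.csv" yields "2.4" rather than "2".

```python
import os
import subprocess


def label_from_filename(filename):
    """Return the file name without its final extension, e.g. '2.4.csv' -> '2.4'."""
    base = os.path.basename(filename)
    label, _ext = os.path.splitext(base)
    return label


def build_command(path):
    """Build the uploader command for one file as an argument list."""
    return ["edge-impulse-uploader", "--label", label_from_filename(path), path]


def upload_all(directory, dry_run=True):
    """Handle every .csv file in `directory`.

    With dry_run=True the commands are only printed, so they can be
    checked before anything is uploaded.
    """
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".csv"):
            continue
        cmd = build_command(os.path.join(directory, name))
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling upload_all("some_directory") prints one command per file; passing dry_run=False runs them. Because the label comes from the file name and not from the file contents, two files can share the label 0.3 only if they live in different directories.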

jenny · September 17, 2021, 1:16pm

Further edge-impulse-uploader documentation: https://docs.edgeimpulse.com/docs/cli-uploader#uploading-via-the-cli (you also need --api-key for your project, and you can specify whether the data should be uploaded to training, testing or split automatically)

Here’s a command I have used in the past on my Mac to automatically upload every file in the current directory with a specified label, for example:

find . -type f | xargs -Irepl edge-impulse-uploader --category split --label "0.3" --api-key ei_xxx repl

marvin · September 21, 2021, 10:08am

Hi. Sorry for the late reply. This worked for me. @jenny, that would not work for me: I have so many files with different labels, so I need to grab everything and upload it to Edge Impulse. @janjongboom, maybe not such a polished script, but it worked like magic for me. Thanks.

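import os
import subprocess

import pandas as pd

directory = '/home/marvin/Joseph_files'
print(os.listdir(directory))
for filename in os.listdir(directory):
    if filename.endswith(".csv"):
        # Load the CSV into a DataFrame; pd.read_csv on this path is assumed here
        df = pd.read_csv(os.path.join(directory, filename))
        # The label comes from the 'Time_Opened' column (assumes a single value per file);
        # str() of the array gives e.g. "[0.3]", so strip the brackets
        lbl = str(df['Time_Opened'].values)[1:-1]
        # Write a copy without the index column to the current directory
        df.to_csv(filename, index=False)
        # Pass the arguments as a list so the shell cannot split the label or file name
        subprocess.run(["edge-impulse-uploader", "--label", lbl, filename])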

janjongboom · November 30, 2022, 9:53am

FYI we now support passing in an external file with this info.

files.json

{
    "version": 1,
    "files": [{
        "path": "path1.csv",
        "category": "split",
        "label": { "type": "label", "label": "0.5" },
        "metadata": {
            "site_collected": "Amsterdam"
        }
    }, {
        "path": "path2.csv",
        "category": "split",
        "label": { "type": "label", "label": "0.7" },
        "metadata": {
            "site_collected": "Paris"
        }
    }]
}

Then upload via edge-impulse-uploader --info-file files.json.

Some notes:

Metadata field is optional, you can omit it.
To upload unlabeled data use "label": { "type": "unlabeled" }
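
For many files, the info file can be generated instead of written by hand. Below is a rough sketch in Python that reuses only the fields shown in the example above (version, files, path, category, label) and takes each label from the file name. The function names build_info and write_info_file are hypothetical, and it is an assumption that relative paths are resolved from the directory containing the info file, so check the uploader documentation before relying on that.

```python
import json
import os


def build_info(paths, category="split"):
    """Build the info-file structure, labelling each file by its name.

    '0.5.csv' gets the label '0.5': only the final extension is removed.
    """
    files = []
    for path in paths:
        label = os.path.splitext(os.path.basename(path))[0]
        files.append({
            "path": path,
            "category": category,
            "label": {"type": "label", "label": label},
        })
    return {"version": 1, "files": files}


def write_info_file(directory, out_name="files.json"):
    """Write an info file covering every .csv file in `directory`."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".csv"))
    out_path = os.path.join(directory, out_name)
    with open(out_path, "w") as fh:
        json.dump(build_info(names), fh, indent=4)
    return out_path
```

After write_info_file("some_directory") has produced files.json, the upload is the same edge-impulse-uploader --info-file files.json call as above. The optional metadata field is left out here.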