Data directory structure for custom learning block & deployment block

Hello!

When working with custom deployment blocks, the layout of the /data directory mentioned on docs-dot-edgeimpulse-dot-com/studio/organizations/custom-blocks/custom-learning-blocks does not seem to align with reality.

Claimed:

data/
├── X_split_test.npy
├── X_split_train.npy
├── Y_split_test.npy
├── Y_split_train.npy
└── sample_id_details.json

Reality:

data/
├── X_classify_features.npy
├── X_train_samples.npy
├── X_train_features.npy
├── X_train_data_explorer.npy
├── y_samples.npy
├── y_train.npy
├── y_classify.npy
└── sample_id_details.json

As far as I can tell, some names have simply changed (“split_test” and “split_train” replaced with “classify” and “train”). However, I’m not so sure about the samples vs. features part (something to do with (pre)processing blocks?). Some clarification would be nice.

Hi @jiahsong - thanks for pointing this out. I will take a look and update our documentation if it is out of date.

I’ve run some tests and I need some more information from you.

How did you obtain the data directory structure that you posted above under Reality:?

I used the Edge Impulse CLI (edge-impulse-blocks runner --download-data) to download the data for classification (time-series, audio, and image) projects and an object detection project. For each project, the files downloaded to the data directory match what is shown in the Claimed: example above. These are the inputs used to run a custom learning block locally, and they are what the docs were based on.

However, as you are pointing out, perhaps what is actually passed to the learning block during training in Studio differs from what is downloaded and used with the CLI.
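
One way to pin down where the two layouts diverge would be to run the same small check in both places (locally after the CLI download, and inside the block when it runs in Studio) and compare the output. The snippet below is only a minimal sketch: compare_data_dir is a hypothetical helper name, the documented file list is copied from the Claimed: listing in this thread, and the path you pass in is whatever directory your block actually receives.

```python
from pathlib import Path

# File names copied from the "Claimed:" listing earlier in this thread.
DOCUMENTED = {
    "X_split_test.npy",
    "X_split_train.npy",
    "Y_split_test.npy",
    "Y_split_train.npy",
    "sample_id_details.json",
}


def compare_data_dir(data_dir, documented=DOCUMENTED):
    """Compare the files in data_dir against the documented layout.

    Returns a tuple (missing, unexpected) of sorted file-name lists:
    - missing: documented names that are not present in data_dir
    - unexpected: names present in data_dir that the docs do not list
    """
    # Only look at regular files directly inside the directory.
    present = {p.name for p in Path(data_dir).iterdir() if p.is_file()}
    missing = sorted(documented - present)
    unexpected = sorted(present - documented)
    return missing, unexpected
```

Calling compare_data_dir("data") (assuming the directory is named data) and printing both lists in each environment should show right away whether the CLI download and the Studio training job really receive different files.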