Split / perform dataset

I would like to ask whether I can change the dataset partition to 70% for training and 30% for testing, and if so, how?

Hi @Bayan_khalid,

The “perform train/test split” automatically assumes 80/20, so that won’t work for your case.

You can select individual samples in your training set and select “move to test set.”

What I do sometimes is download all of my data and write a quick Python script to reshuffle to my desired ratio (e.g. 70/30) and then re-upload.

Assuming you are not doing FOMO and do not otherwise have a bounding_boxes.labels file, you can do this in Python:

from sklearn.model_selection import train_test_split
# images: list of sample filenames; labels: list of matching labels (same length).
# Split all files into training and test sets, 70:30 (the test lists are named val_* here).
train_images, val_images, train_labels, val_labels = train_test_split(images, labels, test_size=0.3, random_state=42)

Ignore the {train_labels, val_labels} lists; only the filenames are needed.

The {train_images, val_images} lists now hold the filenames you need to copy into two separate folders (one for training, one for testing). Then upload each folder to the matching set in the Studio.
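For the copy step, here is a minimal sketch of how the whole reshuffle could look end to end. It is not the exact script I use: it swaps train_test_split for a seeded shuffle from the standard library so it has no extra dependencies, and the function name split_files and the folder names "training" and "testing" are just placeholders I picked. Like the snippet above, it assumes there is no bounding_boxes.labels file to keep in sync.

```python
import random
import shutil
from pathlib import Path


def split_files(src_dir, out_dir, test_fraction=0.3, seed=42):
    """Copy every file in src_dir into out_dir/training or out_dir/testing."""
    # Sort first so the shuffle is reproducible for a given seed.
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)

    # The first test_fraction of the shuffled list becomes the test set.
    n_test = round(len(files) * test_fraction)
    splits = {"testing": files[:n_test], "training": files[n_test:]}

    for name, members in splits.items():
        target = Path(out_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for f in members:
            # Originals are copied, not moved, so the download stays intact.
            shutil.copy2(f, target / f.name)
    return splits


# Example call (hypothetical folder names):
# split_files("all_data", "split_data", test_fraction=0.3)
```

After it runs you get two folders that you can upload separately, one to the training set and one to the test set. Note that a plain shuffle does not balance classes the way a stratified split would, so check the per-label counts if your dataset is small.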