Training - Early Finish (to a Working Model)

MMarcial · October 3, 2022, 5:29am

When I am watching a model being trained and see that the loss is acceptably low and the F1 score is acceptably high, I would like to finish the training. Currently the only option is to Cancel, set the number of training cycles, and start training again to get a completely trained model. A Finish Now button would stop the training and deliver a working model. This Finish Now button would also reduce EI’s $600k AWS bill, as Jan mentioned at the recent conference.

Of course, I need not watch the model being trained. I would also like early-exit goals to be settable: if certain goals are met, training would exit early and deliver a trained model. Example goals would work like an if statement: F1 > 95% && (trnLoss < 0.001 || valLoss < 0.0015).
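As a rough sketch of how such a goal could be checked after each epoch: the function names and the per-epoch metric values below are made up for illustration, and this is not an existing Edge Impulse feature.

```python
def goals_met(f1, train_loss, val_loss):
    """Example goal from the post: F1 > 95% && (trnLoss < 0.001 || valLoss < 0.0015)."""
    return f1 > 0.95 and (train_loss < 0.001 or val_loss < 0.0015)


def first_epoch_meeting_goals(history):
    """Return the 1-based epoch at which the goals are first met, or None."""
    for epoch, (f1, train_loss, val_loss) in enumerate(history, start=1):
        if goals_met(f1, train_loss, val_loss):
            return epoch
    return None


# Hypothetical per-epoch metrics: (f1, train_loss, val_loss)
history = [
    (0.80, 0.0500, 0.0600),
    (0.93, 0.0040, 0.0050),
    (0.96, 0.0020, 0.0014),  # F1 goal and validation-loss goal both met
    (0.97, 0.0008, 0.0013),
]

stop_epoch = first_epoch_meeting_goals(history)
print(stop_epoch)
```

With these made-up numbers training would exit after the third epoch instead of running all four.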

shawn_edgeimpulse · October 4, 2022, 3:10pm

Hi @MMarcial,

That sounds like a pretty nice idea. I will submit it as a feature request to our development team. Thanks!

louis · October 5, 2022, 8:12am

Hello @MMarcial,

While the feature request is being reviewed and implemented, you can add early stopping yourself in expert mode by using the Keras EarlyStopping callback.

Please see the Keras documentation for EarlyStopping.
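A minimal sketch of what that could look like. The parameter values are illustrative assumptions, and the wiring comment assumes the expert-mode code keeps a `callbacks` list that it passes to `model.fit()`; check your own generated code for the actual variable names.

```python
# Settings for Keras' EarlyStopping callback (values are illustrative assumptions).
EARLY_STOPPING_KWARGS = {
    "monitor": "val_loss",         # quantity to watch; must appear in the training logs
    "min_delta": 1e-4,             # changes smaller than this do not count as improvement
    "patience": 10,                # epochs without improvement before training stops
    "restore_best_weights": True,  # roll back to the weights of the best epoch
}


def make_early_stopping():
    """Build the callback. TensorFlow is imported lazily so this file loads without it."""
    import tensorflow as tf

    return tf.keras.callbacks.EarlyStopping(**EARLY_STOPPING_KWARGS)


# Assumed wiring in expert mode (variable names are hypothetical):
#   callbacks.append(make_early_stopping())
#   model.fit(train_dataset, validation_data=validation_dataset,
#             epochs=EPOCHS, callbacks=callbacks)
```

If a validation metric is compiled into the model, `monitor` can point at it instead of the loss (for example `"val_accuracy"` with `mode="max"`).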

Best,

Louis

matkelcey · October 5, 2022, 8:30pm

Hey @MMarcial, thanks for the feedback.

Further to what the others said, I’m curious what scenario you have where you don’t want training to continue a bit longer for the chance of a better model. Is the training step time-critical for you?

note1: it’s always better to define a good model with respect to your eval metric, e.g. F1, not the raw loss. the loss value is unitless and subject to scale changes with respect to the optimisation, in a way that ONLY guarantees a monotonically decreasing value over training time. if you peg to an absolute number it’ll be unstable as other things change.

note2: loss tracking down implying F1 improving isn’t always the case. we have scenarios where the F1 score is calculated via a function that includes components that aren’t covered by the optimisation loop, which means F1 can go UP before it goes DOWN even while the loss continues to drop.

note3: though we don’t have early stopping, we do restore the best model observed during training, which isn’t always the last epoch (due to overfitting). due to note1 and note2 and lots of other diversity in the types of models people train, and the fact these models are quite small and quick to have turnaround on, just restoring the best model gives us the biggest bang for buck.
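to make note2 and note3 concrete, here is a toy sketch of picking the best epoch by F1 rather than taking the last one. the numbers and the function name are made up, and this is not our actual implementation; a real pipeline would also snapshot the weights whenever a new best is seen.

```python
def best_epoch(f1_per_epoch):
    """Return (epoch, f1) for the highest F1 seen; ties keep the earliest epoch."""
    best_idx, best_f1 = None, float("-inf")
    for idx, f1 in enumerate(f1_per_epoch, start=1):
        if f1 > best_f1:
            best_idx, best_f1 = idx, f1
    return best_idx, best_f1


# Hypothetical run: the loss falls every epoch, but F1 peaks at epoch 3
# and then degrades as the model starts to overfit.
loss_per_epoch = [0.90, 0.50, 0.30, 0.20, 0.15]
f1_per_epoch = [0.60, 0.82, 0.91, 0.89, 0.86]

epoch, f1 = best_epoch(f1_per_epoch)
print(epoch, f1)
```

in this toy run the last epoch has the lowest loss but not the best F1, so the epoch 3 model is the one worth keeping.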

Cheers, Mat

MMarcial · October 12, 2022, 3:37am

@matkelcey Short answer is yes, while I am iterating on a model to build intuition about it.

TL;DR → When I deploy a model for production I will allow the training to run for as long as possible given project time constraints and real-dollar computing costs. However, when tweaking a model I will run through many, many iterations of adding data, removing data, adding new classes of data, tweaking the Keras code, trying different things to get inference time or memory down, etc. I just want a model that is good enough that I can deploy it and check overall performance. Sure, if I get an inference with 30% accuracy, I can deduce that the model is not well trained, but that is still good enough: at this stage of iterating I can intuit what 30% might lead to versus, say, an inference with 2% probability.