Some Thoughts on Splitting Chemical Datasets
Practical Cheminformatics
NOVEMBER 16, 2024
Introduction

Dataset splitting is one topic that doesn’t get enough attention when discussing machine learning (ML) in drug discovery. The data is typically divided into training and test sets when developing and evaluating an ML model. The model is trained on the training set, and its performance is assessed on the test set. If hyperparameter tuning is required, a validation set is also included.
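To make the three-way division concrete, here is a minimal sketch of a purely random train/validation/test split using only the Python standard library. The function name `random_split`, the 70/10/20 fractions, the fixed seed, and the toy SMILES list are all assumptions made for illustration. A random split is only the simplest possible strategy, not a recommendation.

```python
import random


def random_split(items, valid_frac=0.1, test_frac=0.2, seed=42):
    """Shuffle items and partition them into train, validation, and test lists.

    The fractions and seed are illustrative defaults, not recommended values.
    """
    if not 0 <= valid_frac + test_frac < 1:
        raise ValueError("valid_frac + test_frac must be in [0, 1)")
    shuffled = list(items)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_valid = round(len(shuffled) * valid_frac)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]  # everything left over is used for training
    return train, valid, test


# Hypothetical toy dataset of SMILES strings
smiles = [
    "CCO", "CCN", "c1ccccc1", "CC(=O)O", "CCCC",
    "c1ccncc1", "CC(C)O", "CCOC", "C1CCCCC1", "CC(=O)N",
]
train, valid, test = random_split(smiles)
print(len(train), len(valid), len(test))  # 7 1 2
```

The model would be fit on `train`, hyperparameters would be chosen using `valid`, and `test` would be touched only once for the final performance estimate.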