Inactive-enriched machine-learning models exploiting patent data improve structure-based virtual screening for PDL1 dimerizers

DOI: 10.5281/zenodo.6226320 (Zenodo)
MaRDI QID: Q6722654

Authors: Sachin Patil, Pablo Gómez-Sacristán, Saw Simeon, Pedro J. Ballester, Viet-Khoa Tran-Nguyen

Publication date: 22 February 2022

This study considers 12 virtual screening (VS) scenarios, built from six training-test data partitions (A-F), each used with two types of scoring function (SF): classification and regression. All training sets employ the same set of 371 actives (WO2015160641A2) but differ in the set of inactives, so each training set is uniquely identified by its inactives (TrueInactives, DeepCoys, RandomDecoys or ActivesOnly). Likewise, all test sets employ the same 297 actives (WO201503820A1), none of which appears in the training set, but different sets of inactives (TrueInactives or DeepCoys).

Table 1. Six virtual screening scenarios corresponding to six pairs of training-test data for each type of SF (classification or regression)

Partition ID   Training set    Test set        Type
A              DeepCoys        TrueInactives   Classification
B              RandomDecoys    TrueInactives   Classification
C              ActivesOnly     TrueInactives   Classification
D              TrueInactives   DeepCoys        Classification
E              RandomDecoys    DeepCoys        Classification
F              ActivesOnly     DeepCoys        Classification
A              DeepCoys        TrueInactives   Regression
B              RandomDecoys    TrueInactives   Regression
C              ActivesOnly     TrueInactives   Regression
D              TrueInactives   DeepCoys        Regression
E              RandomDecoys    DeepCoys        Regression
F              ActivesOnly     DeepCoys        Regression
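
The way the 12 scenarios arise from six partitions and two SF types can be sketched in a few lines of Python. This is only an illustration of the enumeration, not code shipped with the dataset: the names `PARTITIONS`, `SF_TYPES` and `enumerate_scenarios` are hypothetical, and the partition contents are copied from Table 1.

```python
from itertools import product

# Hypothetical encoding of Table 1 (not part of the dataset itself):
# partition ID -> (inactives used for training, inactives used for testing).
PARTITIONS = {
    "A": ("DeepCoys", "TrueInactives"),
    "B": ("RandomDecoys", "TrueInactives"),
    "C": ("ActivesOnly", "TrueInactives"),
    "D": ("TrueInactives", "DeepCoys"),
    "E": ("RandomDecoys", "DeepCoys"),
    "F": ("ActivesOnly", "DeepCoys"),
}

# The two types of scoring function considered for every partition.
SF_TYPES = ("Classification", "Regression")


def enumerate_scenarios():
    """Return one record per (SF type, partition) pair, in Table 1 order."""
    return [
        {
            "partition": pid,
            "train_inactives": train,
            "test_inactives": test,
            "sf_type": sf_type,
        }
        for sf_type, (pid, (train, test)) in product(SF_TYPES, PARTITIONS.items())
    ]


scenarios = enumerate_scenarios()
for s in scenarios:
    print(s["partition"], s["train_inactives"], s["test_inactives"], s["sf_type"])
```

Each record pairs one partition with one SF type, so six partitions and two types give the 12 scenarios described above. The actives are the same in every training set and in every test set, which is why only the inactives are recorded here.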
