Dataset for "ConfSolv: Prediction of solute conformer free energies across a range of solvents"
DOI: 10.5281/zenodo.10041210; Zenodo: 10041210; MaRDI QID: Q6693428; FDO: Q6693428
Dataset published at Zenodo repository.
William H. Green, Philipp Eiden, Kevin A. Spiekermann, Frederik Sandfort, Angiras Menon, Lagnajit Pattanaik, Florence H. Vermeire, Zipei Tan, Volker Settels
Publication date: 25 October 2023
Copyright license: Creative Commons Attribution 4.0 International
This dataset contains three archives.

The first archive, full_dataset.zip, contains geometries and free energies for nearly 44,000 solute molecules with almost 9 million conformers, in 42 different solvents. The geometries and gas-phase free energies are computed using density functional theory (DFT). The solvation free energy of each conformer is computed using COSMO-RS, and each solution free energy is the sum of the gas-phase free energy and the solvation free energy. The geometries for each solute conformer are provided as ASE Atoms objects within a pandas DataFrame, found in the compressed file dft_coords.pkl.gz within full_dataset.zip. The gas-phase free energies, solvation free energies, and solution free energies are also provided as a pandas DataFrame, in the compressed file free_energy.pkl.gz within full_dataset.zip. Ten example data splits for each of the random and scaffold split types are also provided in the ZIP archive for training models. Scaffold split index 0 is used to generate the results in the corresponding publication.

The second archive, refined_conf_search.zip, contains geometries and free energies for a representative sample of 28 solute molecules from the full dataset that were subjected to a refined conformer search, which located more conformers for them. The format of the data is identical to that of full_dataset.zip.

The third archive contains one folder for each solvent for which free energies are provided in full_dataset.zip. Each folder contains the .cosmo file for every solvent conformer used in the COSMOtherm calculations, a dummy input file for the COSMOtherm calculations, and a CSV file that contains the electronic energy of each solvent conformer, which must be substituted for "EH_Line" in the dummy input file.
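The "EH_Line" substitution described for the third archive could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the authors' workflow: the template text, the CSV column names (`conformer`, `energy`), and the helper names `fill_eh_line` and `build_inputs` are all hypothetical, and the placeholder template is not real COSMOtherm syntax. The actual layout of the dummy input file and the CSV file should be checked in the archive before adapting it.

```python
import csv
import io


def fill_eh_line(template: str, energy: str) -> str:
    """Return the template text with the "EH_Line" placeholder replaced."""
    if "EH_Line" not in template:
        raise ValueError('template has no "EH_Line" placeholder')
    return template.replace("EH_Line", energy)


def build_inputs(template: str, csv_text: str) -> dict[str, str]:
    """Map each solvent conformer name to its filled-in input text.

    Assumes (hypothetically) a CSV with 'conformer' and 'energy' columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["conformer"]: fill_eh_line(template, row["energy"])
        for row in reader
    }


# Made-up stand-ins for the dummy input file and the per-solvent CSV file.
dummy_template = "header line\nEH_Line\nfooter line\n"
energies_csv = (
    "conformer,energy\n"
    "solvent_c0,-76.123456\n"
    "solvent_c1,-76.120001\n"
)

inputs = build_inputs(dummy_template, energies_csv)
for name, text in inputs.items():
    print(f"--- {name} ---")
    print(text, end="")
```

In practice the template and CSV text would be read from the files in each solvent folder, and each filled-in input would be written to its own file rather than printed.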