Energies and Equilibrium Geometries: As a long-time Spartan user, I have followed with interest the change in the default computational model used for structure (Equilibrium Geometry) over the years: from HF/3-21G to HF/6-31G* to B3LYP/6-31G* to the current wB97X-D/6-31G*. As computers have improved in speed and memory, what was once challenging has become, in many cases, trivial.
For some properties (like structure), the default models are already within the error of the experimental data we compare against; in other words, one can't really tell how much better a newer model does, because it already performs on par with experiment. Other properties, like energy, do benefit from newer-generation functionals and larger basis sets.
In general, beginning with Spartan'24, my typical baseline computational workflow for new molecules is changing. After some six months or more working with our machine learning models (and after revisions and improvements all around), I am more inclined to embrace the neural network models (where available) and start there.
So, for example, suppose I encounter a new organic molecule that meets the requirements (composed only of H, C, N, O, F, S, Cl, and Br; closed shell, with uncharged atoms only) and that is rigid or appears to be in a good conformation. When I am after structure (Equilibrium Geometry), in the past I would often start with an Equilibrium Geometry calculation using wB97X-D/6-31G* on a 24 core machine (understanding that this would take some time, but typically less than 5 minutes), like:
This typically provided a good starting point for a better energy calculation (for example, wB97M-V/6-311+G(2df,2p)):
In practice BOTH of the above calculations would be submitted as a single job, like:
This Energy calculation (with the better functional and larger basis set), following the initial wB97X-D/6-31G* Equilibrium Geometry, might take something like 15 minutes or more. Depending on the starting geometry (ours was an MMFF geometry), this is actually reasonably fast. In general, both the initial QM Equilibrium Geometry and the subsequent Energy calculation can be quite time consuming (combined, often dozens of minutes and potentially hours). Sad face.
Enter Spartan'24: my current workflow more commonly uses the machine learning models for geometry, or for both geometry and energy, like this:
The time savings can be significant. Take a reasonably sized, fairly rigid organic molecule I have been working with: C20H22O4, molecular weight 326.4 amu (the IUPAC name is something like 15,17-dimethoxytricyclo[11.3.1.1⁵,⁹]octadeca-1(17),5,8,13,15-pentaene-7,18-dione); the file is available for download here. The QM results required >15 minutes, while the neural network (NN) results required less than a minute. The difference in energy: 0.21 kJ/mol (less than half a kJ/mol)!
I was actually suspicious of the high-quality agreement and assumed that this molecule must have been in the neural network training set (it does exist in the SSPD). So, to test this, I ran two additional comparisons, substituting either one or both of the two methoxy groups with methylsulfanyl groups.
New structures were minimized with MMFF (from the GUI) and the same jobs were submitted, comparing the wB97M-V/6-311+G(2df,dp) energies calculated at calculated wB97X-D/6-31G* geometries with the neural network estimated wB97M-V/6-311+G(2df,dp) energies at neural network estimated wB97X-D/6-31G* geometries. Results follow:
                                    Time (min)    ΔE (kJ/mol)
1 methylsulfanyl substitution:
  wB97M-V/6-311+G(2df,dp)             32.24          0.00
    //wB97X-D/6-31G*
  Est.wB97M-V/6-311+G(2df,dp)          0.73          1.50
    //Est.wB97X-D/6-31G*
2 methylsulfanyl substitutions:
  wB97M-V/6-311+G(2df,dp)             32.97          0.00
    //wB97X-D/6-31G*
  Est.wB97M-V/6-311+G(2df,dp)          0.71          2.90
    //Est.wB97X-D/6-31G*
Over a sampling of > 300 molecules (original structures taken from the Cambridge Structural Database), the RMS error for this neural network model is on the order of 0.76 kJ/mol. While I don't have an average time savings, for the three similarly sized molecules compared above, the time savings afforded by the neural network models is on the order of a factor of 20-45!
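For anyone who wants to check the speedup figures, the factors can be recomputed from the timings in the table above. This is a minimal Python sketch; the `runs` dictionary and the `speedup` helper are illustrative names of my own, not anything from Spartan, and the numbers are simply copied from the table.

```python
# Timings (minutes) and energy differences (kJ/mol) copied from the table above.
# Layout: label -> (QM minutes, neural network minutes, delta E in kJ/mol)
runs = {
    "1 methylsulfanyl substitution": (32.24, 0.73, 1.50),
    "2 methylsulfanyl substitutions": (32.97, 0.71, 2.90),
}


def speedup(qm_minutes, nn_minutes):
    """Return how many times faster the neural network job was than the QM job."""
    return qm_minutes / nn_minutes


for label, (qm, nn, delta_e) in runs.items():
    factor = speedup(qm, nn)
    print(f"{label}: {factor:.1f}x faster, dE = {delta_e:.2f} kJ/mol")
```

This gives factors of about 44 and 46 for the two substituted molecules. The parent molecule is left out because only bounds were reported for it (>15 minutes for QM, less than a minute for the neural network).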
So do NOT be afraid to test out the new models - there is the potential for substantial time savings.