CIESC Journal, 2025, Vol. 76, Issue 3: 1084-1092. DOI: 10.11949/0438-1157.20240849

• Process system engineering

Lattice energy regression model based on crystal graph convolutional neural networks

Xinyu ZHENG, Zehua REN, Li ZHOU, Shiyang CHAI, Xu JI

  1. School of Chemical Engineering, Sichuan University, Chengdu 610000, Sichuan, China
  • Received: 2024-07-26 Revised: 2024-09-18 Online: 2025-03-28 Published: 2025-03-25
  • Contact: Shiyang CHAI

  • About the author: Xinyu ZHENG (born 2001), female, master's student, xinyuzheng_sc@163.com
  • Supported by:
    National Natural Science Foundation of China (22308228)

Abstract:

Lattice energy is a key physical property that determines the thermodynamic stability of crystals, and it guides the screening of polymorphs for stability. It is usually obtained either by experimental trial and error or by theoretical calculation based on molecular or quantum mechanics. For large numbers of crystal structures, both approaches are time-consuming and laborious. This paper proposes a lattice energy regression model based on density functional theory (DFT) and crystal graph convolutional neural networks (CGCNN). First, lattice energies were calculated using a DFT method with range-separated, self-consistently screened many-body dispersion correction, and a dataset of lattice energies for 248 crystal structures, including acids, alcohols, amides, amino acids, and anhydrides, was established. A CGCNN model was then trained on this dataset to build a quantitative regression model relating crystal structure to lattice energy. The model achieved mean absolute percentage errors (MAPE) of 1.24% on the training set and 5.04% on the test set, with R² values of 0.9978 and 0.9750, respectively. These results indicate good predictive performance, and the model can provide theoretical guidance and technical support for high-throughput screening of stable crystal forms.
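
The abstract reports model quality as MAPE and R² but does not give the formulas. The sketch below uses the standard textbook definitions of both metrics, which are assumed here rather than confirmed by the paper. The function names and the sample lattice energies are hypothetical and chosen only for illustration; they are not data from the study.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned in percent.

    Assumes no reference value is zero (true for lattice energies).
    """
    ratios = [abs((p - t) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lattice energies in kJ/mol (illustrative only, not from the paper).
reference = [-100.0, -80.0, -120.0, -60.0]   # e.g. DFT-computed values
predicted = [-98.0, -84.0, -117.0, -63.0]    # e.g. model predictions

print(f"MAPE = {mape(reference, predicted):.3f}%")
print(f"R2   = {r_squared(reference, predicted):.4f}")
```

Because MAPE divides each error by the reference value, it weights errors relative to the magnitude of each lattice energy, whereas R² measures how much of the variance across the dataset the predictions capture. This is why the two are commonly reported together.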

Key words: lattice energy, polymorphism, density functional theory, neural networks, regression model
