CIESC Journal, 2020, Vol. 71, Issue (10): 4462-4472. DOI: 10.11949/0438-1157.20200814
Luyao TIAN, Zihao WANG, Yang SU, Huaqiang WEN, Weifeng SHEN
Received: 2020-06-22
Revised: 2020-07-24
Online: 2020-10-05
Published: 2020-10-05
Corresponding author: Weifeng SHEN
About the author: Luyao TIAN (born 1996), female, master's student
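Abstract:
Quantitative structure-property relationships (QSPR) are an important theoretical basis for the design and development of solvent molecules. Accurate and reliable predictive models can effectively address problems such as limited property database resources and experiments that are hazardous and consume substantial labor and materials. With the rapid development of artificial intelligence, deep learning has achieved breakthroughs in chemical engineering. Against this background, this paper reviews the theories and methods of classical and intelligent modeling, with a focus on research progress in deep-learning-based intelligent correlation of large-scale data. It further describes the potential and advantages of deep learning in predicting the fundamental physical properties of organic compounds and properties with potential environmental, health, and safety impacts. Finally, from the perspective of solvent design moving toward green, safe, and intelligent development, it discusses the theoretical research directions and application prospects of deep-learning-based QSPR in chemical product development and chemical process design.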
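To make the notion of a structure-based descriptor concrete, the following is a minimal sketch, not taken from the reviewed work, of how a classical topological descriptor can be computed from a molecular graph. It evaluates the Wiener index, defined as the sum of bond-count distances over all pairs of atoms in the hydrogen-suppressed skeleton. The function name `wiener_index` and the hand-written adjacency dictionaries are assumptions introduced only for illustration.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths (in bonds) over all unordered atom pairs.

    `adjacency` maps each atom label to a list of bonded atom labels
    (hydrogen-suppressed, connected graph assumed).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search yields bond-count distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    # Every pair was counted twice, once from each end.
    return total // 2


# Hypothetical hand-written carbon skeletons (illustrative inputs only).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The two isomers share the same formula but give different index values, which is the kind of structural information a classical QSPR model would take as a hand-crafted input feature. Deep learning approaches of the kind reviewed here aim to learn such features from the molecular representation instead.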
Luyao TIAN, Zihao WANG, Yang SU, Huaqiang WEN, Weifeng SHEN. Research advances in deep learning based quantitative structure-property relationship modeling of solvents[J]. CIESC Journal, 2020, 71(10): 4462-4472.
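Method | Research object | Reference |
---|---|---|
Deep belief network (DBN) | Anti-HIV activity | [ |
Recursive neural network (RNN) | Aqueous solubility of drug molecules | [ |
Convolutional neural network (CNN) | Toxicity, activity, and solvation properties | [ |
Long short-term memory-convolutional neural network (LSTM-CNN) | Toxicity and activity of drug molecules | [ |
Table 1 Studies of deep learning based quantitative structure-property relationship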
1 | Stephanopoulos G, Reklaitis G V. Process systems engineering: from solvay to modern bio- and nanotechnology. A history of development, successes and prospects for the future[J]. Chem. Eng. Sci., 2011, 66(19): 4272-4306. |
2 | Sun S, Lü L, Yang A, et al. Extractive distillation: advances in conceptual design, solvent selection, and separation strategies[J]. Chin. J. Chem. Eng., 2019, 27(6): 1247-1256. |
3 | Austin N D, Sahinidis N V, Trahan D W. Computer-aided molecular design: an introduction and review of tools, applications, and solution techniques[J]. Chem. Eng. Res. Des., 2016, 116: 2-26. |
4 | Prat D, Hayler J, Wells A. A survey of solvent selection guides[J]. Green Chem., 2014, 16(10): 4546-4551. |
5 | Zhou T, Song Z, Zhang X, et al. Optimal solvent design for extractive distillation processes: a multiobjective optimization-based hierarchical framework[J]. Ind. Eng. Chem. Res., 2019, 58(15): 5777-5786. |
6 | Su Y, Wang Z, Jin S, et al. An architecture of deep learning in QSPR modeling for the prediction of critical properties using molecular signatures[J]. AIChE J., 2019, 65: e16678. |
7 | Zhou T, Song Z, Sundmacher K. Big data creates new opportunities for materials research: a review on methods and applications of machine learning for materials design[J]. Engineering, 2019, 5(6): 1017-1026. |
8 | Wenzel J, Matter H, Schmidt F. Predictive multitask deep neural network models for ADME-Tox properties: learning from large data sets[J]. J. Chem. Inf. Model., 2019, 59(3): 1253-1268. |
9 | Papadopoulos A, Linke P. Multiobjective molecular design for integrated process-solvent systems synthesis[J]. AIChE J., 2006, 52(3): 1057-1069. |
10 | Roy K, Kar S, Das R N. A Primer on QSAR/QSPR Modeling[M]. Berlin: Springer, 2015: 37-57. |
11 | Shen W, Benyounes H, Song J. A review of ternary azeotropic mixtures advanced separation strategies[J]. Theor. Found. Chem. Eng., 2016, 50(1): 28-40. |
12 | Yang A, Zou H, Chien I L, et al. Optimal design and effective control of triple-column extractive distillation for separating ethyl acetate/ethanol/water with multiazeotrope[J]. Ind. Eng. Chem. Res., 2019, 58(17): 7265-7283. |
13 | Yang A, Shen W, Wei S, et al. Design and control of pressure-swing distillation for separating ternary systems with three binary minimum azeotropes[J]. AIChE J., 2019, 65(4): 1281-1293. |
14 | Katritzky A R, Lobanov V S, Karelson M. QSPR: the correlation and quantitative prediction of chemical and physical properties from structure[J]. Chem. Soc. Rev., 1995, 24(4): 279-287. |
15 | Shen W, Dong L, Li J, et al. Systematic design of an extractive distillation for maximum‐boiling azeotropes with heavy entrainers[J]. AIChE J., 2015, 61(11): 3898-3910. |
16 | Hu Y, Su Y, Jin S, et al. Systematic approach for screening organic and ionic liquid solvents in homogeneous extractive distillation exemplified by the tert-butanol dehydration[J]. Sep. Purif. Technol., 2019, 211: 723-737. |
17 | Wang Z, Su Y, Jin S, et al. A novel unambiguous strategy of molecular feature extraction in machine learning assisted predictive models for environmental properties[J]. Green Chem., 2020, 22(12): 3867-3876. |
18 | Dehmer M, Varmuza K, Bonchev D, et al. Statistical modelling of molecular descriptors in QSAR/QSPR[M]. New Jersey: Wiley-Blackwell, 2012: 3-137. |
19 | Jhamb S, Liang X, Gani R, et al. Estimation of physical properties of amino acids by group-contribution method[J]. Chem. Eng. Sci., 2018, 175: 148-161. |
20 | Gmehling J G, Anderson T F, Prausnitz J M. Solid-liquid equilibria using UNIFAC[J]. Ind. Eng. Chem. Fundam., 1978, 17(4): 269-273. |
21 | Joback K G, Reid R C. Estimation of pure-component properties from group-contributions[J]. Chem. Eng. Commun., 1987, 57(1-6): 233-243. |
22 | Frutiger J, Marcarie C, Abildskov J, et al. Group-contribution based property estimation and uncertainty analysis for flammability-related properties[J]. J. Hazard. Mater., 2016, 318: 783-793. |
23 | Hukkerikar A S, Sarup B, Kate A T, et al. Group-contribution+ (GC+) based estimation of properties of pure components: improved property estimation and uncertainty analysis[J]. Fluid Phase Equilib., 2012, 321: 25-43. |
24 | Marrero J, Gani R. Group-contribution based estimation of pure component properties[J]. Fluid Phase Equilib., 2001, 183/184: 183-208. |
25 | Gani R, Harper P M, Hostrup M. Automatic creation of missing groups through connectivity index for pure-component property prediction[J]. Ind. Eng. Chem. Res., 2005, 44(18): 7262-7269. |
26 | Patel S J, Ng D, Mannan M S. QSPR flash point prediction of solvents using topological indices for application in computer aided molecular design[J]. Ind. Eng. Chem. Res., 2009, 48(15): 7378-7387. |
27 | Herring R H, Namikis R, Chemmangattuvalappil N G, et al. Molecular design using three-dimensional signature descriptors[J]. Comput. Aided Chem. Eng., 2012, 31: 225-229. |
28 | Wiener H. Structural determination of paraffin boiling points[J]. J. Am. Chem. Soc., 1947, 69(1): 17-20. |
29 | Randic M. On characterization of molecular branching[J]. J. Am. Chem. Soc., 1975, 97(23): 6609-6615. |
30 | Ivanciuc O, Balaban T S, Balaban A T. Design of topological indices. Part 4. Reciprocal distance matrix, related local vertex invariants and topological indices[J]. J. Math. Chem., 1993, 12(1): 309-318. |
31 | Sheridan R P, Miller M D. A method for visualizing recurrent topological substructures in sets of active molecules[J]. J. Chem. Inf. Comput. Sci., 1998, 38(5): 915-924. |
32 | Faulon J L, Visco D P, Pophale R S. The signature molecular descriptor. 1. Using extended valence sequences in QSAR and QSPR studies[J]. J. Chem. Inf. Comput. Sci., 2003, 43(3): 707-720. |
33 | Weis D C, Visco D P. Computer-aided molecular design using the signature molecular descriptor: application to solvent selection[J]. Comput. Chem. Eng., 2010, 34(7): 1018-1029. |
34 | Chen J, Visco Jr. D P. Developing an in silico pipeline for faster drug candidate discovery: virtual high throughput screening with the signature molecular descriptor using support vector machine models[J]. Chem. Eng. Sci., 2017, 159: 31-42. |
35 | Bagheri M, Rajabi M, Mirbagheri M, et al. BPSO-MLR and ANFIS based modeling of lower flammability limit[J]. J. Loss Prev. Process Ind., 2012, 25(2): 373-382. |
36 | Pan Y, Jiang J, Wang R, et al. Prediction of the upper flammability limits of organic compounds from molecular structures[J]. Ind. Eng. Chem. Res., 2009, 48(10): 5064-5069. |
37 | Xiong Y, Ding J, Yu D H, et al. Predicting melting point of imidazolium-based ionic liquids using modified group-contribution by mass connectivity index[J]. CIESC Journal, 2011, 62(12): 3316-3322. (in Chinese) |
38 | Suzuki T. Quantitative structure-property relationships for auto‐ignition temperatures of organic compounds[J]. Fire Mater., 1994, 18(2): 81-88. |
39 | Rowley J R, Rowley R L, Wilding W V. Estimation of the lower flammability limit of organic compounds as a function of temperature[J]. J. Hazard. Mater., 2011, 186(1): 551-557. |
40 | Mitchell J B O. Machine learning methods in chemoinformatics[J]. WIRES Comput. Mol. Sci., 2014, 4(5): 468-481. |
41 | Breindl A, Beck B, Clark T, et al. Prediction of the n-octanol/water partition coefficient, logP, using a combination of semiempirical MO-calculations and a neural network[J]. J. Mol. Model., 1997, 3(3): 142-155. |
42 | Pan Y, Jiang J, Wang Z. Quantitative structure-property relationship studies for predicting flash points of alkanes using group bond contribution method with back-propagation neural network[J]. J. Hazard. Mater., 2007, 147(1/2): 424-430. |
43 | Dauphin Y N, de Vries H, Bengio Y. Equilibrated adaptive learning rates for non-convex optimization[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. MIT Press, 2015: 1504-1512. |
44 | Albahri T A, George R S. Artificial neural network investigation of the structural group contribution method for predicting pure components auto ignition temperature[J]. Ind. Eng. Chem. Res., 2003, 42(22): 5708-5714. |
45 | Eslamimanesh A, Gharagheizi F, Mohammadi A H, et al. Artificial neural network modeling of solubility of supercritical carbon dioxide in 24 commonly used ionic liquids[J]. Chem. Eng. Sci., 2011, 66(13): 3039-3044. |
46 | Gharagheizi F, Eslamimanesh A, Mohammadi A H, et al. Artificial neural network modeling of solubilities of 21 commonly used industrial solid compounds in supercritical carbon dioxide[J]. Ind. Eng. Chem. Res., 2010, 50(1): 221-226. |
47 | Pan Y, Jiang J, Wang R, et al. A novel QSPR model for prediction of lower flammability limits of organic compounds based on support vector machine[J]. J. Hazard. Mater., 2009, 168(2/3): 962-969. |
48 | Pan Y, Jiang J, Wang R, et al. Quantitative structure-property relationship studies for predicting flash points of organic compounds using support vector machines[J]. QSAR Comb. Sci., 2008, 27(8): 1013-1019. |
49 | He P, Pan Y, Jiang J. Prediction of the self-accelerating decomposition temperature of organic peroxides based on support vector machine[J]. Procedia Eng., 2018, 211: 215-225. |
50 | Saldana D A, Starck L, Mougin P, et al. On the rational formulation of alternative fuels: melting point and net heat of combustion predictions for fuel compounds using machine learning methods[J]. SAR QSAR Environ. Res., 2013, 24(4): 259-277. |
51 | Liu P, Long W. Current mathematical methods used in QSAR/QSPR studies[J]. Int. J. Mol. Sci., 2009, 10(5): 1978-1998. |
52 | Li W, Yang J C, Huang N. Deep learning in drug design and discovery[J]. Acta Pharm. Sin., 2019, 54(5): 15-21. (in Chinese) |
53 | Gao S Y, Tian S W, Yu L, et al. Prediction of QSAR study of anti-HIV activity based on deep learning[J]. Comput. Eng. Des., 2017, 38(1): 226-230. (in Chinese) |
54 | Lusci A, Pollastri G, Baldi P. Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules[J]. J. Chem. Inf. Model., 2013, 53(7): 1563-1575. |
55 | Goh G B, Siegel C, Vishnu A, et al. Chemception: a deep neural network with minimal chemistry knowledge matches the performance of expert-developed QSAR/QSPR models[J]. arXiv preprint, 2017: 1706.06689. |
56 | Altae-Tran H, Ramsundar B, Pappu A S, et al. Low data drug discovery with one-shot learning[J]. ACS Cent. Sci., 2017, 3(4): 283-293. |
57 | Constantinou L, Gani R. New group contribution method for estimating properties of pure compounds[J]. AIChE J., 1994, 40(10): 1697-1710. |
58 | Wessel M D, Jurs P C. Prediction of normal boiling points for a diverse set of industrially important organic compounds from molecular structure[J]. J. Chem. Inf. Comput. Sci., 1995, 35(5): 841-850. |
59 | Hall L H, Story C T. Boiling point and critical temperature of a heterogeneous data set: QSAR with atom type electro-topological state indices using artificial neural networks[J]. J. Chem. Inf. Comput. Sci., 1996, 36(5): 1004-1014. |
60 | Lim H, Jung Y. Delfos: deep learning model for prediction of solvation free energies in generic organic solvents[J]. Chem. Sci., 2019, 10(36): 8306-8315. |
61 | Paster I, Shacham M, Brauner N. Adjustable QSPRs for prediction of properties of long-chain substances[J]. AIChE J., 2011, 57(2): 423-433. |
62 | Saldana D A, Starck L, Mougin P, et al. Prediction of density and viscosity of biofuel compounds using machine learning methods[J]. Energy Fuels, 2012, 26(4): 2416-2426. |
63 | Gamidi R K, Rasmuson A C. Estimation of melting temperature of molecular cocrystals using artificial neural network model[J]. Cryst. Growth Des., 2016, 17(1): 175-182. |
64 | Machatha S G, Yalkowsky S H. Comparison of the octanol/water partition coefficients calculated by ClogP ACDlogP and KowWin to experimentally determined values[J]. Int. J. Pharm., 2005, 294(1/2): 185-192. |
65 | Chao H P, Lee J F, Chiou C T. Determination of the Henry's law constants of low-volatility compounds via the measured air-phase transfer coefficients[J]. Water Res., 2017, 120: 238-244. |
66 | Wang Z, Man Y, Hu Y, et al. Deep learning based dynamic COD prediction model for urban sewage[J]. Environ. Sci.: Water Res. Technol., 2019, 5: 2210-2218. |
67 | Wang Z, Su Y, Shen W, et al. Predictive deep learning models for environmental properties: the direct calculation of octanol-water partition coefficients from molecular graphs[J]. Green Chem., 2019, 21(16): 4555-4565. |
68 | Zhou T, McBride K, Linke S, et al. Computer-aided solvent selection and design for efficient chemical processes[J]. Curr. Opin. Chem. Eng., 2020, 27: 35-44. |
69 | Gharagheizi F. Prediction of upper flammability limit percent of pure compounds from their molecular structures[J]. J. Hazard. Mater., 2009, 167(1/2/3): 507-510. |
70 | Gharagheizi F. New neural network group contribution model for estimation of lower flammability limit temperature of pure compounds[J]. Ind. Eng. Chem. Res., 2009, 48(15): 7406-7416. |
71 | He F, Jiang J C, Pan Y, et al. Prediction of auto-ignition temperatures for binary liquid mixtures based on electro-topological state indices[J]. CIESC Journal, 2016, 67(7): 3109-3117. (in Chinese) |
72 | Mayr A, Klambauer G, Unterthiner T, et al. DeepTox: toxicity prediction using deep learning[J]. Front. Environ. Sci., 2016, 3: 80. |
73 | Xu Y, Pei J, Lai L. Deep learning based regression and multiclass models for acute oral toxicity prediction with automatic chemical feature extraction[J]. J. Chem. Inf. Model., 2017, 57(11): 2672-2685. |
74 | Fernandez M, Ban F, Woo G, et al. Toxic colors: the use of deep learning for predicting toxicity of compounds merely from their graphic images[J]. J. Chem. Inf. Model., 2018, 58(8): 1533-1543. |
75 | Lazzús J A. Neural network/particle swarm method to predict flammability limits in air of organic compounds[J]. Thermochim. Acta, 2011, 512(1/2): 150-156. |
76 | Yang A, Jin S, Shen W, et al. Investigation of energy-saving azeotropic dividing wall column to achieve cleaner production via heat exchanger network and heat pump technique[J]. J. Cleaner Prod., 2019, 234: 410-422. |
77 | Austin N D, Sahinidis N V, Trahan D W. A COSMO-based approach to computer-aided mixture design[J]. Chem. Eng. Sci., 2017, 159: 93-105. |
78 | Lee Y S, Graham E J, Galindo A, et al. A comparative study of multi-objective optimization methodologies for molecular and process design[J]. Comput. Chem. Eng., 2020, 136: 106802. |
79 | Curzons A D, Constable D C, Cunningham V L. Solvent selection guide: a guide to the integration of environmental, health and safety criteria into the selection of solvents[J]. Clean Prod. Proc., 1999, 1(2): 82-90. |
80 | Su Y, Jin S, Zhang X, et al. Stakeholder-oriented multi-objective process optimization based on an improved genetic algorithm[J]. Comput. Chem. Eng., 2020, 132: 106618. |
81 | Henderson R K, Constable D J C, Alston S R, et al. Expanding GSK's solvent selection guide - embedding sustainability into solvent selection starting at medicinal chemistry[J]. Green Chem., 2011, 13(4): 854-862. |
82 | Prat D, Wells A, Hayler J, et al. CHEM21 selection guide of classical- and less classical-solvents[J]. Green Chem., 2016, 18(1): 288-296. |
83 | Jin Y, Wang H, Chugh T, et al. Data-driven evolutionary optimization: an overview and case studies[J]. IEEE T. Evolut. Comput., 2019, 23(3): 442-458. |
84 | Gebreslassie B H, Diwekar U M. Efficient ant colony optimization for computer aided molecular design: case study solvent selection problem[J]. Comput. Chem. Eng., 2015, 78: 1-9. |
85 | Winter R, Montanari F, Noé F, et al. Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations[J]. Chem. Sci., 2019, 10(6): 1692-1701. |
86 | Gómez-Bombarelli R, Wei J N, Duvenaud D, et al. Automatic chemical design using a data-driven continuous representation of molecules[J]. ACS Cent. Sci., 2018, 4(2): 268-276. |