CIESC Journal, 2021, Vol. 72, Issue (3): 1487-1495. DOI: 10.11949/0438-1157.20201880

• Process Systems Engineering •

Prediction of energy conversion efficiency of organic solar cells based on deep learning

YU Chengyuan, WU Jinkui, ZHOU Li, JI Xu, DAI Yiyang, DANG Yagu

  1. School of Chemical Engineering, Sichuan University, Chengdu 610065, Sichuan, China
  • Received: 2020-12-20 Revised: 2020-12-27 Online: 2021-03-05 Published: 2021-03-05
  • Contact: ZHOU Li
  • About the author: YU Chengyuan (born 1996), male, master's student, 843766990@qq.com
  • Supported by:
    the Fundamental Research Funds for the Central Universities (YJ201838); the National Natural Science Foundation of China (21776183)

Abstract:

A language-like molecular descriptor developed for organic compounds was used to describe 29,000 organic solar cell donor molecules from the Harvard Clean Energy Project Database (CEPDB). Inspired by the similarity between organic chemistry and natural language, each molecule was decomposed into fragments (words) based on nearest-neighbor subgraph theory, and the fragments were arranged into a sequence (a sentence) by a breadth-first search algorithm. After the information of each fragment was embedded into a numerical vector, each molecule could be represented as an information matrix. This matrix, a descriptor called g-FSI, reflects the composition and structure of the molecule. A deep neural network then extracted the embedded information and correlated it with the power conversion efficiency (PCE) of the corresponding material, giving predictions with a coefficient of determination (R²) of 0.97 and a mean square error (MSE) of 0.16. A comparison with existing methods shows that the model is competitive in prediction accuracy. An attention mechanism introduced during modeling identified several molecular fragments that are decisive for the PCE value, which can guide the inverse design of organic photovoltaic materials.
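The fragment, sequence, and matrix steps described in the abstract can be illustrated with a small sketch. It is not the authors' g-FSI implementation: the fragmentation rule (one "word" per atom, built from the atom and its directly bonded neighbors), the toy thiophene-like graph, the function names, and the random embedding table are all assumptions made for illustration. In the paper the embedding is learned together with the network; here fixed random vectors stand in for it.

```python
from collections import deque
import random

# Hypothetical toy molecule: a thiophene-like five-membered ring, hydrogens omitted.
ATOMS = {0: "S", 1: "C", 2: "C", 3: "C", 4: "C"}
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]


def adjacency(atoms, bonds):
    """Build a sorted neighbor list for every atom index."""
    adj = {i: [] for i in atoms}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return {i: sorted(neighbors) for i, neighbors in adj.items()}


def fragment_word(center, atoms, adj):
    """Assumed 'word': center element plus the sorted elements of its nearest neighbors."""
    shell = "".join(sorted(atoms[n] for n in adj[center]))
    return f"{atoms[center]}({shell})"


def bfs_sentence(atoms, bonds, start=0):
    """Order the fragment words by breadth-first search over a connected graph."""
    adj = adjacency(atoms, bonds)
    seen, queue, sentence = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        sentence.append(fragment_word(node, atoms, adj))
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sentence


def embed(sentence, dim=4, seed=0):
    """Map each distinct word to a fixed random vector (placeholder for a learned embedding)."""
    rng = random.Random(seed)
    table = {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for w in sorted(set(sentence))}
    return [table[w] for w in sentence]


sentence = bfs_sentence(ATOMS, BONDS)
matrix = embed(sentence)  # one row per fragment, one column per embedding dimension
print(sentence)
print(len(matrix), "x", len(matrix[0]))
```

Identical fragments share the same row vector, so the matrix encodes composition through the rows and structure through their breadth-first order. A sequence model with attention could then take this matrix as input, which is the role the deep neural network plays in the abstract.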

Key words: organic compounds, solar energy, language-like descriptor, deep learning, prediction, power conversion efficiency

CLC number: