Multi-step predictive soft sensor modeling based on STA-BiLSTM-LightGBM combined model

doi:10.11949/0438-1157.20230458

Abstract:

In complex industrial production processes, multi-step forecasting models for key variables are needed to improve product quality, but traditional soft sensor modeling methods struggle to capture the complex characteristics of industrial data, which leads to inaccurate forecasts. This paper proposes a multi-step predictive soft sensor model that combines a bi-directional long short-term memory network based on a spatial-temporal attention mechanism with a light gradient boosting machine (STA-BiLSTM-LightGBM). First, the STA-BiLSTM is trained: the spatial-temporal attention mechanism assigns weights to the input features along the temporal and spatial dimensions, and the BiLSTM captures the temporal features of the data. Second, the hidden state of the last BiLSTM time step is used to extend the original input data, and LightGBM is then trained on the extended data, with its weak learners fitted iteratively to obtain the optimal model. The predicted outputs of STA-BiLSTM and LightGBM are then combined by a variable-weight sum, with weights given by the error reciprocal method, to obtain the final prediction. Finally, the method is validated by simulation on industrial data sets; the results show that the combined model outperforms both BiLSTM and LightGBM and maintains high prediction accuracy as the number of prediction steps increases.

Key words: soft sensor, prediction, attention mechanism, neural networks, combination model, experimental verification
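
The error reciprocal weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes each model has a single non-negative error value (for example a mean absolute error on recent or similar samples); how the paper actually computes these errors is not stated in the abstract, and the function names and numbers below are hypothetical.

```python
def error_reciprocal_weights(errors, eps=1e-12):
    """Return weights proportional to 1/error, normalized to sum to 1.

    `eps` guards against division by zero and is an assumption of this sketch.
    """
    inverses = [1.0 / max(e, eps) for e in errors]
    total = sum(inverses)
    return [inv / total for inv in inverses]


def combine(predictions, weights):
    """Weighted sum of the individual model predictions for one sample."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical errors of the two models: [STA-BiLSTM, LightGBM].
errors = [0.1, 0.3]
weights = error_reciprocal_weights(errors)

# Hypothetical predictions of the two models for one future step.
combined = combine([10.0, 14.0], weights)
print(weights, combined)
```

The model with the smaller error receives the larger weight, so the combined prediction sits closer to that model's output. Recomputing the errors for each query sample makes the weights variable rather than fixed.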

CLC Number: TP 274

Linqi YAN, Zhenlei WANG. Multi-step predictive soft sensor modeling based on STA-BiLSTM-LightGBM combined model[J]. CIESC Journal, 2023, 74(8): 3407-3418.

Figures/Tables 17

Fig.1 LSTM structure

Fig.2 BiLSTM network structure

Fig.3 Leaf-wise growth strategy

Fig.4 Data flow through the spatial-temporal attention layer

Fig.5 Flow chart of STA-BiLSTM-LightGBM

Fig.6 The overall architecture of the model

Table 1 Parameters of flue gas outlet temperature modeling for ethylene cracking furnace

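Parameter	Value
STA-BiLSTM hidden-layer nodes	12
STA-BiLSTM optimizer	Adam
STA-BiLSTM evaluation metric	MSE
STA-BiLSTM learning rate	0.01
STA-BiLSTM training epochs	300
LightGBM objective function	Poisson
LightGBM evaluation metrics	RMSE and Poisson
LightGBM learning rate	0.1
LightGBM number of leaves	15
LightGBM maximum depth	8
Number of similar samples	5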

Table 2 Evaluation metrics of 5-step-ahead flue gas outlet temperature prediction

Model	MAE	RMSE	MAPE	R²
BiLSTM	0.2150	0.2420	0.1870	0.627
STA-BiLSTM	0.0780	0.0960	0.0670	0.941
LightGBM	0.1570	0.1830	0.1370	0.787
Proposed model	0.0630	0.0800	0.0550	0.958
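
For readers who want to reproduce metrics of this kind, the following is a minimal sketch of the four measures using their standard textbook definitions. Whether the paper reports MAPE as a fraction or as a percentage is not stated here, so the percentage scaling below is an assumption, as are the function name and the toy data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, MAPE (in percent) and R^2 for two equal-length lists.

    Assumes no true value is zero (needed for MAPE) and that y_true is not
    constant (needed for R^2).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Toy data, not taken from the paper.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 2.0, 3.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Lower MAE, RMSE and MAPE and a higher R² indicate a better fit, which is the direction in which the table compares the models.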

Fig.7 Prediction error curves of BiLSTM and STA-BiLSTM

Fig.8 5 steps ahead prediction curve

Fig.9 Scatter plots of 5 steps ahead prediction

Fig.10 Bar chart of 1 to 5 steps ahead prediction evaluation metrics for flue gas outlet temperature

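Table 3 Modeling parameters for C4 concentration

Parameter	Value
STA-BiLSTM hidden-layer nodes	8
STA-BiLSTM optimizer	Adam
STA-BiLSTM evaluation metric	MSE
STA-BiLSTM learning rate	0.01
STA-BiLSTM training epochs	100
LightGBM objective function	regression_l2
LightGBM evaluation metric	MAPE
LightGBM learning rate	0.1
LightGBM number of leaves	8
LightGBM maximum depth	5
Number of similar samples	5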
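
As a rough illustration, the LightGBM rows of Tables 1 and 3 could be written as parameter dictionaries of the kind accepted by `lightgbm.train(params, train_set)`. The keys are standard LightGBM parameter names, but the mapping from table rows to keys is an assumption of this sketch, and the helper `leaves_fit_depth` is hypothetical.

```python
# Table 1: flue gas outlet temperature of the ethylene cracking furnace.
flue_gas_params = {
    "objective": "poisson",
    "metric": ["rmse", "poisson"],
    "learning_rate": 0.1,
    "num_leaves": 15,
    "max_depth": 8,
}

# Table 3: C4 concentration.
c4_params = {
    "objective": "regression_l2",
    "metric": "mape",
    "learning_rate": 0.1,
    "num_leaves": 8,
    "max_depth": 5,
}


def leaves_fit_depth(params):
    """Check that num_leaves does not exceed what a tree of max_depth can hold.

    A binary tree of depth d has at most 2**d leaves, so larger num_leaves
    values could never be reached under the depth limit.
    """
    return params["num_leaves"] <= 2 ** params["max_depth"]


for name, params in [("flue gas", flue_gas_params), ("C4", c4_params)]:
    print(name, "num_leaves within depth limit:", leaves_fit_depth(params))
```

Both settings keep `num_leaves` well below the depth-imposed ceiling, so with leaf-wise growth the leaf count, rather than the depth, is the binding limit on tree size.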

Table 4 Evaluation metrics of 6-step-ahead C4 concentration prediction

Model	MAE	RMSE	MAPE	R²
BiLSTM	0.0498	0.0606	70.48	0.890
STA-BiLSTM	0.0449	0.0563	60.22	0.905
LightGBM	0.0433	0.0557	58.64	0.907
Proposed model	0.0402	0.0504	56.57	0.924

Fig.11 Comparison of prediction error between LightGBM with and without feature extension

Fig.12 MAPE comparison of fixed-weight and variable-weight combinations

Fig.13 Bar chart of 1 to 6 steps ahead prediction evaluation metrics for C4 concentration
