CIESC Journal, 2023, Vol. 74, Issue (10): 4208-4217. DOI: 10.11949/0438-1157.20230858
• Process system engineering •
Zhongyi ZHANG1, Lei ZHANG1, Yu WANG2, Yachao DONG1, Jin TAO2, Yi LI2, Yi TONG2, Yu ZHUANG1, Linlin LIU1, Jian DU1
Received: 2023-08-18
Revised: 2023-10-17
Online: 2023-12-22
Published: 2023-10-25
Contact: Yu ZHUANG, Jian DU
Author biography: Zhongyi ZHANG (b. 2000), male, master's student, dgzzy@mail.dlut.edu.cn
Zhongyi ZHANG, Lei ZHANG, Yu WANG, Yachao DONG, Jin TAO, Yi LI, Yi TONG, Yu ZHUANG, Linlin LIU, Jian DU. Data mining-based screening of key points for corn starch and sugar production process[J]. CIESC Journal, 2023, 74(10): 4208-4217.
Model | Hyperparameters | Range |
---|---|---|
RF | Dmax, En, Lmin | Dmax(8,12,step=1) En(600,1200,step=200) |
XGBoost | Dmax, En, Lmin, lr | Lmin(4,6,8,10) lr(0.01,0.001,0.0001) |
ANN | Nlayer, Nhidden, Drop | Nlayer(3,4,5,6) Nhidden(10~500) Drop(0.1~0.5) |
Table 1 Hyperparameter sets and ranges
Model | Training set R² | Test set R² |
---|---|---|
RF | 0.95 | 0.90 |
XGBoost | 0.99 | 0.94 |
ANN | 0.98 | 0.97 |
Table 2 Training results of the models
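Measuring point tag | Name | Correlation | Total SHAP value |
---|---|---|---|
TI1110_1 | Inlet temperature of the second-jet holding tube | Positive | 2.00 |
LIA_1405 | Sulfurous acid storage tank level | Positive | 1.30 |
HV_1504_2 | Primary germ separation feed manual valve | Positive | 1.00 |
15.71 total starch D.S% | Fat content of starch | Positive | 0.90 |
PIA_1901_1 | Dryer inlet air pressure | Positive | 0.86 |
2# moisture % | Moisture in storage tank | Negative | 0.85 |
14.78 dry matter content % | Dry matter content of solution | Positive | 0.84 |
LIA_1401_1 | Corn steeping tank level | Positive | 0.79 |
LIC_101A-PV | Level gauge of evaporation unit 1 | Negative | 0.77 |
TC2101.MV | Flash tank temperature control | Negative | 0.77 |
PI1111 | Secondary flash tank pressure | Negative | 0.76 |
16.75 (Be°) | Baumé degree of solution | Negative | 0.75 |
TI_101B | Inlet steam temperature of evaporation unit 1 | Negative | 0.75 |
TICA_1403 | Steeping tank heat-exchange temperature | Negative | 0.75 |
TI2103_1 | Saccharification tank temperature | Positive | 0.70 |
15.46 free starch D.S% | Free starch content in starch | Positive | 0.65 |
CIA_1520 | Fine mill current | Negative | 0.58 |
SIC1124.MV | Liquefying enzyme frequency control | Negative | 0.53 |
Table 3 Correlation between measuring point names and predictive variables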