Multi-objective optimization of papermaking wastewater based on multi-agent reinforcement learning

doi:10.11949/0438-1157.20241058

Abstract


The papermaking wastewater treatment process is susceptible to uncertain factors such as switching of production process conditions and raw material heterogeneity. With the industry pursuing the coordinated reduction of pollution and carbon emissions, ensuring that treated effluent meets water quality discharge standards while simultaneously reducing treatment cost, energy consumption, and greenhouse gas emissions is an important issue constraining the industry's development. To address the dynamic uncertainty of papermaking wastewater treatment, this paper proposes a multi-objective wastewater optimization method based on multi-agent reinforcement learning that combines the Kriging method with high dimensional model representation (HDMR). Benchmark Simulation Model No. 1 (BSM1) was used to simulate the biochemical and sedimentation processes of papermaking wastewater treatment. Based on the biochemical metabolism mechanism and data fusion, a Kriging-HDMR surrogate model was established for real-time calculation of greenhouse gas emissions from the treatment process. By integrating the surrogate model into reinforcement learning, a "solving-decision-observation" multi-agent system for dynamic optimization of the wastewater treatment process was established, yielding a coordinated multi-objective optimization model for pollution and carbon reduction. Results for the study scenario show that, compared with the open-loop system, the dynamic optimization system reduces operating cost by 4.10%, energy consumption by 22.10%, and greenhouse gas emissions by 10.30%, and that it can obtain and maintain an effective multi-objective dynamic optimization control strategy.

Key words: wastewater, optimization, multi-objective, greenhouse gases, operational cost, energy consumption, deep reinforcement learning
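
The abstract names a Kriging-HDMR surrogate but does not give its formulation. As a minimal sketch only, the Python below shows one common way to combine the two ideas: a first-order cut-HDMR expansion whose one-dimensional component functions are each interpolated by simple Kriging with a fixed Gaussian correlation. Everything specific here is an assumption for illustration and not the authors' implementation: the class names `Kriging1D` and `KrigingHDMR`, the length scale, the nugget, the number of samples per variable, the unit-interval input scaling, and the hypothetical `toy_emission` function that stands in for a BSM1-based emission calculation.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Gaussian correlation between two sets of 1-D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


class Kriging1D:
    """Zero-mean simple Kriging interpolator for a single input variable."""

    def __init__(self, x, y, length_scale=0.25, nugget=1e-10):
        self.x = np.asarray(x, dtype=float)
        self.length_scale = length_scale
        K = rbf_kernel(self.x, self.x, length_scale)
        K += nugget * np.eye(len(self.x))  # small jitter for numerical stability
        self.weights = np.linalg.solve(K, np.asarray(y, dtype=float))

    def predict(self, x_new):
        x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
        return rbf_kernel(x_new, self.x, self.length_scale) @ self.weights


class KrigingHDMR:
    """First-order cut-HDMR: f(x) ~ f(c) + sum_i [f(c with x_i moved) - f(c)]."""

    def __init__(self, func, lower, upper, n_samples=9):
        lower = np.asarray(lower, dtype=float)
        upper = np.asarray(upper, dtype=float)
        self.center = 0.5 * (lower + upper)  # cut point c
        self.f0 = func(self.center)          # zeroth-order term f(c)
        self.components = []
        for i in range(len(lower)):
            grid = np.linspace(lower[i], upper[i], n_samples)
            values = []
            for g in grid:
                x = self.center.copy()
                x[i] = g                     # vary only variable i along its cut line
                values.append(func(x) - self.f0)
            self.components.append(Kriging1D(grid, values))

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        total = self.f0
        for i, model in enumerate(self.components):
            total = total + model.predict(x[i])[0]
        return float(total)


def toy_emission(x):
    """Hypothetical additive test function; NOT a wastewater model."""
    return 2.0 + np.sin(3.0 * x[0]) + x[1] ** 2


if __name__ == "__main__":
    surrogate = KrigingHDMR(toy_emission, lower=[0.0, 0.0], upper=[1.0, 1.0])
    point = np.array([0.3, 0.8])
    print("surrogate:", surrogate.predict(point), "reference:", toy_emission(point))
```

Because the toy function is additive, the first-order expansion has no truncation error and the remaining discrepancy comes only from the one-dimensional Kriging fits. A function with strong interactions between inputs would need higher-order HDMR terms, which this sketch omits.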


CLC Number:

X 793

Zhenglei HE, Dingding HU. Multi-objective optimization of papermaking wastewater based on multi-agent reinforcement learning[J]. CIESC Journal, 2025, 76(4): 1617-1634.


Figures/Tables 18

Fig.1 System structure diagram of BSM1

Fig.2 Problem description of multi-objective optimization for PWTP

Fig.3 Box plots of Kriging-HDMR model test results in terms of R², RAAE, and RMAE

Fig.4 Results of three sensitivity analysis methods

Fig.5 Heat map of Pearson correlation results

Fig.6 DQN algorithm structure

Fig.7 PPO algorithm structure

Fig.8 DDPG algorithm structure

Fig.9 Kriging-HDMR MARL model

Fig.10 Inflow data of papermaking wastewater treatment process

Fig.11 Reward training tracking graphs for different agents

Fig.12 Actions of three agents

Fig.13 Comparison of water quality of three systems

Fig.14 Comparison of operating costs of three systems

Fig.15 Comparison of energy consumption of three systems

Fig.16 Comparison of total GHG emissions of three systems

Fig.17 Comparison of direct GHG emissions from three systems

Fig.18 Comparison of indirect GHG emissions from three systems

