CIESC Journal (化工学报)

Prediction of biomass pyrolysis product yield based on machine learning algorithms

Xiaoyu LIU1,2, Qichen LI1,2, Lili HUO1,2, Lixin ZHAO1,2, Zonglu YAO1,2, Peihao SUN1,2, Jixiu JIA1,2, Benhai ZHU1,2

  1. Institute of Agricultural Environment and Sustainable Development, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  2. Key Laboratory of Green and Low-carbon Agriculture in North China Plain, Ministry of Agriculture and Rural Affairs, Beijing 100081, China
  • Received: 2025-09-23 Revised: 2026-01-05 Online: 2026-01-06
  • Corresponding author: Lili HUO
  • About the author: Xiaoyu LIU (born 2000), male, master's student, lxy1635153121@163.com
  • Funding: National Key Research and Development Program of China (2022YFD2002102); Fundamental Research Funds for Central Public-interest Scientific Research Institutes (CAAS-ZDRW202417)

Abstract:

To enable targeted regulation of biomass pyrolysis products, a machine learning model for yield prediction and process optimization was developed. A dataset of 578 valid samples was compiled from literature and experimental data, then preprocessed and feature-screened using boxplots and a clustering algorithm. Bayesian optimization was used to tune four base learners, including random forest and gradient boosting decision tree models, and a stacking ensemble strategy was used to build a high-accuracy meta-model for yield prediction. The model achieved coefficients of determination (R²) of 0.92, 0.90, and 0.94 for biochar, tar, and pyrolysis gas yields, respectively. Feature importance analysis combining SHAP and Gini importance indicated that biochar yield was mainly influenced by pyrolysis temperature and carbon content, tar yield by lignin and hemicellulose contents, and pyrolysis gas yield by hemicellulose content and pyrolysis temperature. Partial dependence plot (PDP) analysis further identified the operating windows that favor each product, expressed in terms of pyrolysis temperature (T), heating rate (HR), and residence time (RT): low temperature with slow heating (450℃ < T < 550℃, HR < 20℃/min) favored biochar formation; moderate temperature with short residence time (600℃ < T < 700℃, 40 min < RT < 50 min) promoted tar enrichment; and high temperature with a moderate heating rate (T > 750℃, 15℃/min < HR < 25℃/min) maximized pyrolysis gas yield. The model provides an efficient predictive tool and a theoretical basis for the intelligent control and industrial application of biomass pyrolysis.

Key words: biomass, biochar, pyrolysis gas, tar, algorithm, prediction, machine learning
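
The abstract mentions boxplot-based preprocessing and reports model quality as R². As a rough illustration only, the sketch below shows one common way these two steps are done: Tukey's boxplot fences with a whisker factor of 1.5, and the usual definition of the coefficient of determination. The whisker factor, the function names, the `char_yield` field, and the sample values are all assumptions made for this example; none of them come from the paper, which does not state its exact preprocessing rule here.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) boxplot whisker limits for a list of numbers.

    k=1.5 is the conventional boxplot factor, assumed here for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, key, k=1.5):
    """Keep only the rows whose `key` value lies inside the boxplot fences."""
    lower, upper = tukey_fences([row[key] for row in rows], k)
    return [row for row in rows if lower <= row[key] <= upper]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical biochar yields in wt%; the last value is an invented outlier.
samples = [{"char_yield": y} for y in (30.0, 32.0, 31.0, 29.0, 33.0, 95.0)]
cleaned = drop_outliers(samples, "char_yield")
```

In a workflow like the one described, a filter of this kind would be applied per feature before model training, and `r_squared` would be evaluated on held-out predictions for each product yield.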

CLC number: