CIESC Journal, 2025, Vol. 76, Issue (S1): 336-342. DOI: 10.11949/0438-1157.20241331

• Energy and environmental engineering •

Application of random forest algorithms to quantify feature importance in ultra-high temperature heat pump

Junzhuo WEI1, Di WU1,2, Ruzhu WANG1

  1. Institute of Refrigeration and Cryogenics, Shanghai Jiao Tong University, Shanghai 200240, China
  2. Shanghai Nuo Tong New Energy Technology Co., Ltd., Shanghai 200241, China
  • Received: 2024-11-21 Revised: 2024-12-26 Online: 2025-06-26 Published: 2025-06-25
  • Contact: Di WU

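Random forest-based method for quantifying feature importance in ultra-high temperature heat pump systems

Junzhuo WEI (危俊卓)1, Di WU (吴迪)1,2, Ruzhu WANG (王如竹)1

  1. Institute of Refrigeration and Cryogenics, Shanghai Jiao Tong University, Shanghai 200240, China
  2. Shanghai Nuo Tong New Energy Technology Co., Ltd., Shanghai 200241, China
  • Corresponding author: Di WU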
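  • About the author: Junzhuo WEI (born 2002), male, master's student, wei_jz@sjtu.edu.cn
  • Funding:
    National Natural Science Foundation of China (52306019); Shanghai Rising-Star Program (24QB2704400); China Postdoctoral Science Foundation (BX2021175)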

Abstract:

In the context of decarbonizing industrial heat demand, ultra-high temperature heat pumps, which act as active thermal energy recovery systems, are emerging as pivotal technologies for energy conservation and emission reduction. These systems convert low-grade waste heat into high-grade thermal energy with minimal electrical energy consumption. Increasing the number of heat exchangers and compressors and refining their layout has been shown to improve system performance, but these measures also make the system more complex and pose additional hurdles for analysis and optimization. Feature importance-based variable selection offers an effective way to reduce data dimensionality and quickly identify the crucial system components. However, traditional correlation analysis methods often fail to produce consistent results when data are missing. To overcome this limitation, this study introduces a new method for quantifying feature importance based on the random forest model. The results show that the random forest approach generalizes better across 100 datasets containing missing data: the variance of its feature importance quantification is 0.11505, lower than the 0.17055 obtained with the correlation coefficient method. The results also indicate that coupling temperature is the primary determinant of system performance, which identifies a key target for further optimization of the system design. In addition, the influence of output temperature on system efficiency is found to be less than 5%, suggesting that the system has low sensitivity to variations in output temperature and underscoring its potential for ultra-high temperature applications.
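
The comparison described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes synthetic data with three generic features, missing data handled by randomly removing rows, a simplified random forest that only accumulates each feature's squared-error reduction (no trees are kept for prediction), and "variance" taken as the summed per-feature variance of the normalized importance across datasets. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def best_split(rows, y, idx, features):
    """Return (gain, feature, threshold) of the split that most reduces squared error."""
    n = len(idx)
    total = sum(y[i] for i in idx)
    total_sq = sum(y[i] ** 2 for i in idx)
    parent_sse = total_sq - total ** 2 / n
    best = (0.0, None, None)
    for f in features:
        order = sorted(idx, key=lambda i: rows[i][f])
        left_sum = left_sq = 0.0
        for k in range(1, n):
            i = order[k - 1]
            left_sum += y[i]
            left_sq += y[i] ** 2
            if rows[order[k]][f] == rows[i][f]:
                continue  # no threshold exists between equal values
            left_sse = left_sq - left_sum ** 2 / k
            right_sse = (total_sq - left_sq) - (total - left_sum) ** 2 / (n - k)
            gain = parent_sse - left_sse - right_sse
            if gain > best[0]:
                best = (gain, f, (rows[i][f] + rows[order[k]][f]) / 2)
    return best


def grow(rows, y, idx, depth, importance, rng, max_depth=4, min_node=10, n_try=2):
    """Recursively split a node and add each split's error reduction to its feature."""
    if depth >= max_depth or len(idx) < min_node:
        return
    p = len(rows[0])
    features = rng.sample(range(p), min(n_try, p))  # random feature subset per node
    gain, f, thr = best_split(rows, y, idx, features)
    if f is None:
        return
    importance[f] += gain
    left = [i for i in idx if rows[i][f] <= thr]
    right = [i for i in idx if rows[i][f] > thr]
    grow(rows, y, left, depth + 1, importance, rng, max_depth, min_node, n_try)
    grow(rows, y, right, depth + 1, importance, rng, max_depth, min_node, n_try)


def forest_importance(rows, y, n_trees=10, seed=0):
    """Normalized impurity-based importance summed over bootstrapped trees."""
    rng = random.Random(seed)
    n = len(rows)
    importance = [0.0] * len(rows[0])
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of row indices
        grow(rows, y, boot, 0, importance, rng)
    total = sum(importance) or 1.0
    return [v / total for v in importance]


def correlation_importance(rows, y):
    """Baseline: normalized absolute Pearson correlation of each feature with y."""
    def pearson(a, b):
        ma, mb = statistics.fmean(a), statistics.fmean(b)
        cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
        return cov / math.sqrt(sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b))
    scores = [abs(pearson([r[f] for r in rows], y)) for f in range(len(rows[0]))]
    total = sum(scores)
    return [s / total for s in scores]


def spread_across_datasets(importance_fn, rows, y, n_datasets=100, missing=0.2, seed=1):
    """Mean importance and summed per-feature variance over datasets with rows removed."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_datasets):
        keep = [i for i in range(len(rows)) if rng.random() > missing]
        results.append(importance_fn([rows[i] for i in keep], [y[i] for i in keep]))
    columns = list(zip(*results))
    mean = [statistics.fmean(col) for col in columns]
    return mean, sum(statistics.pvariance(col) for col in columns)


# Synthetic stand-in data: feature 0 dominates, feature 2 is irrelevant.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(80)]
y = [3.0 * r[0] + 0.5 * r[1] + data_rng.gauss(0, 0.1) for r in rows]
for name, fn in [("random forest", forest_importance), ("correlation", correlation_importance)]:
    mean, spread = spread_across_datasets(fn, rows, y, n_datasets=20)
    print(name, [round(m, 3) for m in mean], round(spread, 6))
```

The demo uses 20 perturbed datasets to keep the runtime short; setting `n_datasets=100` mirrors the number of datasets mentioned in the abstract. The printed values depend entirely on the synthetic data and say nothing about the heat pump system itself.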

Key words: ultra-high temperature heat pump, random forest, variable selection

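Abstract (translated from Chinese):

Amid the trend toward decarbonizing industrial heat demand, ultra-high temperature heat pump systems are gradually becoming a core technology for saving energy and reducing consumption. Although increasing the number of heat exchangers and compressors and optimizing their layout has been shown to improve system performance effectively, it also increases the complexity of the system variables and creates new difficulties for system analysis and optimization. Variable selection techniques based on feature importance have emerged to reduce data dimensionality and quickly identify the key components of the system. However, traditional correlation analysis methods often struggle to produce stable and reliable results when data are missing. To overcome this limitation, a new feature importance quantification method based on the random forest model is adopted. The analysis shows that, across 100 different datasets containing missing data, the random forest method exhibits stronger generalization performance: the variance of its quantified feature importance is 0.11505, lower than the 0.17055 obtained with the correlation coefficient method. In addition, the quantitative analysis of feature importance further reveals that coupling temperature is the core factor affecting system performance, a finding that points to the key path for subsequent optimization of the system design.

Key words (translated from Chinese): ultra-high temperature heat pump, random forest, variable selection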
