CIESC Journal, 2024, Vol. 75, Issue (7): 2613-2623. DOI: 10.11949/0438-1157.20240322

• Process system engineering

A semi-supervised soft sensor modeling method based on the Tri-training GPR

Junxia MA1,2, Lintao LI2, Weili XIONG1,2

  1. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, Jiangsu, China
  2. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
  • Received: 2024-03-20 Revised: 2024-05-06 Online: 2024-08-09 Published: 2024-07-25
  • Contact: Weili XIONG

  • About the author: Junxia MA (1984-), female, Ph.D., associate professor, jxma@jiangnan.edu.cn
  • Funding: National Natural Science Foundation of China (61803183)

Abstract:

By building and combining multiple learners, ensemble learning often generalizes significantly better than a single learner. However, building a high-performance ensemble soft sensor model remains a challenge when only a small proportion of the data is labeled. To address this problem, this paper proposes a soft sensor modeling method based on semi-supervised ensemble learning: the Tri-training Gaussian process regression (Tri-training GPR) model. The modeling strategy exploits the advantages of semi-supervised learning to reduce the demand for labeled samples in the modeling process. Even at a low labeling rate, the labeled data set available for modeling can be expanded by screening the unlabeled data. Furthermore, a new approach to selecting high-confidence samples is proposed that combines the advantages of semi-supervised learning and ensemble learning. The proposed method was applied to a penicillin fermentation process and a debutanizer column process to build soft sensor models for predicting penicillin and butane concentrations. Compared with traditional modeling methods, the proposed method gave better prediction results, which verifies the effectiveness of the model.

Key words: soft sensor, ensemble learning, semi-supervised learning, Tri-training, Gaussian process regression, process control, kinetic modeling, chemical processes
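
The abstract describes the idea only at a high level: three GPR learners screen the unlabeled data, and high-confidence samples are added to the labeled set. The following is a minimal sketch of that general idea, not the authors' algorithm. The abstract does not state the confidence criterion, so the sketch assumes one: a sample is pseudo-labeled for one learner when the other two learners agree within `agree_tol` and both report a predictive variance below `var_tol`. The helper names (`TinyGPR`, `tri_training_gpr`, `ensemble_predict`), the fixed kernel hyperparameters, and both thresholds are hypothetical, and inputs are scalars for brevity.

```python
import math
import random


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


class TinyGPR:
    """Zero-mean GP regressor with fixed hyperparameters (hypothetical helper)."""

    def __init__(self, length=1.0, noise=1e-2):
        self.length, self.noise = length, noise

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        K = [[rbf(a, b, self.length) + (self.noise if i == j else 0.0)
              for j, b in enumerate(self.X)] for i, a in enumerate(self.X)]
        self.K_inv = invert(K)
        self.alpha = [sum(r * t for r, t in zip(row, self.y)) for row in self.K_inv]
        return self

    def predict(self, x):
        """Return the posterior mean and latent variance at scalar input x."""
        k = [rbf(x, a, self.length) for a in self.X]
        mean = sum(ki * ai for ki, ai in zip(k, self.alpha))
        Kk = [sum(r * ki for r, ki in zip(row, k)) for row in self.K_inv]
        var = max(1.0 - sum(ki * vi for ki, vi in zip(k, Kk)), 0.0)
        return mean, var


def tri_training_gpr(X_lab, y_lab, X_unl, rounds=3, agree_tol=0.1, var_tol=0.05, seed=0):
    """Tri-training loop with an ASSUMED confidence rule (agreement + low variance)."""
    rng = random.Random(seed)
    n = len(X_lab)
    sets = []
    for _ in range(3):  # each learner starts from its own bootstrap sample
        idx = [rng.randrange(n) for _ in range(n)]
        sets.append(([X_lab[i] for i in idx], [y_lab[i] for i in idx]))
    models = [TinyGPR().fit(X, y) for X, y in sets]
    pool = list(X_unl)
    for _ in range(rounds):
        added = [[], [], []]
        for x in pool:
            preds = [m.predict(x) for m in models]
            for i in range(3):
                (m1, v1), (m2, v2) = [preds[j] for j in range(3) if j != i]
                # The two peer learners label x for learner i if they are confident.
                if abs(m1 - m2) < agree_tol and max(v1, v2) < var_tol:
                    added[i].append((x, 0.5 * (m1 + m2)))
        if not any(added):
            break
        used = set()
        for i in range(3):
            for x, y in added[i]:
                sets[i][0].append(x)
                sets[i][1].append(y)
                used.add(x)
            models[i].fit(*sets[i])
        pool = [x for x in pool if x not in used]  # simplification: labels are kept
    return models


def ensemble_predict(models, x):
    """Average the means and variances of the three learners."""
    preds = [m.predict(x) for m in models]
    return sum(p[0] for p in preds) / 3, sum(p[1] for p in preds) / 3


if __name__ == "__main__":
    rng = random.Random(1)
    X_lab = [0.75 * i for i in range(9)]  # 9 labeled inputs on [0, 6]
    y_lab = [math.sin(x) for x in X_lab]
    X_unl = [rng.uniform(0.0, 6.0) for _ in range(20)]
    models = tri_training_gpr(X_lab, y_lab, X_unl)
    print([len(m.X) for m in models], ensemble_predict(models, 2.0))
```

A practical implementation would presumably optimize the kernel hyperparameters, handle multivariate process inputs, and use the paper's own sample-selection rule in place of the thresholds assumed here.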
