A novel mega-trend-diffusion for small sample

CIESC Journal, 2016, Vol. 67, Issue (3): 820-826. DOI: 10.11949/j.issn.0438-1157.20151921

ZHU Bao¹, CHEN Zhongsheng², YU Le'an¹

1. School of Economics and Management Science, Beijing University of Chemical Technology, Beijing 100029, China;
2. College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China

Received: 2015-12-17 Revised: 2016-01-06 Online: 2016-01-12 Published: 2016-03-05
Contact: YU Le'an
Supported by: the National Natural Science Foundation of China (71433001).

Abstract:

Data-driven process modeling, optimization, and control attract attention from both academia and industry, in research and in application. Even in the Big Data era, small-sample problems cannot be ignored. Traditional modeling methods such as artificial neural networks (ANNs) and extreme learning machines (ELMs) struggle to achieve high learning accuracy on small sample sets. To address this difficulty, a novel multi-distribution mega-trend-diffusion (MD-MTD) technique is proposed to improve learning accuracy on small sample sets. Mega-trend-diffusion (MTD) is used to estimate the acceptable range of the attributes of the small sample. On top of MTD, uniform and triangular distributions are added to describe the data characteristics; these are used to generate virtual samples that fill the information gaps between observations in the small sample. A benchmark function is used to generate benchmark samples under an orthogonal test and an inhomogeneous sample test to verify the reasonableness and effectiveness of MD-MTD, and two real-world industrial datasets, MLCC and PTA, are used to further confirm its practicability. The validation results show that the proposed MD-MTD improves learning accuracy by more than 8% for small samples.

Key words: small-sample-set, mega-trend-diffusion, virtual sample, orthogonal test
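
The range-estimation and virtual-sample steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bound formula follows the commonly cited MTD formulation, and the way the uniform and triangular draws are combined (alternating, with the triangular mode at the range center) is an assumption, since the abstract does not give these details. The function names `mtd_bounds` and `virtual_values` are hypothetical, and the sketch handles a single attribute only.

```python
import math
import random
import statistics


def mtd_bounds(values, eps=1e-20):
    """Estimate an acceptable range (lower, upper) for one attribute.

    Assumption: uses the commonly cited MTD bound formula; the paper's
    exact variant may differ. Requires at least two observations.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: no spread to diffuse.
        return lo, hi
    center = (lo + hi) / 2.0
    n_low = sum(1 for v in values if v < center)
    n_high = sum(1 for v in values if v > center)
    # Skewness weights: share of observations on each side of the center.
    skew_low = n_low / (n_low + n_high)
    skew_high = n_high / (n_low + n_high)
    spread = -2.0 * statistics.variance(values) * math.log(eps)
    lower = center - skew_low * math.sqrt(spread / n_low)
    upper = center + skew_high * math.sqrt(spread / n_high)
    # The estimated range must at least cover the observed data.
    return min(lower, lo), max(upper, hi)


def virtual_values(values, n, rng=None):
    """Draw n virtual attribute values.

    Assumption: even draws are uniform on the observed range, odd draws
    are triangular on the diffused range with the mode at the center.
    """
    rng = rng or random.Random(0)
    lo, hi = min(values), max(values)
    lower, upper = mtd_bounds(values)
    center = (lo + hi) / 2.0
    out = []
    for i in range(n):
        if i % 2 == 0:
            out.append(rng.uniform(lo, hi))
        else:
            out.append(rng.triangular(lower, upper, center))
    return out
```

For example, `mtd_bounds([1.0, 2.0, 4.0, 7.0])` returns a range that is at least as wide as the observed interval from 1.0 to 7.0, and `virtual_values([1.0, 2.0, 4.0, 7.0], 10)` returns ten values inside that range.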

CLC Number: TP181

ZHU Bao, CHEN Zhongsheng, YU Le'an. A novel mega-trend-diffusion for small sample[J]. CIESC Journal, 2016, 67(3): 820-826.
