化工学报 (Huagong Xuebao) ›› 2007, Vol. 58 ›› Issue (9): 2341-2346.

• Biochemical engineering and technology •

OSC-KPCA metabolomics pattern analysis technique and its application

SONG Kai, LI Xia, YUAN Yingjin

  1. School of Chemical Engineering and Technology, Tianjin University
  • Publication date: 2007-09-05 Release date: 2007-09-05

OSC-KPCA based metabolomics pattern analysis for Arabidopsis thaliana genotype discrimination

SONG Kai, LI Xia, YUAN Yingjin

  • Online:2007-09-05 Published:2007-09-05

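Abstract (Chinese version): Kernel principal component analysis (KPCA), with its strong nonlinear recognition ability and its particular advantages in handling high-dimensional, small-sample data, was used for pattern analysis of samples from four genotypes of the model plant Arabidopsis thaliana (the two genotypes Co10 and C24 and their hybrid progeny Co10×C24 and C24×Co10). The results show that, after the raw data were filtered with orthogonal signal correction (OSC), KPCA with a Sigmoid kernel function reached 100% for both correct classification and prediction of samples from the four genotypes. This demonstrates that the OSC-KPCA method can effectively extract metabolomic information and reveal the intrinsic relationship between metabolic phenotype and genotype in a biological system, laying a foundation for further research in metabolomics and systems biology.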
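The KPCA step described above can be illustrated with a minimal NumPy sketch. It assumes the common Sigmoid kernel form k(a, b) = tanh(gamma * <a, b> + coef0); the abstract does not give the kernel parameters, so `gamma`, `coef0`, the default values, and the function names `kpca_fit` and `kpca_transform` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid_kernel(A, B, gamma, coef0):
    """Sigmoid kernel k(a, b) = tanh(gamma * <a, b> + coef0) (assumed form)."""
    return np.tanh(gamma * (A @ B.T) + coef0)

def kpca_fit(X, n_components=2, gamma=None, coef0=0.0):
    """Fit kernel PCA on X (samples x variables). Parameters are hypothetical."""
    n, d = X.shape
    gamma = 1.0 / d if gamma is None else gamma  # assumed default, not from the paper
    K = sigmoid_kernel(X, X, gamma, coef0)
    # Center the kernel matrix in feature space.
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J
    # Eigendecomposition of the symmetric centered kernel matrix.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[top]
    # The Sigmoid kernel is not guaranteed positive semidefinite,
    # so check that the retained eigenvalues are usable.
    if np.any(lam <= 1e-12):
        raise ValueError("retained eigenvalues must be positive")
    alpha = eigvecs[:, top] / np.sqrt(lam)  # scale so feature-space axes have unit norm
    return {"X": X, "K": K, "alpha": alpha, "lam": lam,
            "gamma": gamma, "coef0": coef0}

def kpca_transform(model, X_new):
    """Project new samples onto the fitted kernel principal components."""
    K = model["K"]
    n = K.shape[0]
    K_new = sigmoid_kernel(X_new, model["X"], model["gamma"], model["coef0"])
    J = np.full((n, n), 1.0 / n)
    J_new = np.full((X_new.shape[0], n), 1.0 / n)
    # Center the new kernel rows with the training-set statistics.
    K_new_c = K_new - J_new @ K - K_new @ J + J_new @ K @ J
    return K_new_c @ model["alpha"]
```

The score vectors returned by `kpca_transform` could then be passed to any classifier or plotted for visual inspection; held-out samples are projected with the same fitted model, which is how prediction on unseen samples would be assessed in a sketch like this.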

Keywords:

kernel principal component analysis, orthogonal signal correction, metabolomics, Arabidopsis thaliana

Abstract:

A novel OSC-KPCA based pattern analysis method is proposed to improve clustering and predictive performance in metabolomics. Kernel principal component analysis (KPCA), with its strong nonlinear pattern recognition power and its advantages in handling small-sample, high-dimensional data, was used to analyze four genotypes of the important model plant Arabidopsis thaliana. To improve pattern recognition (PR) performance, the original data were first filtered with orthogonal signal correction (OSC) to remove noise interference. The results show that the OSC-KPCA based PR method successfully revealed the underlying relationship between genotypes and metabolites. The parental genotypes (Co10 and C24) and the two F1 progeny (C24×Co10 and Co10×C24) were discriminated with 100% accuracy and, more importantly, prediction accuracy was also 100%.
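The OSC filtering idea (remove variation in the data matrix X that is unrelated to the class information Y) can be sketched as follows. This is a simplified single-component variant written for illustration: it takes the first principal component score of X, removes the part that Y can explain, and regresses the result back onto X with a pseudo-inverse instead of the inner PLS step used in iterative OSC algorithms. The abstract does not say which OSC variant the authors used, so the algorithmic choices, the `rcond` tolerance, and the names `osc_fit` and `osc_apply` are all assumptions.

```python
import numpy as np

def _pinv(A):
    # Explicit tolerance so near-zero singular values left by centering are ignored.
    return np.linalg.pinv(A, rcond=1e-10)

def osc_fit(X, Y):
    """Fit one OSC-style component. X: samples x variables, Y: samples x targets
    (for example a one-hot genotype indicator matrix)."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - Y.mean(axis=0)
    # Start from the first principal component score of X.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    t = Xc @ Vt[0]
    # Remove the part of the score that Y can explain.
    t_orth = t - Yc @ (_pinv(Yc) @ t)
    # Weights that reproduce the orthogonalized score from X
    # (pseudo-inverse regression; a simplification of the PLS step).
    w = _pinv(Xc) @ t_orth
    t_fit = Xc @ w
    denom = t_fit @ t_fit
    if denom < 1e-12:
        raise ValueError("no Y-orthogonal variation found in the first component")
    p = Xc.T @ t_fit / denom  # loading of the component to be removed
    return {"x_mean": x_mean, "w": w, "p": p}

def osc_apply(model, X):
    """Remove the fitted Y-orthogonal component from (centered) X."""
    Xc = X - model["x_mean"]
    t = Xc @ model["w"]
    return Xc - np.outer(t, model["p"])
```

When there are fewer samples than variables, as is typical for metabolomics data, the fitted score is orthogonal to Y up to rounding error; with more samples than variables it is only approximately so. The same fitted model would be applied to held-out samples before projecting them with KPCA.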
