Outlier identification in industrial processes: a new method

  1. Manuel Castejón Limas
  2. Joaquín Bienvenido Ordieres Meré
  3. Ana González Marcos
  4. Alpha Verónica Pernía Espinoza
Book:
VIII Congreso Internacional de Ingeniería de Proyectos: Bilbao 6-8 de octubre de 2004. Actas

Publisher: Asociación Española de Ingeniería de Proyectos (AEIPRO)

ISBN: 84-95809-22-2

Year of publication: 2005

Congress: CIDIP. Congreso Internacional de Ingeniería de Proyectos (8. 2004. Bilbao)

Type: Conference paper

Abstract

The application of information technology in industry has opened the door to quality improvement through modern analysis methods. The vast amount of information available in the databases where process signals are recorded allows us to examine the intrinsic structure of the process without the resulting loss of quality caused by second-order effects or linearity assumptions. Nevertheless, available data mining tools are error-prone when applied to databases containing samples whose behavior does not follow the observed patterns. It is therefore worthwhile, as a preliminary phase of modeling, to separate the outlying samples that differ from the general behavior from those that are useful for building a representative model. In this paper, we show the results obtained with our new algorithm, which is capable of identifying outliers in industrial data sets. The algorithm has been implemented in a free programming environment (R), and its results have proven useful not only for improving the quality of the prediction models but also for determining the origins of the outliers in the data. In the following pages, we show the relationships among outlier identification techniques and cluster and discriminant analysis algorithms, how the former are indispensable for the intended purpose, and how the latter provide a means of interpreting the results obtained.