6. DATA ANALYSIS


FACTOR-ANALYTIC METHODS FOR PHYSICAL SCIENCES
Pentti Paatero, Philip K. Hopke* and Sirkka Juntto**

Matrix factorization methods for the physical sciences ("factor analysis") apply to many problems in which a number of "spectra" have been measured in similar situations, or from similar samples consisting of the same (perhaps unknown) constituents in different proportions. Examples include chromatographic "spectra", aerosol size distributions, compositions of environmental samples, and MEG (magnetoencephalographic) measurements.

A newly developed method, "PMF" or "Positive Matrix Factorization", is evaluated and applied in the present work. The essential features of PMF are:
  - utilization of error information of the measured data matrix
  - implementation of strict non-negativity constraints for the factor matrices
  - production of meaningful error estimates for the computed factors.
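
The first two features can be illustrated with a minimal sketch. It assumes a two-way model X ≈ GF, a matrix sigma of per-element standard deviations, and the weighted sum of squares Q = sum(((X - GF) / sigma)^2) as the fitting criterion. The solver is a standard weighted multiplicative-update scheme, swapped in here for brevity; it is not the PMF algorithm itself, it assumes the data are non-negative, and it produces no error estimates for the factors. All names (weighted_nmf, weighted_q, the example data) are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def weighted_q(x, sigma, g, f):
    """Q = sum over all elements of ((x - GF) / sigma) squared."""
    model = matmul(g, f)
    return sum(((x[i][j] - model[i][j]) / sigma[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def weighted_nmf(x, sigma, rank, iterations=500, seed=0, eps=1e-12):
    """Fit X ~ GF with G, F >= 0, weighting each element by 1 / sigma^2.

    Multiplicative updates keep every factor element non-negative
    as long as the data and the starting values are non-negative.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[1.0 / sigma[i][j] ** 2 for j in range(m)] for i in range(n)]
    wx = [[w[i][j] * x[i][j] for j in range(m)] for i in range(n)]
    g = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    f = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]

    def weighted_model():
        model = matmul(g, f)
        return [[w[i][j] * model[i][j] for j in range(m)] for i in range(n)]

    for _ in range(iterations):
        # Update G with F held fixed.
        ft = transpose(f)
        num, den = matmul(wx, ft), matmul(weighted_model(), ft)
        g = [[g[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        # Update F with the new G held fixed.
        gt = transpose(g)
        num, den = matmul(gt, wx), matmul(gt, weighted_model())
        f = [[f[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
    return g, f


# Made-up example: 4 samples, 3 measured variables, 2 constituents.
x = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0],
     [1.0, 3.0, 3.0],
     [2.0, 5.0, 3.0]]
# Assumed error model: a constant part plus 5 % of the measured value.
sigma = [[0.1 + 0.05 * v for v in row] for row in x]

g, f = weighted_nmf(x, sigma, rank=2)
print("Q after fitting:", weighted_q(x, sigma, g, f))
```

Elements with a large sigma contribute little to Q, which is how the error information of the data matrix enters the fit.
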
The method has been developed for both two-dimensional and three-dimensional data arrays. The 3-way model is often called PARAFAC. The present 3-way solution is more efficient than the customary solutions of the PARAFAC problem and also produces error estimates for the results.
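
For orientation, the trilinear (PARAFAC) model and a weighted least-squares criterion with non-negativity constraints can be written as follows. The symbols are assumptions chosen for this note and are not taken from the report: x is the measured 3-way array, sigma its element-wise error estimates, a, b and c the three factor matrices, and P the number of factors.

```latex
x_{ijk} = \sum_{p=1}^{P} a_{ip}\, b_{jp}\, c_{kp} + e_{ijk},
\qquad
Q = \sum_{i,j,k} \left( \frac{e_{ijk}}{\sigma_{ijk}} \right)^{2},
\qquad
a_{ip} \ge 0,\; b_{jp} \ge 0,\; c_{kp} \ge 0 .
```

The two-way case is the same expression with the third index and the factor c removed.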

In 1998, journal articles describing the application of the methods to various measurements of pollution in Arctic air were submitted for publication.

Generalization of the factor-analytic methods has led to a "multilinear" program, which was completed in 1998. This table-driven program lets individual scientists formulate and compute different mathematical or physical models of data analysis with the same program. Each model is described by a large "structure table"; the fitting algorithm does not need to be reprogrammed when a new model is to be solved.
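
The table-driven idea can be sketched as follows. The table layout is hypothetical and is not the format used by the program described above: each row names one data point and the factor elements whose product contributes to its fitted value. A single generic evaluator then serves any model that can be written as a sum of such products. Fitting is omitted; only model evaluation is shown.

```python
from math import prod


def evaluate(structure_table, factors, n_data):
    """Return fitted values: for each data point, a sum of factor products.

    structure_table: list of (data_index, tuple_of_factor_indices) rows.
    factors: flat list holding every unknown of the model.
    """
    fitted = [0.0] * n_data
    for data_index, factor_indices in structure_table:
        fitted[data_index] += prod(factors[i] for i in factor_indices)
    return fitted


def bilinear_table(n_rows, n_cols, rank):
    """Build the table for the two-way model X ~ GF.

    G (n_rows x rank) is stored first in the flat factor list, row by row,
    followed by F (rank x n_cols), also row by row.
    """
    f_offset = n_rows * rank
    table = []
    for i in range(n_rows):
        for j in range(n_cols):
            for p in range(rank):
                table.append((i * n_cols + j,
                              (i * rank + p, f_offset + p * n_cols + j)))
    return table


# G = [[1, 2], [3, 4]] and F = [[5, 6, 7], [8, 9, 10]], flattened.
factors = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
table = bilinear_table(n_rows=2, n_cols=3, rank=2)
fitted = evaluate(table, factors, n_data=6)
print(fitted)  # [21.0, 24.0, 27.0, 47.0, 54.0, 61.0], i.e. GF row by row
```

A row with three factor indices would describe a trilinear term, so in this sketch a 3-way model needs only a different table and no change to the evaluator.
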

* Clarkson Univ., NY, USA
** Finnish Meteorological Institute