Multiblock Data Fusion in Statistics and Machine Learning. Tormod Næs

ISBN 9781119600992




      1.4.4 Chemistry

      ELABORATION 1.6

      Terms in chemistry

      Multivariate curve resolution: Part of chemometrics that tries to mathematically resolve mixtures of chemicals into their individual compounds.

      Multivariate calibration: Part of chemometrics that deals with predicting properties (e.g., concentrations) from spectroscopic measurements. The idea is to replace a slow, expensive measurement technique (the reference method) by a fast, cheaper, and often non-destructive one (a spectroscopic measurement).

      Process chemometrics: Part of chemometrics devoted to processes, such as process analysis, multivariate process control, and process monitoring.

      Vibrational spectroscopy: Chemical measurement techniques that probe vibrational energies of molecules. There are different types of vibrational spectroscopy: infrared (IR), mid-infrared (MIR), near-infrared (NIR), ultraviolet (UV), visible (VIS) and Raman spectroscopy.

      Another area of chemistry that abounds in multiblock data analysis problems is process chemometrics (MacGregor et al., 1994; Wise and Gallagher, 1996; Kourti et al., 1995; Lopes et al., 2002). The general problem is how to combine multiple chemical process measurements for process understanding and statistical process monitoring.

      Example 1.3: Chemistry example: Raman spectroscopy data

      Figure 1.6 Plot of the Raman spectra used in predicting the fat content. The dashed lines show the split of the data set into multiple blocks.

      In this book, we concentrate on the Raman block, as it completely dominated a previous multiblock data analysis study (Liland et al., 2016), and instead split it into suitable wavelength regions, here at 1350 cm⁻¹ and 1100 cm⁻¹. This is done to explore the predictive power of the different wavelength regions. This data set will be analysed using several of the supervised methods in this book to see what each of them emphasises. In general, we see that the predictive models mostly leverage the variables corresponding to molecular vibrations associated with lipids and degrees of saturation, and that these models can reproduce the reference values with high precision.
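
      The splitting of a single spectral block into regions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic random numbers, the axis range (700 to 1800 cm⁻¹ in steps of 10) is made up and is not the range of the actual Raman data, and the function name split_into_blocks is hypothetical. Only the two cut points, 1100 and 1350 cm⁻¹, come from the text.

```python
import numpy as np

def split_into_blocks(spectra, wavenumbers, cut_points):
    """Split a (samples x variables) matrix into blocks along the spectral axis.

    cut_points are boundaries in cm^-1. A variable lying exactly on a boundary
    is assigned to the higher block (an arbitrary convention for this sketch).
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    # Sorted boundaries, padded with -inf/+inf to cover the outer regions.
    edges = np.concatenate(([-np.inf], np.sort(cut_points), [np.inf]))
    blocks = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (wavenumbers >= lo) & (wavenumbers < hi)
        blocks.append(spectra[:, mask])
    return blocks

# Synthetic stand-in data (assumption): 5 samples, 111 spectral variables.
rng = np.random.default_rng(0)
wavenumbers = np.arange(700, 1801, 10)
spectra = rng.normal(size=(5, wavenumbers.size))

blocks = split_into_blocks(spectra, wavenumbers, cut_points=[1100, 1350])
for block in blocks:
    print(block.shape)
```

      Each resulting block keeps all samples and only the variables of its region, so the blocks can be passed to a multiblock method as separate inputs sharing the sample mode.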

      1.4.5 Sensory Science

      ELABORATION 1.7

      Terms in sensory analysis

      Consumer liking: For hedonic sensory methods, a consumer panel is used. The consumers can be asked how much they like the different products and how willing they are to buy the products tested.

      Sensory panel: For assessing product quality, it is common to use a sensory panel consisting of a number of trained assessors who rate the intensity of a number of relevant sensory attributes on a predefined scale.

      Sensory attribute: The measurements, as performed by the sensory panel, such as sweetness, hardness, and acidity (depending on the type of product).

      Rapid sensory methods: There exist a number of so-called rapid sensory methods, for instance, projective mapping, sorting, and CATA (check-all-that-apply). For the latter, all participants are asked to tick, for each product, the relevant attributes on a predefined list. This gives a table of 0s and 1s for each participant.
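
      As a small illustration of the CATA table described above, the following Python sketch builds the table of 0s and 1s for one participant. The attribute list, product names, and ticked responses are invented for the example, and the function name cata_table is hypothetical.

```python
import numpy as np

# Hypothetical predefined attribute list and products (illustration only).
attributes = ["sweet", "sour", "crunchy", "bitter"]
products = ["A", "B", "C"]

# Made-up responses: the attributes one participant ticked for each product.
ticks = {
    "A": {"sweet", "crunchy"},
    "B": {"sour"},
    "C": {"sweet", "sour", "bitter"},
}

def cata_table(ticks, products, attributes):
    """Return a (products x attributes) array of 0s and 1s for one participant."""
    table = np.zeros((len(products), len(attributes)), dtype=int)
    for i, product in enumerate(products):
        for j, attribute in enumerate(attributes):
            # 1 if the participant ticked this attribute for this product.
            table[i, j] = int(attribute in ticks.get(product, set()))
    return table

table = cata_table(ticks, products, attributes)
print(table)
```

      Repeating this for every participant gives one such table per participant, all with the same products as rows and the same attributes as columns.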

      Example 1.4: Sensory example: consumer liking

      1.5 Goals of Analyses

      Many goals of multiblock data analysis can be envisaged. In current practice, these goals are usually implicit. Making them explicit also requires making the global optimisation criterion explicit or, when such a criterion is difficult to formulate, thinking carefully about the whole data analysis procedure and which method to choose. Several general goals will be discussed briefly.