Transferring results from NIR-hyperspectral to NIR-multispectral imaging: A filter-based simulation applied to the classification of Arabica and Robusta green coffee beans

Due to the large amount of data contained in each hyperspectral image and the resulting data-handling issues, a feature selection step is necessary before implementing a system suitable for on-line process monitoring. Indeed, by means of adequate algorithms, hyperspectral imaging allows the identification of key wavelengths that can then be used to develop cheaper and faster multispectral imaging systems for on-line applications or portable devices. However, adapting the outcomes of a variable selection procedure performed on hyperspectral data to a filter-based or LED-based imaging system is not straightforward, since in these multispectral systems only average values measured over discrete spectral intervals are available. Thus, useful information related to the spectral shape within the selected intervals is lost. On the other hand, the limited number of multispectral channels makes it easy to expand the number of potentially useful descriptors by calculating quantities derived from the outputs of the different channels.
In this context, the present work shows the feasibility of implementing a classification model for filter-based multispectral data, starting from the outcome of a sparse-based variable selection/classification model calculated on hyperspectral data.
In particular, 33 samples of Arabica and Robusta green coffee beans were used to build the classification models. For each sample, 12 hyperspectral images were acquired using a line-scanning system in the 955-1700 nm NIR range with a spectral resolution of 5 nm.
Starting from the results of the variable selection/classification model calculated on hyperspectral data, 5 different commercial filters covering the selected regions (1050 nm, 1200 nm, 1250 nm, 1400 nm, 1450 nm) were considered. For each filter, a reflectance value was obtained using the Gaussian-shaped profile of the filter, calculated from the filter properties (FWHM and percentage of transmission) provided by the manufacturer. The reflectance value was calculated by multiplying this profile by the reflectance values measured with the hyperspectral system within the corresponding spectral region and summing the resulting values. Furthermore, additional descriptors derived from the five reflectance values were also considered, in order to evaluate both linear and non-linear relationships between the spectral channels.
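The filter simulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 5 nm sampling grid, the FWHM (10 nm) and peak transmission (0.5) values, the random spectra, and the choice of pairwise differences and ratios as derived descriptors are all hypothetical placeholders; only the five centre wavelengths and the multiply-and-sum rule come from the text.

```python
import itertools
import numpy as np

def gaussian_profile(wavelengths, center, fwhm, peak_transmission):
    """Gaussian transmission curve defined by centre, FWHM and peak transmission."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return peak_transmission * np.exp(-0.5 * ((wavelengths - center) / sigma) ** 2)

def simulate_channels(spectra, wavelengths, filters):
    """Multiply each spectrum by each filter profile and sum over wavelength."""
    profiles = np.column_stack(
        [gaussian_profile(wavelengths, c, f, t) for c, f, t in filters]
    )
    return spectra @ profiles  # shape: (n_spectra, n_filters)

def derived_descriptors(channels):
    """Hypothetical derived quantities: pairwise differences and ratios."""
    pairs = list(itertools.combinations(range(channels.shape[1]), 2))
    diffs = np.column_stack([channels[:, i] - channels[:, j] for i, j in pairs])
    ratios = np.column_stack([channels[:, i] / channels[:, j] for i, j in pairs])
    return np.hstack([diffs, ratios])

# Assumed sampling grid: 955-1700 nm in 5 nm steps.
wavelengths = np.arange(955.0, 1701.0, 5.0)

# Centre wavelengths from the text; FWHM and transmission are placeholders.
centers = [1050.0, 1200.0, 1250.0, 1400.0, 1450.0]
filters = [(c, 10.0, 0.5) for c in centers]

# Random stand-in for measured reflectance spectra (not real coffee data).
rng = np.random.default_rng(0)
spectra = rng.uniform(0.2, 0.8, size=(4, wavelengths.size))

channels = simulate_channels(spectra, wavelengths, filters)
descriptors = derived_descriptors(channels)
print(channels.shape, descriptors.shape)  # (4, 5) (4, 20)
```

Because the Gaussian tails are negligible outside each filter band, summing over the whole grid is effectively the same as summing within the corresponding spectral region. Real filter datasheets would replace the placeholder FWHM and transmission values.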
The resulting datasets were classified using both PLS-DA and sparse PLS-DA, in order to select the most relevant variables. In particular, the classification models were built using a training set of average spectra, while the external validation was performed both at the image-level using a test set of average spectra, and at the pixel-level using a test image.
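As a rough illustration of the classification step, the sketch below fits a plain two-class PLS-DA model (PLS1 regression on a 0/1 class vector, computed with the NIPALS algorithm) and applies it both to a test set of spectra and to every pixel of a test image. It is an assumption-laden sketch rather than the authors' procedure: the data are synthetic, the number of components and the 0.5 decision threshold are arbitrary choices, the function names are hypothetical, and the sparse variant of PLS-DA is not implemented here.

```python
import numpy as np

def fit_pls_da(X, y, n_components):
    """Two-class PLS-DA: PLS1 (NIPALS) regression on a 0/1 class vector."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # weight vector
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        q_a = yr @ t / tt             # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - q_a * t             # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return coef, x_mean, y_mean

def predict_pls_da(X, model, threshold=0.5):
    """Assign class 1 when the predicted response exceeds the threshold."""
    coef, x_mean, y_mean = model
    return ((X - x_mean) @ coef + y_mean > threshold).astype(int)

# Synthetic five-channel data standing in for simulated filter outputs.
rng = np.random.default_rng(0)
shift = np.array([0.30, -0.20, 0.10, 0.00, 0.25])  # arbitrary class offset

def make_set(n_per_class):
    y = np.repeat([0.0, 1.0], n_per_class)
    X = 0.5 + y[:, None] * shift + 0.05 * rng.normal(size=(y.size, shift.size))
    return X, y

X_train, y_train = make_set(20)
X_test, y_test = make_set(10)

model = fit_pls_da(X_train, y_train, n_components=2)

# Image-level style validation: one row per sample.
test_accuracy = np.mean(predict_pls_da(X_test, model) == y_test)

# Pixel-level style validation: unfold the image, predict, fold back.
image = 0.5 + shift + 0.05 * rng.normal(size=(8, 8, shift.size))  # all class 1
class_map = predict_pls_da(image.reshape(-1, shift.size), model).reshape(8, 8)
print(test_accuracy, class_map.mean())
```

The same fitted model is reused at both levels; only the unfolding of the image into a pixels-by-channels matrix differs.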
Concerning the image-level classification, the results obtained from the simulation were similar to those previously obtained from the full spectra or from sparse models, while at the pixel level the predictive ability of the simulated classification model was slightly lower and more sensitive to the shape of the beans.
The results confirm that the proposed approach makes it possible to assess the actual potential of an on-line multispectral imaging system, starting from the outcome of spectral variable selection performed on hyperspectral data.